
Cloud Engineering Archives - Software Engineering Daily
Episodes about building and scaling large software projects
Latest episodes

Feb 13, 2019 • 56min
Replicated: On-Prem Deployments with Grant Miller
Cloud computing has been popular for less than twenty years. Large software companies have existed for much longer. If your company was started before the cloud became popular, you probably have a large data center on your company's premises. The shorthand term for this software environment is “on-prem”.
Deploying software to your own on-prem servers can be significantly different from deploying to remote servers in the cloud. In the cloud, servers and resources are more standardized, and it is often easier to find documentation and best practices for how to use cloud services.
Many of the software vendors who got started in the last decade created their software in the cloud. For example, Readme.io makes it easy for companies to create hosted documentation. Their early customers were startups and other cloud-native companies. All of those companies were happy to consume the software in the cloud. As time went on, Readme found that other customers wanted to use the Readme product as a self-hosted, on-prem service. Readme needed to figure out how to deploy their software easily to the “on-prem” environment.
It turns out that this is a common problem. Software vendors who want to sell to on-prem enterprises must have a defined strategy for making those deployments to on-prem infrastructure–and those deployments are not always easy to configure.
Replicated is a company that allows cloud-based software to easily deploy to on-prem infrastructure. Grant Miller is the founder of Replicated and he joins the show to discuss on-prem, cloud, and the changing adoption patterns of enterprise software companies.
Show Notes
Medium – Introducing Replicated, A better way to deploy SaaS on-premise
How ReadMe Went From SaaS To On-Premises In Less Than One Week
The post Replicated: On-Prem Deployments with Grant Miller appeared first on Software Engineering Daily.

Feb 8, 2019 • 53min
Knative: Serverless Workloads with Ville Aikas
Infrastructure software is having a renaissance.
Cloud providers offer a wide range of deployment tools, including virtual machines, managed Kubernetes clusters, standalone container instances, and serverless functions. Kubernetes has standardized the container orchestration layer and created a thriving community. The Kubernetes community gives the cloud providers a neutral ground to collaborate on projects that benefit everyone.
The two forces of cloud providers and Kubernetes have led to massive improvements in software quality and development practices over the last few years. But one downside of the current ecosystem is that many more developers learn how to operate a Kubernetes cluster than perhaps is necessary. “Serverless” tools are at a higher level than Kubernetes, and can improve developer productivity–but a risk of using a serverless tool is the potential for lock-in, and a lack of portability.
Knative is an open-source serverless platform from Google built on top of Kubernetes. Ville Aikas is a senior staff engineer at Google who has worked at the company for eleven years. With his experience, Ville brings a rare perspective to the subjects of Kubernetes, serverless, and the infrastructure lessons of Google. Ville joins the show to discuss Knative, the motivation for building it, and the future of “serverless” infrastructure.

Feb 7, 2019 • 44min
VMware Kubernetes Strategy with Brad Meiseles
Virtualization software allows companies to get better utilization from their physical servers. A single physical host can manage multiple virtual machines using a hypervisor. VMware brought virtualization software to market, creating popular tools for allowing enterprises to deploy virtual machines throughout their organization.
Containers provide a further improvement in server utilization. Multiple containers can run within a single virtual machine, allowing many services to share one VM. Containers proliferated after the popularization of Docker, and the Kubernetes open source container orchestration system grew to be the most common way of managing the large numbers of containers running throughout an organization.
As Kubernetes has risen to prominence, software infrastructure companies have developed Kubernetes services to allow enterprises to use Kubernetes more easily. VMware’s PKS is one example of a managed Kubernetes service.
Brad Meiseles is a senior director of engineering at VMware with more than nine years of experience with the company. He joins the show to discuss virtualization, Kubernetes, containers, and the strategy of a large infrastructure provider like VMware.

Feb 4, 2019 • 50min
Scaling HashiCorp with Armon Dadgar and Mitchell Hashimoto
HashiCorp was founded seven years ago with the goal of building infrastructure tools for automating cloud workflows such as provisioning, secret management, and service discovery. HashiCorp’s thesis was that operating cloud infrastructure was too hard: there was a need for new tools to serve application developers.
HashiCorp founders Mitchell Hashimoto and Armon Dadgar began releasing open source tools to fulfill their vision of better automation. Terraform, Vagrant, Consul, and other tools created by HashiCorp gained popularity, and HashiCorp began iterating on its business model. Today, HashiCorp makes money by offering enterprise features and support to companies such as Pinterest, Adobe, and Cruise Automation.
Over the last seven years, enterprise software infrastructure has changed rapidly. First, enterprises moved from script-based infrastructure automation to container orchestration frameworks. Then, the container orchestration world consolidated around Kubernetes.
Today, large enterprises are rapidly adopting Kubernetes with a mix of public cloud and on-prem vendors. At the same time, these enterprises are also becoming more willing to consume proprietary tools from the public cloud providers.
HashiCorp has benefited from all of this change. Their tools fit into a variety of workflows, and are not closely coupled with any particular cloud provider or platform solution.
Armon and Mitchell join today’s show to discuss the business model and the product philosophy of HashiCorp. We also touch on service mesh, zero trust networking, and their lessons from the container orchestration wars.

Jan 21, 2019 • 46min
Prometheus Scalability with Bryan Boreham
Prometheus is an open source monitoring system and time series database. Prometheus includes a multi-dimensional data model, a query language called PromQL, and a pull model for gathering metrics from your different services. Prometheus is widely used by large distributed systems deployments such as Kubernetes and Cloud Foundry.
Prometheus gathers metrics from your services by periodically scraping them. Those metrics are collected, compressed, and stored on disk for querying. But Prometheus is designed to store all of its records on one host in one set of files–which limits the scalability and availability of those metrics.
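The scrape-then-parse loop described above can be sketched in a few lines of Python. The HTTP fetch is replaced by a hard-coded sample payload so the example is self-contained, and the parsing is a simplification of Prometheus' actual text exposition format:

```python
# Minimal sketch of reading Prometheus' text exposition format.
# A real scraper would GET http://<target>/metrics; here a sample
# payload stands in for the HTTP response.

SAMPLE_SCRAPE = """\
# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
"""

def parse_metrics(payload: str) -> dict:
    """Map 'name{labels}' -> float value, skipping comment lines."""
    series = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE lines are metadata, not samples
        name_and_labels, value = line.rsplit(" ", 1)
        series[name_and_labels] = float(value)
    return series

metrics = parse_metrics(SAMPLE_SCRAPE)
print(metrics['http_requests_total{method="get",code="200"}'])  # 1027.0
```

Each scrape produces a snapshot like this; the time series database appends the samples with a timestamp, which is where the single-host storage limit described above comes in.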
Cortex is an open source project built to scale Prometheus. Cortex effectively shards Prometheus by parallelizing the “ingestion” and storage of Prometheus metrics. Cortex can take metrics from multiple Prometheus instances and store them across a distributed NoSQL database like DynamoDB, BigTable, or Cassandra.
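The sharding idea can be illustrated with a toy example: hash each series' label set to pick an owner. This is a sketch of the concept only; Cortex's actual ring uses consistent hashing with replication, which this omits:

```python
import hashlib

# Hypothetical pool of ingesters; Cortex discovers these dynamically.
INGESTERS = ["ingester-0", "ingester-1", "ingester-2"]

def ingester_for(series: str) -> str:
    """Pick an owner for a series by hashing its canonical label set."""
    digest = hashlib.sha256(series.encode()).hexdigest()
    return INGESTERS[int(digest, 16) % len(INGESTERS)]

# The same series always hashes to the same ingester, so appends for
# one time series stay together while different series spread out
# across the pool.
owner = ingester_for('http_requests_total{method="get"}')
assert owner == ingester_for('http_requests_total{method="get"}')
```

Because ownership is determined by the hash, queries can be routed the same way, and no single host has to hold every series on its local disk.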
Bryan Boreham is an engineer at Weaveworks, where he works on deployment, observability, and monitoring tools for containers and microservices. He wrote much of the code for Cortex, and we met up at KubeCon North America to talk about the motivation for creating Cortex, the broader landscape of Kubernetes monitoring, and other approaches to scaling Prometheus.

Jan 18, 2019 • 1h 6min
Spot Instances with Amiram Shachar
When a developer provisions a cloud server, that server is called an “instance”. These instances can be used for running whatever workload a developer has, whether it is a web application, a database, or a set of containers.
The cloud is cheap to get started on. New applications with few users can often be hosted on infrastructure that is less than $10 per month. But as an application grows in popularity, there is more demand for CPUs and storage. A company will start to buy more and more servers to scale up to the requirements of their growing user base. The costs of running infrastructure in the cloud will increase, and the company will start to look for ways to save money.
One common method of saving money is to buy “spot instances”. A spot instance is cheaper than a “reserved” or “on-demand” instance. These different pricing tiers exist because a giant cloud provider faces highly variable demand for its compute capacity.
If you are in charge of AWS, you have to make sure that at any given time you can give server resources to anyone who asks for them. Your data centers need physical machines that are ready to go at any time, which means that much of the time you have server resources sitting unused.
If you are a cloud provider, how can you get people to use your compute resources? You can make them cheaper. So a user can come along and buy your compute at the discounted “spot” price.
But this presents a problem for the cloud provider. If you give away compute at discounted prices and overall demand for your cloud resources then goes back up, you miss out on profits. As the cloud provider, you need to reclaim spot instances so that you can sell that same capacity at the higher market price.
And this presents a problem for the user. If you buy a cheap spot instance, that instance is only available until the cloud provider decides to kick you off. You have a tradeoff between cost and availability of your instances. Because of this, spot instances are typically used only for workloads that are not mission critical–workloads that can afford to fail.
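The tradeoff comes down to simple arithmetic. The prices in this sketch are invented for illustration and do not reflect any provider's actual rates:

```python
# Back-of-the-envelope cost comparison; all prices are hypothetical.
ON_DEMAND_HOURLY = 0.10   # $/hour for a regular on-demand instance
SPOT_HOURLY = 0.03        # $/hour at the discounted spot price
HOURS_PER_MONTH = 730

def monthly_cost(hourly_rate: float, instances: int) -> float:
    """Cost of running a fleet of identical instances for one month."""
    return hourly_rate * instances * HOURS_PER_MONTH

on_demand = monthly_cost(ON_DEMAND_HOURLY, 10)  # $730 per month
spot = monthly_cost(SPOT_HOURLY, 10)            # $219 per month
savings = 1 - spot / on_demand                  # ~70% cheaper, if the
                                                # workload tolerates
                                                # interruption
```

The savings are large, but they only materialize if losing an instance mid-run costs you less than the discount saved, which is exactly the gap tooling like Spotinst tries to close.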
Spotinst is a company that allows developers to deploy their workloads reliably onto spot instances. Spotinst works by detecting when a spot instance is about to be reclaimed by a cloud provider and rescheduling the workload onto a new spot instance.
Amiram Shachar is the CEO of Spotinst. He joins the show to talk about the different types of instances across cloud providers, the engineering behind Spotinst, and how the usage of containers and the rise of Kubernetes is changing the business landscape of the cloud.

Jan 14, 2019 • 52min
Kubernetes in China with Dan Kohn
Chinese Internet companies operate at a massive scale.
WeChat has over a billion users and is widely used as the primary means of payment by urban Chinese consumers. Alibaba ships 12 million packages per day, four times as many as Amazon. JD.com, a Chinese ecommerce company, has perhaps the largest production Kubernetes installation in the world.
China’s rapid adoption of Internet services, combined with a large population and a growing middle class has led to the creation of Internet giants on par with the social networks, ecommerce sites, and ridesharing startups of the United States.
Last November, I attended the first KubeCon China and saw firsthand how the Chinese Internet companies are using open source software to scale their infrastructure.
Despite the differences between the US and China, the culture of technologists at KubeCon felt familiar. In some ways, it was just like any other Kubernetes conference that I have attended: large numbers of engineers trying to find the cutting edge of technology, and learning how to solve the problems they are facing back at the office.
There were presentations on scaling databases, service meshes, and machine learning on Kubernetes. Outside the presentation halls, there were tables where you could pick up a translation device, so that presentations given only in Chinese or only in English could be followed by attendees who spoke the other language.
Dan Kohn joins the show to talk about Chinese Internet companies and how they are adopting Kubernetes. Dan is the executive director of the Cloud Native Computing Foundation, an organization within the Linux Foundation that organizes KubeCon. Before joining the CNCF, Dan worked as an entrepreneur, engineer, and executive at several technology companies.
Transcript
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.
Sponsors
Mesosphere’s Kubernetes-as-a-service provides single-click Kubernetes deployment with simple management, security features, and high availability to make your Kubernetes deployment easy. To find out how Mesosphere Kubernetes-as-a-Service can help you easily deploy Kubernetes, check out softwareengineeringdaily.com/mesosphere today.
Manifold makes your life easier by providing a single workflow to organize your services, connect your integrations, and share with your team. While Manifold is completely free to use, if you head over to manifold.co/sedaily you’ll get a coupon code for $10 which you can use to try out any service on the Manifold marketplace.
Datadog is a cloud-scale monitoring platform for infrastructure and applications. And with Datadog’s new Live Container view, you can see every container’s health, resource consumption, and running processes in real time. See for yourself by starting a free trial and get a free Datadog T-shirt! softwareengineeringdaily.com/datadog.
Get ready to build content-rich websites and professional web applications with Wix Code. Store and manage unlimited data with built-in databases, create dynamic pages, make custom forms and take full control of your site’s functionality with Wix Code APIs and JavaScript. Plus, now you can get 10-percent off your Premium plan. Go to Wix.com/SED.

Jan 11, 2019 • 60min
AWS Analysis with Corey Quinn
Amazon Web Services changed how software engineers work. Before AWS, it was common for startups to purchase their own physical servers. AWS made server resources as accessible as an API request, and has gone on to create higher-level abstractions for building applications.
For the first few years of AWS, the abstractions were familiar. S3 provided distributed, reliable object storage. Elastic MapReduce provided a managed Hadoop system. Kinesis provided a scalable queue. Amazon provided developers with managed alternatives to complicated open source software.
More recently, AWS has started to release products that are unlike anything else. A perfect example is AWS Lambda, the first function-as-a-service platform. Other newer AWS products include Ground Station, a service for processing satellite data, and AWS DeepRacer, a miniature race car for developers to build and test machine learning algorithms on.
As AWS has grown into new categories, the blog announcements of new services and features have started coming so frequently that it is hard to keep track of it all. Corey Quinn is the author of “Last Week in AWS”, a popular newsletter about what is changing across Amazon Web Services.
Corey joins the show to give his perspective on the growing, shifting behemoth that is Amazon Web Services–as well as the other major cloud providers that have risen to prominence. He’s also the host of the Screaming in the Cloud podcast, which you should check out if you like this episode.

Jan 10, 2019 • 1h 3min
Zeit: Serverless Cloud with Guillermo Rauch
Serverless computing is a technique for deploying applications without an addressable server.
A serverless application still runs on servers, but the developer does not have access to those servers in the traditional sense: there are no IP addresses to manage and no service instances to configure for scale.
Just as higher-level languages like C abstracted away the need to work with assembly code, serverless computing gives developers more leverage by letting them focus on business logic while the platform takes care of deployment, uptime, autoscaling, and the other aspects of cloud computing that are fundamental to every application.
“Serverless” can mean several different things: backend-as-a-service products like Firebase, functions-as-a-service platforms like AWS Lambda, and high-level APIs such as Twilio.
Zeit is a deployment platform built for serverless development. In Zeit, users model a GitHub repository in terms of the functions within their application. Zeit deploys the code from those functions onto functions-as-a-service and allows you to run your code across all the major cloud providers.
Guillermo Rauch is the founder of Zeit, and he joins the show to discuss his vision for the company and the platform as it looks today. Guillermo was previously on the show to discuss Socket.io, which he created.

Jan 9, 2019 • 48min
Cloud Events with Doug Davis
Functions-as-a-service allow developers to run their code in a “serverless” environment. A developer can provide a function to a cloud provider and the code for that function will be scheduled onto a container and executed whenever an event triggers that function.
An “event” can mean many different things. It is a signal that something has changed within your application. When you save a file to an Amazon S3 bucket, that creates an event. When a user signs up for your app, that can create an event.
Functions-as-a-service are allowing people to build applications completely out of managed cloud infrastructure. Apps can be fully “serverless”, with managed databases, queueing systems, and APIs tied together by event-triggered functions.
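The event-triggered flow described above can be sketched as a plain function plus a dispatcher. The event shape here imitates an S3-style object notification but is a hypothetical simplification, not any provider's real payload:

```python
# Sketch of the FaaS control flow: a stateless handler invoked once
# per event. All names and the event shape are illustrative.

def handle_upload(event: dict) -> str:
    """The 'function' in functions-as-a-service: event in, result out."""
    record = event["records"][0]
    return f"processing {record['bucket']}/{record['key']}"

def invoke(handler, event):
    # A real platform would schedule the handler onto a container and
    # execute it there; this sketch just calls it in-process.
    return handler(event)

result = invoke(handle_upload, {
    "records": [{"bucket": "user-uploads", "key": "avatar.png"}],
})
print(result)  # processing user-uploads/avatar.png
```

The handler holds no state between invocations, which is what lets the platform scale it to zero when no events arrive and fan it out when many do.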
Today, there is not a consistent format for events across different applications and cloud providers. This makes it more difficult to stitch together events across these different environments. Ideally, events would be lightweight, easy to deserialize, and easy to interoperate with.
The Cloud Events specification is a project within the Cloud Native Computing Foundation with the goal of creating a standard format for events. Doug Davis is the CTO for developer advocacy of containers at IBM. He joins the show to discuss how events and event-based programming works, and the need for a common format across cloud events.
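A standardized envelope might look like the following sketch. The field names mirror the CloudEvents core context attributes, though the spec's attribute set has evolved across versions, so treat this as illustrative rather than authoritative:

```python
import json

# An event envelope in the spirit of the CloudEvents specification: a
# small set of context attributes plus an opaque data payload. The
# values here are made up for illustration.
event = {
    "specversion": "1.0",
    "type": "com.example.object.created",
    "source": "/storage/buckets/user-uploads",
    "id": "a1b2c3",
    "datacontenttype": "application/json",
    "data": {"key": "avatar.png", "size": 48213},
}

wire = json.dumps(event)    # lightweight: plain JSON on the wire
decoded = json.loads(wire)  # trivial to deserialize in any language
```

Because the context attributes sit at a predictable place in the envelope, a router or function platform can dispatch on `type` and `source` without understanding the payload inside `data`.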