Cloud Engineering Archives - Software Engineering Daily

Latest episodes

Apr 24, 2019 • 44min

gVisor: Secure Container Sandbox with Yoshi Tamura

RECENT UPDATES: Podsheets is our open source set of tools for managing podcasts and podcast businesses. A new version of Software Daily, our app and ad-free subscription service, is available. FindCollabs is hiring a React developer. FindCollabs Hackathon #1 has ended! Congrats to ARhythm, Kitspace, and Rivaly for winning 1st, 2nd, and 3rd place ($4,000, $1,000, and a set of SE Daily hoodies, respectively). The most valuable feedback award and the most helpful community member award both go to Vynce Montgomery, who will receive both the SE Daily Towel and the SE Daily Old School Bucket Hat.

The Linux operating system includes user space and kernel space. In user space, the user can create and interact with a variety of applications directly. In kernel space, the Linux kernel provides a stable environment in which device drivers interact with hardware and manage low-level resources. A Linux container is a virtualized environment that runs within user space. To perform an operation, a process in a container in user space makes a syscall (system call) into kernel space, which gives the container access to resources like memory and disk.

Kernel space must be kept secure to ensure operating system integrity, but Linux includes hundreds of syscalls, and each syscall is an interface between user space and kernel space. This wide attack surface is a source of security vulnerabilities, even though most applications need only a small number of syscalls to perform their required functionality.

gVisor is a project to restrict the set of syscalls through which user space and the kernel communicate. gVisor is a runtime layer between the user space container and kernel space, and it reduces the number of syscalls that can be made into kernel space.

The security properties of gVisor make it an exciting project today, but it is the portability features of gVisor that hint at a huge future opportunity. By inserting an interpreter interface between containers and the Linux kernel, gVisor presents the container world with the opportunity to run on operating systems other than Linux. There are many reasons why that might be appealing: Linux was built many years ago, before the explosion of small devices, smartphones, IoT hubs, voice assistants, and smart cars. To be more speculative, Google is working on a secretive new operating system called Fuchsia, and gVisor could be a layer that allows workloads to be ported from Linux servers to Fuchsia servers.

Yoshi Tamura is a product manager at Google with a background in containers and virtualization. He joins the show to talk about gVisor and the different kinds of virtualization.
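To make the boundary between user space and kernel space concrete, here is a minimal Go sketch (Linux-only) that invokes the getpid syscall directly. This is the kind of kernel crossing that a sandbox like gVisor intercepts and serves from its own user space kernel instead of passing to the host.

```go
package main

import (
	"fmt"
	"syscall"
)

func main() {
	// A raw crossing from user space into kernel space. Under a normal
	// container runtime, this traps into the host Linux kernel; under
	// gVisor's runsc runtime, it is intercepted and handled by gVisor's
	// user space kernel instead.
	pid, _, errno := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
	if errno != 0 {
		fmt.Println("getpid failed:", errno)
		return
	}
	fmt.Println("pid from raw syscall:", pid)
}
```

Multiply this one call by the hundreds of syscalls Linux exposes, and you have the attack surface gVisor is designed to narrow.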
Apr 23, 2019 • 1h 3min

Observability Engineering with James Burns

Twilio is a communications infrastructure company with thousands of internal services and thousands of requests per second. Each request generates logs, metrics, and distributed traces, which can be used to troubleshoot failures and improve latency. Because Twilio is used for two-factor authentication and text message relaying, it is critical infrastructure for most applications that implement it. The service must remain highly available even during peak application traffic or outages at a particular cloud provider.

James Burns worked on platform infrastructure and observability at Twilio from 2014 to 2017, a period of rapid scaling for the company. His work encompassed site reliability, monitoring, cost management, and incident response. He also led chaos engineering exercises called "game days," in which the company deliberately caused infrastructure to fail in order to verify the reliability of failover systems and to discover problematic dependencies.

James joins the show to talk about his time at Twilio and his perspectives on how to instrument and observe complex applications. Full disclosure: James now works at LightStep, which is a sponsor of Software Engineering Daily.
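As a small illustration of the trace instrumentation discussed in the episode, here is a sketch using the opentracing-go API to wrap a unit of work in a span. The operation and tag names are hypothetical, and no concrete tracer is configured, so the default no-op tracer is used; in production, a LightStep, Jaeger, or similar client would be registered as the global tracer.

```go
package main

import (
	"log"
	"time"

	opentracing "github.com/opentracing/opentracing-go"
)

func handleRequest() {
	// Start a span for this unit of work. With a real tracer installed
	// via opentracing.SetGlobalTracer, the span's timing and tags would
	// be exported for troubleshooting failures and latency.
	span := opentracing.GlobalTracer().StartSpan("handle-request") // hypothetical name
	defer span.Finish()

	span.SetTag("service", "sms-relay") // hypothetical tag
	time.Sleep(10 * time.Millisecond)   // stand-in for real work
}

func main() {
	handleRequest()
	log.Println("request handled")
}
```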
Apr 22, 2019 • 49min

Serverless Runtimes with Steren Giannini

Google's options for running serverless workloads started with App Engine, a way to deploy an application in a fully managed environment. Since the early days of App Engine, managed infrastructure has matured and become more granular. We now have serverless databases, queueing systems, machine learning tools, and functions as a service. Developers can create fully managed, event-driven, highly scalable systems with less code and less operational overhead.

Different cloud providers are taking different approaches to offering serverless runtimes. Google's approach involves the open source Knative project and a hosted platform for running Knative workloads called Cloud Run.

Steren Giannini is a product manager at Google working on serverless tools. He joins the show to discuss Google's serverless projects and the implementation details behind them.
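Cloud Run's container contract is deliberately simple: the container listens for HTTP requests on the port named by the PORT environment variable. A minimal Go sketch of a service that satisfies that contract:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Cloud Run (and Knative serving) tell the container which port
	// to listen on via the PORT environment variable.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080" // sensible default for local runs
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello from a serverless container")
	})

	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```

Because the contract is just HTTP on $PORT, the same container image can run on Cloud Run, on a Knative cluster, or locally with Docker.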
Apr 8, 2019 • 49min

AWS Storage with Kevin Miller

RECENT UPDATES: The FindCollabs $5000 Hackathon ends Saturday April 15th, 2019. A new version of Software Daily, our app and ad-free subscription service, is available. Software Daily is looking for help with Android engineering, QA, machine learning, and more.

A software application requires compute and storage. Both compute and storage have been abstracted into cloud tools that developers can use to build highly available distributed systems. In our previous episode, we explored the compute side; in today's episode we discuss storage.

Application developers store data in a variety of abstractions. In-memory caches allow for fast lookups. Relational databases allow for efficient retrieval of well-structured tables. NoSQL databases allow for retrieval of documents that may have a less defined schema. File storage systems allow the access pattern of nested file systems, like on your laptop. Distributed object storage systems allow for highly durable storage of any data type.

Amazon S3 is a distributed object storage system with a wide spectrum of use cases: media file storage, archiving of log files, and data lake applications. S3's functionality has grown over the years to include different tiers of data retrieval latency and cost structure. S3 Glacier allows for long-term storage of data at a large cost reduction, in exchange for increased latency of data access.

Kevin Miller is the general manager of Amazon Glacier at Amazon Web Services. He joins the show to talk about the history of storage, the different options for storage in the cloud, and the design of S3 Glacier.
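As a sketch of how a developer chooses a storage tier at write time, here is an upload using the AWS SDK for Go that sets the object's storage class to GLACIER. The bucket name and key are hypothetical placeholders.

```go
package main

import (
	"bytes"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{
		Region: aws.String("us-east-1"),
	}))
	svc := s3.New(sess)

	// Write an archival object directly to the Glacier storage class:
	// far cheaper per gigabyte than S3 Standard, in exchange for
	// retrieval latency measured in minutes to hours.
	_, err := svc.PutObject(&s3.PutObjectInput{
		Bucket:       aws.String("my-archive-bucket"),      // hypothetical
		Key:          aws.String("logs/2019-04-08.log.gz"), // hypothetical
		Body:         bytes.NewReader([]byte("archived log data")),
		StorageClass: aws.String("GLACIER"),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```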
Apr 5, 2019 • 51min

AWS Compute with Deepak Singh

Upcoming event: FindCollabs Hackathon at App Academy on April 6, 2019.

On Amazon Web Services, there are many ways to run an application on a single node. The first compute option on AWS was the EC2 virtual server instance. But EC2 is a large abstraction compared to what many people need for their nodes: a container with a smaller set of resources to work with. Containers can be run within a managed cluster like ECS or EKS, run on their own with AWS Fargate, or run simply as Docker containers without a container orchestration tool. Beyond explicit container instances, users can run their application as a "serverless" function-as-a-service such as AWS Lambda. Functions-as-a-service abstract away the container and let the developer operate at a higher level, while also providing some cost savings. Developers use these different compute options for different reasons.

Deepak Singh is the director of compute services at Amazon Web Services, and he joins the show to discuss the use cases and tradeoffs of these options. Deepak also discusses how these tools are useful internally to AWS: ECS and Lambda are high-level APIs that are used to build even higher-level services such as AWS Batch, a service for performing batch processing over large data sets.
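To show how much the function-as-a-service model abstracts away, here is a minimal AWS Lambda handler in Go using the aws-lambda-go library; the event shape is a hypothetical example.

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
)

// Event is a hypothetical input type; Lambda unmarshals the JSON
// invocation payload into it.
type Event struct {
	Name string `json:"name"`
}

// There is no server, container, or process lifecycle to manage here:
// the handler runs per invocation, and Lambda handles scaling.
func handler(ctx context.Context, e Event) (string, error) {
	return fmt.Sprintf("Hello, %s!", e.Name), nil
}

func main() {
	lambda.Start(handler)
}
```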
Apr 1, 2019 • 1h 3min

Uber Infrastructure with Prashant Varanasi and Akshay Shah

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019, and the FindCollabs Hackathon at App Academy on April 6, 2019.

Uber's infrastructure supports millions of riders and billions of dollars in transactions. Uber has high throughput and high availability requirements, because users depend on the service for their day-to-day transportation. When Uber was going through hypergrowth in 2015, the number of services was growing rapidly, as was the load across those services. Using a cloud provider was a risky option, because the costs could potentially grow out of control, so Uber decided early on to invest in physical hardware in order to keep costs at a reasonable level.

In the last three years, Uber's infrastructure has stabilized. The platform engineering team has built systems for monitoring, deployment, and service proxying, and developing and maintaining microservices within Uber has become easier.

Prashant Varanasi and Akshay Shah are engineers who have been with Uber for more than three years. They work on Uber's platform engineering team, and their current focus is the service proxy layer: a sidecar that runs alongside Uber services, providing features such as load balancing, service discovery, and rate limiting. Prashant and Akshay join the show to talk about Uber's infrastructure, microservices, and the architecture of a service proxy. We also talk in detail about the benefits of using Go for critical systems infrastructure, and some techniques for profiling and debugging in Go.
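One of the Go profiling techniques covered in the episode's territory is built into the standard library: exposing net/http/pprof endpoints from a long-running service. A minimal sketch:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Serve the profiling endpoints on a loopback-only port, kept
	// separate from production traffic. CPU, heap, goroutine, and
	// blocking profiles become available under /debug/pprof/.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... the service's real work would run here ...
	select {} // block forever in this sketch
}
```

A CPU profile can then be captured with, for example, go tool pprof http://localhost:6060/debug/pprof/profile.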
Mar 29, 2019 • 44min

Workload Scheduling with Brian Grant

Google has been building large-scale scheduling systems for more than fifteen years. Google Borg was started around 2003, giving engineers at Google a unified platform for scheduling long-lived service workloads as well as short-lived batch workloads onto a pool of servers. Since the early days of Borg, the scheduling systems built by Google have matured through several iterations: Omega was an effort to improve the internal Borg system, and Kubernetes is an open source container orchestrator built with the learnings of Borg and Omega.

A scheduling system needs to accept a wide variety of workload types, such as batch jobs, stateful services, stateless services, and daemon services, and find compute resources within a cluster to schedule those workloads onto. Different workloads can have different priority levels: a high priority workload should be able to find compute resources quickly, while a low priority workload can wait longer, as illustrated in the sketch below.

Brian Grant is a principal engineer at Google. He joins the show to talk about his experience building workload schedulers and designing APIs for engineers to interface with those schedulers.
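To make the priority discussion concrete, here is a toy sketch, not Borg's or Kubernetes' actual algorithm, of a pending-workload queue that always places the highest-priority workload first, using Go's container/heap:

```go
package main

import (
	"container/heap"
	"fmt"
)

// Workload is a toy pending job with a scheduling priority.
type Workload struct {
	Name     string
	Priority int // higher value = scheduled sooner
}

// queue implements heap.Interface as a max-heap on Priority.
type queue []Workload

func (q queue) Len() int            { return len(q) }
func (q queue) Less(i, j int) bool  { return q[i].Priority > q[j].Priority }
func (q queue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *queue) Push(x interface{}) { *q = append(*q, x.(Workload)) }
func (q *queue) Pop() interface{} {
	old := *q
	w := old[len(old)-1]
	*q = old[:len(old)-1]
	return w
}

func main() {
	q := &queue{}
	heap.Push(q, Workload{"nightly-batch-job", 1})
	heap.Push(q, Workload{"payments-service", 100})
	heap.Push(q, Workload{"metrics-daemon", 50})

	// The high-priority service finds resources first; the batch job waits.
	for q.Len() > 0 {
		w := heap.Pop(q).(Workload)
		fmt.Printf("scheduling %s (priority %d)\n", w.Name, w.Priority)
	}
}
```

A real scheduler layers much more on top, such as feasibility filtering, scoring, preemption, and fairness across tenants, but priority-ordered placement is the kernel of the idea.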
Mar 28, 2019 • 47min

Peloton: Uber’s Cluster Scheduler with Min Cai and Mayank Bansal

Google's Borg system is a cluster manager that powers the applications running across Google's massive infrastructure, and it provided inspiration for open source tools like Apache Mesos and Kubernetes. Over the last decade, some of the largest new technology companies have built their own systems to fill the roles of cluster management and resource scheduling. Netflix, Twitter, and Facebook have all spoken about their internal projects to make distributed systems resource allocation more economical. These companies find themselves continually reinventing scheduling and orchestration, with inspiration from Google Borg and their own experiences running large numbers of containers and virtual machines.

Uber's engineering team has built a cluster scheduler called Peloton. Peloton is based on Apache Mesos and is architected to handle a wide range of workloads: data science jobs like Hadoop MapReduce, long-running services such as a ridesharing marketplace service, monitoring daemons such as Uber's M3 collector, and database services such as MySQL.

Min Cai and Mayank Bansal are engineers at Uber who work on Peloton. When they set out to create Peloton, they looked at the existing schedulers in the ecosystem, including Kubernetes, Mesos, Hadoop's YARN system, and Borg itself. They have been working in the world of distributed systems schedulers for many years, including experiences building core Hadoop infrastructure and virtual machine schedulers at VMware. Min and Mayank join the show to give a brief history of distributed systems schedulers and to discuss their work on Peloton.
Mar 8, 2019 • 55min

Netlify with Mathias Biilmann Christensen

Cloud computing started to become popular in 2006 with the release of Amazon EC2, a system for deploying applications to virtual machines sitting on remote data center infrastructure. With cloud computing, application developers no longer needed to purchase expensive server hardware. Creating an application for the Internet became easier, cheaper, and simpler.

As the cloud has become popular, new ways of deploying applications have emerged, and a developer with a web app has many different options. You can host your app on an Amazon EC2 server, which requires you to manage cloud infrastructure in case your server crashes. You can deploy your app to Heroku, which gives your deployment better uptime guarantees for a higher price than Amazon EC2. Or you can use Linode, Microsoft Azure, or Google Cloud. The market for cloud computing is so large that cloud providers serve more niches every year.

In past episodes we have explored a variety of cloud providers and the markets they target. Pivotal Cloud Foundry is for managing complex distributed systems applications, typically with large teams. Firebase simplifies the developer experience for applications with small teams. Spotinst emphasizes low cost. Zeit is built to manage applications through serverless "functions-as-a-service" like AWS Lambda.

In today's episode, Mathias Biilmann Christensen, CEO of Netlify, joins the show. Netlify is a cloud provider built for modern web projects. Netlify represents the convergence of several trends in software development: static site deployment, serverless functions, a desire for "no-ops" deployment with minimal management, and the rise of newer tools like GraphQL and Gatsby. Mathias explores these trends in detail, along with the technical challenges of building Netlify. He was a great guest, capable of talking about difficult backend problems that require writing C++, as well as the frontend world of JavaScript frameworks.

One announcement before we begin: we are having a $5000 hackathon for a new product we've been working on, FindCollabs. FindCollabs is a platform for finding collaborators and building projects. Whether you are an engineer, a musician, a designer, a videographer, or an artist, FindCollabs lets you find people and collaborate. To try out FindCollabs, just go to FindCollabs.com and make a project or join someone else's project. It's easy to make these projects; you don't need to have anything built yet, just a vision for what you want to build. To find out about the hackathon, go to findcollabs.com/hackathon. We are giving away $5000 in cash to the coolest projects that get built before Sunday, April 14th, so I recommend getting started early, finding some people to collaborate with, and building some cool stuff!
Feb 14, 2019 • 48min

Kubernetes Security with Liz Rice

A Kubernetes cluster presents multiple potential attack surfaces: the cluster itself, a node running in the cluster, a pod running on the node, and a container running in a pod. If you are managing your own Kubernetes cluster, you need to be aware of the security settings on your etcd, your API server, and your container build pipeline. Many of the security risks of a Kubernetes cluster can be avoided by using the default settings of Kubernetes, or by using a managed Kubernetes service from a cloud provider or an infrastructure company. But it is useful to know the fundamentals of operating a secure cluster, so that you can avoid falling victim to the most common vulnerabilities.

Liz Rice wrote the book Kubernetes Security with co-author Michael Hausenblas. Liz works at Aqua Security, a company that develops security tools for containerized applications. In today's show, Liz gives an overview of the security risks of a Kubernetes cluster and provides some best practices, including secret management, penetration testing, and container lifecycle management.

Show Notes:
Kubernetes Security by Michael Hausenblas and Liz Rice (O'Reilly Media)
Open Source Security Podcast: Talking about Kubernetes and container security with Liz Rice
Keynote: Running with Scissors, by Liz Rice, Technology Evangelist, Aqua Security
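On the secret management practice mentioned above: a common pattern is to mount a Kubernetes Secret as a volume and read it as a file at startup, rather than baking credentials into the container image. Here is a minimal Go sketch; the mount path and file name are hypothetical and would be set by the pod spec's volumeMounts.

```go
package main

import (
	"io/ioutil"
	"log"
	"strings"
)

func main() {
	// A Secret mounted as a volume appears to the container as a
	// plain file; this path is hypothetical and comes from the
	// pod spec, not from this program.
	raw, err := ioutil.ReadFile("/etc/secrets/api-key")
	if err != nil {
		log.Fatalf("reading mounted secret: %v", err)
	}
	apiKey := strings.TrimSpace(string(raw))

	// Use the credential without ever logging its value.
	log.Printf("loaded API key (%d bytes)", len(apiKey))
}
```

Mounted secrets can be rotated without rebuilding the image, and they stay out of environment listings, which have a way of ending up in logs.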
