The New Stack Podcast

The New Stack
Jan 11, 2023 • 23min

What’s Platform Engineering? And How Does It Support DevOps?

Platform engineering “is the art of designing and binding all of the different tech and tools that you have inside of an organization into a golden path that enables self service for developers and reduces cognitive load,” said Kaspar von Grünberg, founder and CEO of Humanitec, in this episode of The New Stack Makers podcast.

This structure is important for individual contributors, Grünberg said, as well as backend engineers: “If you look at the operation teams, it reduces their burden to do repetitive things. And so platform engineers build and design internal developer platforms, and help and serve users.”

This conversation, hosted by Heather Joslyn, TNS features editor, dove into platform engineering: what it is, how it works, the problems it is intended to solve, and how to get started in building a platform engineering operation in your organization. It also debunked some key fallacies around the concept. This episode was sponsored by Humanitec.

The Limits of ‘You Build It, You Run It’

The notion of “you build it, you run it,” first coined by Werner Vogels, chief technology officer of Amazon, in a 2006 interview, established that developers should “own” their applications throughout their entire lifecycle. But, Grünberg said, that may not be realistic in an age of rapidly proliferating microservices and multiple, distributed deployment environments.

“The scale that we're operating today is just totally different,” he said. “The applications are much more complex.” End-to-end ownership, he added, is “a noble dream, but unfair towards the individual contributor. We're asking developers to do so much at once. And then we're always complaining that the output isn't there or not delivering fast enough. But we're not making it easy for them to deliver.”

Creating a “golden path,” through platform teams building internal developer platforms (IDPs), can not only free developers from unnecessary cognitive load, Grünberg said, but also help make their code more secure and standardized. For Ops engineers, he said, the adoption of platform engineering can also help free them from doing the same tasks over and over.

“If you want to know whether it's a good idea to look at platform engineering, I recommend go to your service desk and look at the tickets that you're receiving,” Grünberg said. “And if you have things like, ‘Hey, can you debug that deployment?’ and ‘Can you spin up in a moment all these repetitive requests?’ that's probably a good time to take a step back and ask yourself, ‘Should the operations people actually spend time doing these manual things?’”

The Biggest Fallacies about Platform Engineering

For organizations that are interested in adopting platform engineering, the Humanitec CEO attacked some of the biggest misconceptions about the practice. Chief among them: failing to treat the platform as a product, built the way a company would begin creating any product, by starting with research into customer needs.

“If you think about how we would develop a software feature, we wouldn't be sitting in a room and taking some assumptions and then building something,” he said. “We would go out to the user, and then actually interview them and say, ‘Hey, what's your problem? What's the most pressing problem?’”

Other fallacies embraced by platform engineering newbies, he said, are “visualization” (the belief that all devs need is another snazzy new dashboard or portal to look at) and believing the platform team has to go all-in right from the start, scaling up a big effort immediately. Such an effort, he said, is “doomed to fail.”

Instead, Grünberg said, “I'm always advocating for starting really small, come up with what's the most lowest common tech denominator. Is that containerization with EKS? Perfect, then focus on that.”

And don’t forget to give special attention to those early adopters, so they can become evangelists for the product. “Make them fans, prioritize the right way, and then show that to other teams as a, ‘Hey, you want to join in? OK, what's the next cool thing we could build?’”

Check out the entire episode for much more detail about platform engineering and how to get started with it.
Jan 4, 2023 • 29min

What LaunchDarkly Learned from 'Eating Its Own Dog Food'

Feature flags, the on/off toggles written in conditional statements that give organizations greater control over the user experience once code has been deployed, are proliferating and growing more complex, and they demand robust feature management, said Karishma Irani, head of product at LaunchDarkly, in this episode of The New Stack Makers.

In a November survey by LaunchDarkly, which queried more than 1,000 DevOps professionals, 69% of participants said that feature flags are “must-have, mission-critical and/or high priority” for their organizations.

“Feature management, we believe, is a modern practice that's becoming more and more common with companies that want to deploy more frequently, innovate faster, and just keep a healthy engineering team,” Irani said. The idea of feature management, Irani said, is to “maximize value while minimizing risk.” LaunchDarkly uses its own software, she said, and eating its own dog food, as the saying goes, has paid off in gaining insights into user needs.

As part of LaunchDarkly’s virtual conference Trajectory in November, Irani joined Heather Joslyn, features editor of The New Stack, for a wide-ranging conversation about the latest developments in feature management. This episode of Makers was sponsored by LaunchDarkly.

Automating Approvals

As an example of the benefits of having first-hand knowledge of how their company's products are used, Irani pointed to an internal project in mid-2022. When the company migrated from MongoDB to CockroachDB, it used new capabilities in its Feature Workflows product, which allow users to define a workflow that can schedule the gradual release of a feature flag for a future date and time, and automate approval requests.

“All of these async processes around approvals schedules, they're critical to releasing software, but they do slow you down and add more potential for manual error or human error,” Irani said. “And so our goal with Feature Workflows was to essentially automate the entire process of a feature release.”

Overhauling Experimentation

This past June, the company also revised its Experimentation offering, she said. Led by James Frost, LaunchDarkly’s head of experimentation, the team did “a complete overhaul of our stats engine, they enhanced the integration path of our customers’ existing data sets and metrics,” Irani said. “They redesigned our UX and codified models and experimentation best practices into the product itself.”

For instance, a new metric import API helps prevent the problem of multiple teams or users within a company using different tools for A/B and other experiments. It “significantly cuts down on manual duplicate work when importing metrics for experimentation,” said Irani. “So you can get set up faster.”

Another addition to the Experimentation product is a sample ratio mismatch test, she said, so “you can be confident that all of your experiments are correctly allocating traffic to each variant.”

These innovations, along with new capabilities in the company’s Core Flagging Platform, are in general availability. On the horizon, and now available through LaunchDarkly’s early access program, is Accelerate, which lets users track and visualize key engineering metrics, such as deployment frequency, release frequency, lead time for code changes, and flag coverage.

“I'm sure you've caught on already,” Irani said, “but a few of these are DORA metrics, which obviously are extremely critical to our users.”

Check out the entire episode for more details on what’s new from LaunchDarkly and the problems that innovators in the feature management space still need to solve.
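For readers unfamiliar with the sample ratio mismatch test mentioned above, the idea can be sketched in a few lines of Python. This is a generic chi-square goodness-of-fit check for a two-variant experiment, not LaunchDarkly's stats engine; the function names, the 50/50 default split, and the 0.001 threshold are all assumptions made for illustration.

```python
import math


def srm_p_value(control_count: int, treatment_count: int,
                expected_control_share: float = 0.5) -> float:
    """Return the chi-square p-value (1 degree of freedom) for a two-variant split."""
    total = control_count + treatment_count
    if total == 0:
        raise ValueError("no traffic recorded")
    if not 0 < expected_control_share < 1:
        raise ValueError("expected_control_share must be between 0 and 1")
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi_square = (
        (control_count - expected_control) ** 2 / expected_control
        + (treatment_count - expected_treatment) ** 2 / expected_treatment
    )
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so no statistics library is needed.
    return math.erfc(math.sqrt(chi_square / 2))


def has_sample_ratio_mismatch(control_count: int, treatment_count: int,
                              expected_control_share: float = 0.5,
                              alpha: float = 0.001) -> bool:
    """Flag the experiment when the observed split is very unlikely by chance."""
    p_value = srm_p_value(control_count, treatment_count, expected_control_share)
    return p_value < alpha
```

A perfectly even split such as 5,000 versus 5,000 users is not flagged, while 5,000 versus 4,500 under an intended 50/50 allocation is, which would suggest the traffic assignment itself deserves a look before anyone trusts the experiment's results.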
Dec 28, 2022 • 15min

Hazelcast and the Benefits of Real Time Data

In this latest podcast from The New Stack, we interview Manish Devgan, chief product officer for Hazelcast, which offers a real time stream processing engine. This interview was recorded at KubeCon+CloudNativeCon, held last October in Detroit.

"'Real time' means different things to different people, but it's really a business term," Devgan explained. In the business world, time is money, and the more quickly you can make a decision using the right data, the more quickly you can take action.

Although we have many "batch-processing" systems, the data itself rarely comes in batches, Devgan said. "A lot of times I hear from customers that are using a batch system, because those are the things which are available at that time. But data is created in real time: sensors, your machines, espionage data, or even customer data, right when customers are transacting with you."

What is a Real Time Data Processing Engine?

A real time data processing engine can analyze data as it is coming in from the source. This is different from traditional approaches that store the data first, then analyze it later.

Bank loans are one example of where the real time approach helps. With a real time data processing engine in place, a bank can offer a loan to a customer using an automated teller machine (ATM) in real time, Devgan suggested. "As the data comes in, you can actually take action based on context of the data," he argued.

Such a loan app may combine real-time data from the customer alongside historical data stored in a traditional database. Hazelcast can combine historical data with real time data to make workloads like this possible.

In this interview, we also debated the merits of Kafka, the benefits of using a managed service rather than running an application in house, Hazelcast's users, and features in the latest release of the Hazelcast platform.
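The ATM loan scenario above can be illustrated with a small Python sketch that acts on each event as it arrives and enriches it with stored history. It does not use Hazelcast's API; the `Withdrawal` type, the `AVERAGE_BALANCE` lookup, and both thresholds are hypothetical stand-ins chosen only to show the pattern.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Withdrawal:
    """One real time event from an ATM (hypothetical shape)."""
    customer_id: str
    amount: float


# Hypothetical historical data, standing in for a traditional database.
AVERAGE_BALANCE = {"alice": 5200.0, "bob": 150.0}


def loan_offers(events: Iterable[Withdrawal],
                min_balance: float = 1000.0,
                large_withdrawal: float = 400.0) -> Iterator[str]:
    """Yield an offer as soon as a qualifying event arrives.

    Each event is handled when it is seen, rather than being stored
    first and analyzed later in a batch.
    """
    for event in events:
        history = AVERAGE_BALANCE.get(event.customer_id, 0.0)
        if event.amount >= large_withdrawal and history >= min_balance:
            yield f"Offer loan to {event.customer_id}"
```

Because `loan_offers` is a generator, it can consume an unbounded stream and emit a decision per event, which is the contrast with a batch job that waits for the whole data set.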
Dec 22, 2022 • 27min

Hachyderm.io, from Side Project to 38,000+ Users and Counting

Back in April, Kris Nóva, now principal engineer at GitHub, started creating a server on Mastodon as a side project in her basement lab. Then in late October, Elon Musk bought Twitter for an eye-watering $44 billion, and began cutting thousands of jobs at the social media giant and making changes that alienated longtime users. And over the next few weeks, usage of Nóva’s hobby site, Hachyderm.io, exploded.

“The server started very small,” she said on this episode of The New Stack Makers podcast. “And I think like, one of my friends turned into two of my friends turned into 10 of my friends turned into 20 colleagues, and it just so happens, a lot of them were big names in the tech industry. And now all of a sudden, I have 30,000 people I have to babysit.”

Though the rate at which new users are joining Hachyderm has slowed down in recent days, Nóva said, it stood at more than 38,000 users as of Dec. 20. Hachyderm.io is still run by a handful of volunteers, who also handle content moderation. Nóva is now seeking nonprofit status for it with the U.S. Internal Revenue Service, with intentions of building a new organization around Hachyderm.

This episode of Makers, hosted by Heather Joslyn, TNS features editor, recounts Hachyderm’s origins and the challenges involved in scaling it as Twitter users from the tech community gravitated to it. Nóva and Joslyn were joined by Gabe Monroy, chief product officer at DigitalOcean, which has helped Hachyderm cope with the technical demands of its growth spurt.

HugOps and Solving Storage Issues

Suddenly having a social media network to “babysit” brings numerous challenges, including the technical issues involved in a rapid scale up. Monroy and Nóva worked on Kubernetes projects when both were employed at Microsoft, “so we’re all about that horizontal distribution life.” But the Mastodon application’s structure proved confounding.

“Here I am operating a Ruby on Rails monolith that's designed to be vertically scaled on a single piece of hardware,” Nóva said. “And we're trying to break that apart and run that horizontally across the rack behind me. So we got into a lot of trouble very early on by just taking the service itself and starting to decompose it into microservices.”

Storage also rapidly became an issue. “We had some non-enterprise but consumer-grade SSDs. And we were doing on the order of millions of reads and writes per day, just keeping the Postgres database online. And that was causing cascading failures and cascading outages across our distributed footprint, just because our Postgres service couldn't keep up.”

DigitalOcean helped with the storage issues; the site now uses a data center in Germany, whose servers DigitalOcean manages. (Previously, its servers had been living in Nóva’s basement lab.)

Monroy, longtime friends with Nóva, was an early Hachyderm user and reached out when he noticed problems on the site, such as when he had difficulty posting videos and noticed other people complaining about similar problems.

“This is a ‘success failure’ in the making here, the scale of this is sort of overwhelming,” Monroy said. “So I just texted Nóva, ‘Hey, what's going on? Anything I could do to help?’

“In the community, we like to talk about the concept of HugOps, right? When people are having issues on this stuff, you reach out, try and help. You give a hug. And so, that was all I did. Nóva is very crisp and clear: This is what I got going on. These are the issues. These are the areas where you could help.”

Sustaining ‘the NPR of Social Media’

One challenge in particular has nudged Nóva to seek nonprofit status: operating costs.

“Right now, I'm able to just kind of like eat the cost myself,” she said. “I operate a Twitch stream, and we're taking the proceeds of that and putting it towards operating service.” But that, she acknowledges, won’t be sustainable as Hachyderm grows.

“The whole goal of it, as far as I'm concerned, is to keep it as sustainable as possible,” Nóva said. “So that we're not having to offset the operating costs with ads or marketing or product marketing. We can just try to keep it as neutral and, frankly, boring as possible, the NPR of social media, if you could imagine such a thing.”

Check out the full episode for more details on how Hachyderm is scaling and plans for its future, and Nóva and Monroy’s thoughts about the status of Twitter.

Feedback? Find me at @hajoslyn on Hachyderm.io.
Dec 20, 2022 • 23min

Automation for Cloud Optimization

During the pandemic, many organizations sped up their move to the cloud without fully understanding the costs, both human and financial, they would pay for the convenience and scalability of a digital transformation.

“They really didn’t have a baseline,” said Mekka Williams, principal engineer at Spot by NetApp, in this episode of The New Stack Makers podcast. “And so those first cloud bills, I'm sure, were shocking, because you don't get a cloud bill when you run on your on-premises environment, or even your private cloud, where you've already paid the cost for the infrastructure that you're using.”

What’s especially worrisome is that many of those costs are simply wasted, Williams said. “Most of the containerized applications running in Kubernetes clusters are running underutilized,” she said. “And anything that's underutilized in the cloud equates to waste. And if we want to be really lean and clean and use resources in a very efficient manner, we have to have really good cloud strategy in order to do that.”

This episode of The New Stack Makers, hosted by Heather Joslyn, TNS features editor, focused on CloudOps, which in this case stands for “cloud operations.” (It can also stand for “cloud optimization,” but more about that later.) The conversation was sponsored by Spot by NetApp.

Automation for Cloud Optimization

Many organizations that moved quickly to the cloud during the dog days of the pandemic have begun to revisit the decisions they made and update their strategies, Williams said.

“We see some organizations that are trying to modernize their applications further, to make better use of the services that are available in the cloud,” she said. “The cloud is getting more complex as they grow and mature in their journey.

“And so they're looking for ways to simplify their operations. And as always keep their costs down. Keep things simple for their DevOps and SRE, to not incur additional technical debt, but still make the best use out of their cloud, wherever they are.”

Automation holds the key to CloudOps, in both definitions, according to Williams. For starters, it makes teams more efficient.

“The less tasks that your workforce have to perform manually, the more time they have to spend focused on business logic and being innovative,” Williams said. “Automation also helps you with repeatability. And it's less error-prone, and it helps you standardize. Really good automation simplifies your environment greatly.”

Automating repetitive tasks can also help prevent your site reliability engineers (SREs) from burnout, she said.

Practicing “good data hygiene,” Williams said, also helps contain costs and reduce toil: “Making sure you're using the right tier of data, making sure you're not over-provisioned. And the type of storage you need, you don't need to pay top dollar for high-performing storage, if it's just backup data that doesn't get accessed that often.”

Such practices are “good to know on-premises, but these are imperative to know when you're in the cloud,” she said, in order to reduce waste.

During this episode, Williams pointed to solutions in the Spot by NetApp portfolio that use automation to help make the most of cloud infrastructure, such as its flagship product, Elastigroup, which takes advantage of excess capacity to scale workloads.

In June, Spot by NetApp acquired Instaclustr, a solution for managing open source database and streaming technologies. The company recognizes the growing importance of open source for enterprises.

“We're paying attention to trends for cloud applications,” Williams said, “and we're growing the portfolio to address the needs that are top of mind for those customers.”

Check out the entire episode to learn more about CloudOps.
Dec 14, 2022 • 41min

Redis Looks Beyond Cache Toward Everything Data

Redis, best known as a data cache or real-time data platform, is evolving into much more, Tim Hall, chief of product at the company, told The New Stack in a recent TNS Makers podcast.

Redis is an in-memory database or memory-first database, which means the data lands there and people are using it for both caching and persistence. These days, the company offers a number of flexible data models, and one of the brand promises of Redis is that developers can store the data as they're working with it. So as opposed to a SQL database where you might have to turn your data structures into columns and tables, you can actually store the data structures that you're working with directly into Redis, Hall said.

Primary Database?

“About 40% of our customers today are using us as a primary database technology,” he said. “That may surprise some people if you're sort of a classic Redis user and you knew us from in-memory caching, you probably didn't realize we added a variety of mechanisms for persistence over the years.”

To persist the data, Redis stores it on disk behind the scenes while keeping a copy in memory. So if there's any sort of failure, Redis can recover the data off of disk and replay it into memory and get you back up and running. That's a mechanism that has been around about half a decade now.

Yet, Redis is playing what Hall called the “long game,” particularly in terms of continuing to reach out to developers and showing them what the latest capabilities are.

“If you look at the top 10 databases on the planet, they've all moved into the multimodal category. And Redis is no different from that perspective,” Hall said. “So if you look at Oracle, it was traditionally a relational database, Mongo is traditionally JSON documents store only, and obviously Redis is a key-value store. We've all moved down the field now. Now, why would we do that? We're all looking to simplify the developer’s world, right?”

Yet, each vendor is really trying to leverage its core differentiation and expand out from there. And the good news for Redis is that speed is its core differentiation.

“Why would you want a slow data platform? You don't,” Hall said. “So the more that we can offer those extended capabilities for working with things like JSON, or we just launched a data structure called t-digest that people can use, along with that we've had support for Bloom filter, which is a probabilistic data structure. Like, all of these things, we kind of expand our footprint. We're saying if you need speed, and reducing latency, and having high interactivity is your goal, Redis should be your starting point. If you want some esoteric edge case functionality where you need to manipulate JSON in some very strange way, you probably should go with Mongo. I probably won't support that for a long time. But if you're just working with the basic data structures, you need to be able to query, you need to be able to update your JSON document. Those straightforward use cases we support very, very well, and we support them at speed and scale.”

Customer View

As a Redis customer, Alain Russell, CEO at Blackpepper, a digital e-commerce agency in Auckland, New Zealand, said his firm has undergone the same transition.

“We started off as a Redis as a cache, that helped us speed up traditional data that was slower than we wanted it,” he said. “And then we went down a cloud path a couple of years ago. Part of that migration included us becoming, you know, what's deemed as ‘cloud native.’ And we started using all of these different data stores and data structures and dealing with all of them is actually complicated. You know, and from a developer perspective, it can be a bit painful.”

So, Blackpepper started looking for how to make things simpler while keeping its platform very fast, and it looked at the Redis Stack. “And honestly, it filled all of our needs in one platform. And we're kind of in this path at the moment, we were using the basics of it. And we're very early on in our journey, right? We're still learning how things work and how to use it properly. But we also have a big list of things that we're using other data stores for traditional data, and working out, okay, this will be something that we will migrate to, you know, because we use persistence heavily now, in Redis.”

Twenty-year-old Blackpepper works with predominantly traditional retailers and helps them in their omni-channel journey.

Commercial vs. Open Source

Hall said there are three modes of access to the Redis technology: the Redis open source project; the Redis Stack, which the company recommends that developers start with today; and Redis Enterprise Edition, which is available as software or in the cloud.

“It's the most popular NoSQL database on the planet six years running,” Hall said. “And people love it because of its simplicity.”

Meanwhile, it takes effort to maintain both the commercial product and the open source effort. Hall, who has worked at Hortonworks and InfluxData, said, “Not every open source company is the same in terms of how you make decisions about what lands in your commercial offering and what lands in open source and where the contributions come from and who's involved.”

For instance, “if there was something that somebody wanted to contribute that was going to go against our commercial interest, we probably would not merge that,” Hall said.

Redis was run by project founder Salvatore Sanfilippo for many, many years, and he was the sole arbiter of what landed and what did not land in Redis itself. Then, over the last couple of years, Redis created a core steering committee. It's made up of one individual from AWS, one individual from Alibaba, and three Redis employees who look after the contributions that are coming in from the Redis open source community members who want to contribute those things.

“And then we reconcile what we want from a commercial interest perspective, either upstream, or things that, frankly, may have been commoditized and that we want to push downstream into the open source offering,” Hall said. “And so the thing that you're asking about is sort of my core existential challenge all the time, that is figuring out where we're going from a commercial perspective. What do we want to land there first? And how can we create a conveyor belt of commercial opportunity that keeps us in business as a software company, creating differentiation against potential competitors that show up? And then over time, making sure that those things that do become commoditized, or maybe are not as differentiating anymore, I want to release those to the open source community. But this upstream/downstream kind of challenge is something that we're constantly working through.”

Blackpepper was an open source Redis user initially, although its journey began with Memcached to speed up data. It then migrated to Redis when it moved to the AWS cloud, Russell said.

Listen to the Podcast

The Redis TNS Makers podcast goes on to look at the use of AI/ML in the platform, the acquisition of RESP.app, the importance of JSON and RediSearch, and where Redis is headed in the future.
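Hall describes the Bloom filter as “a probabilistic data structure.” For readers who have not met one, here is a minimal pure-Python sketch of the idea. It is a teaching toy, not the Redis implementation or its command set; the class name, the default sizes, and the SHA-256-based hashing scheme are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Set-membership sketch: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        """Set every bit position derived from the item."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """True means "possibly present"; False means "definitely absent"."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

The trade-off is that the filter answers membership questions in a fixed, small amount of memory, at the price of occasionally answering "possibly present" for an item that was never added.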
Dec 7, 2022 • 26min

Couchbase’s Managed Database Services: Computing at the Edge

Let’s say you’re a passenger on a cruise ship. Floating in the middle of the ocean, far from reliable Wi-Fi, you wear a device that lets you into your room, that discreetly tracks your movement from the bar to the dinner table to the pool and delivers your drink order wherever you are. You can buy sunscreen or toothpaste or souvenirs in the ship’s stores without touching anything.

If you’re a Carnival Cruise Lines passenger, this is reality right now, in part because of the company’s partnership with Couchbase, according to Mark Gamble, product and solutions marketing director at Couchbase.

Couchbase provides a cloud native NoSQL database technology that's used to power applications for customers including Carnival but also Amadeus, Comcast, LinkedIn, and Tesco.

In Carnival’s case, Gamble said, “they run an edge data center on their ships to power their Ocean Medallion application, which they are super proud of. They use it a lot in their ads, because it provides a personalized service, which is a differentiator for them to their customers.”

In this episode of The New Stack Makers, Gamble spoke to Heather Joslyn, features editor of TNS, about edge computing, 5G, and Couchbase Capella, its Database as a Service (DBaaS) offering for enterprises. This episode of Makers was sponsored by Couchbase.

5G and Offline-First Apps

The goal of edge computing, Gamble told our podcast audience, is to bring data and compute closer to the applications that consume it. This speeds up data processing, he said, “because data doesn't have to travel all the way to the cloud and back.” But it also has other benefits.

“This serves to make applications more reliable, because local data processing sort of removes internet slowness and outages from the equation,” he said.

The innovation of 5G networks has also had a big impact on reducing latency and increasing uptime, Gamble said. “To compare with 4G, things like the average round trip data travel time between the device, and the cell tower is like 15 milliseconds. And with 5G, that latency drops to like two milliseconds. And 5G can support they say, a million devices, within a third of a mile radius, way more than what's possible with 4G.”

But 5G, Gamble said, “really requires edge computing to realize its full potential.”

Increasingly, he said, Couchbase hears interest from its customers in building “offline-first” applications, which can run even in Wi-Fi dead zones. The use cases, he said, are everywhere: “When I pass a fast food restaurant, it's starting to become more common, where you'll see that, instead of just a box you're talking to, there's a person holding a tablet, and they walk down the line, and they're taking orders. And as they come closer to the restaurant, it syncs up with the kitchen. They find that just a better, more efficient way to serve customers. And so it becomes a competitive differentiator for them.”

As part of its Capella product, Couchbase recently announced Capella App Services, a new capability for mobile developers that provides a fully managed backend designed for mobile, Internet of Things (IoT) and edge applications.

“Developers use it to access and sync data between the Database as a Service and their edge devices, as well as it handles authenticating and managing mobile and edge app users,” he said.

Used in conjunction with Couchbase Lite, a lightweight, embedded NoSQL database used with mobile and IoT devices, Capella App Services synchronizes the data between backend and edge devices.

Even for workers in remote areas, “eventually, you have to make sure that data updates are shared with the rest of the ecosystem,” Gamble said. “And that's what App Services is meant to do, as connectivity allows, so during network disruptions in areas with no internet, apps will still continue to operate.”

Check out the rest of the conversation to learn more about edge computing and the challenges Gamble thinks still need to be addressed in that space.
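The “offline-first” pattern Gamble describes, where an app keeps working locally and syncs when a connection returns, can be sketched as a small write queue in Python. This is not the Couchbase Lite or Capella App Services API; the `OfflineFirstQueue` class and the `send` callback are hypothetical names, and a real sync layer would also handle conflicts and durable storage.

```python
from typing import Callable, List


class OfflineFirstQueue:
    """Record changes locally first, then push them when the network allows."""

    def __init__(self, send: Callable[[dict], bool]) -> None:
        # `send` returns True when the backend accepted the change,
        # and False when the device is offline or the call failed.
        self._send = send
        self.local_store: List[dict] = []
        self.pending: List[dict] = []

    def record(self, change: dict) -> None:
        """Apply the change locally right away, whether or not we are online."""
        self.local_store.append(change)
        self.pending.append(change)

    def sync(self) -> int:
        """Push pending changes in order; stop at the first failure."""
        sent = 0
        while self.pending:
            if not self._send(self.pending[0]):
                break
            self.pending.pop(0)
            sent += 1
        return sent
```

In the tablet-ordering example, `record` would be called for each order taken in the drive-through line, and `sync` would run whenever the tablet comes back within range of the restaurant's network.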
Dec 1, 2022 • 16min

Open Source Underpins A Home Furnishings Provider’s Global Ambitions

Wayfair describes itself as “the destination for all things home: helping everyone, anywhere create their feeling of home.” It provides an online platform to acquire home furniture, outdoor decor and other furnishings. It also supports its suppliers so they can use the platform to sell their home goods, explained Natali Vlatko, global lead, open source program office (OSPO) and senior software engineering manager for Wayfair, who was the featured guest in Detroit during KubeCon + CloudNativeCon North America 2022.

“It takes a lot of technical, technical work behind the scenes to kind of get that going,” Vlatko said. This is especially true as Wayfair scales its operations worldwide. The infrastructure must be highly distributed, relying on containerization, microservices, Kubernetes, and especially, open source to get the job done.

“We have technologists throughout the world, in North America and throughout Europe as well,” Vlatko said. “And we want to make sure that we are utilizing cloud native and open source, not just as technologies that fuel our business, but also as the ways that are great for us to work in now.”

Open source has served as a “great avenue” for creating and offering technical services, and to accomplish that, Vlatko said, she assembled the requisite talent: a small team of engineers focused on platform work, advocacy, community management and, internally, compliance with licenses.

About five years ago when Vlatko joined Wayfair, the company had yet to go “full tilt into going all cloud native,” Vlatko said. Wayfair had a hybrid mix of on-premises and cloud infrastructure. After decoupling from a monolith into a microservices architecture, “that journey really began where we understood the really great benefits of microservices and got to a point where we thought, ‘okay, this hybrid model for us actually would benefit our microservices being fully in the cloud,’” Vlatko said.

In late 2020, Wayfair made the decision to “get out of the data centers” and shift operations to the cloud, a move that was completed in October, Vlatko said.

The company culture is such that engineers have room to experiment without major fear of failure by doing a lot of development work in a sandbox environment. “We've been able to create environments that are close to our production environments so that experimentation in sandboxes can occur. Folks can learn as they go without actually fearing failure or fearing a mistake,” Vlatko said. “So, I think experimentation is a really important aspect of our own learning and growth for cloud native. Also, coming to great events like KubeCon + CloudNativeCon and other events [has been helpful]. We're hearing from other companies who've done the same journey and process and are learning from the use cases.”
Nov 30, 2022 • 16min

ML Can Prevent Getting Burned For Kubernetes Provisioning

In the rush to create, provision and manage Kubernetes, proper resource provisioning is often left out. According to StormForge, a company paying, for example, a million dollars a month for cloud computing resources is likely wasting $6 million a year on Kubernetes resources that are left unused. The reasons for this are manifold and can vary. DevOps teams may estimate too conservatively or too aggressively, or simply overspend on resource provisioning. In this podcast with StormForge’s Yasmin Rajabi, vice president of product management, and Patrick Bergstrom, CTO, we look at how to properly provision Kubernetes resources and the associated challenges. The podcast was recorded live in Detroit during KubeCon + CloudNativeCon North America 2022. Almost ironically, the most commonly used Kubernetes resources can even complicate the ability to optimize resources for applications. The processes typically involve Kubernetes resource requests and limits, and predicting how the resources might impact quality of service for pods. Developers deploying an application on Kubernetes often need to set CPU requests, memory requests and other resource limits. “They are usually like ‘I don't know, whatever was there before or whatever the default is,’” Rajabi said. “They are in the dark.” Sometimes, developers might use their favorite observability tool and say “‘we look where the max is, and then take a guess,’” Rajabi said. “The challenge is, if you start from there when you start to scale that out, especially for organizations that are using horizontal scaling with Kubernetes, is that then you're taking that problem and you're just amplifying it everywhere,” Rajabi said.
“And so, when you've hit that complexity at scale, taking a second to look back and say, ‘how do we fix this?’ you don't want to just arbitrarily go reduce resources, because you have to look at the trade-off of how that impacts your reliability.” The process then becomes very hit or miss. “That's where it becomes really complex, when there are so many settings across all those environments, all those namespaces,” Rajabi said. “It's almost a problem that can only be solved by machine learning, which makes it very interesting.” But before organizations learn the hard way what it costs not to automate the optimization of Kubernetes deployments and management, many resources, and the money spent on them, go to waste. “It's one of those things that becomes a bigger and bigger challenge, the more you grow as an organization,” Bergstrom said. Many StormForge customers are deploying into thousands of namespaces and thousands of workloads. “You are suddenly trying to manage each workload individually to make sure it has the resources and the memory that it needs,” Bergstrom said. “It becomes a bigger and bigger challenge.” The process should actually be pain-free when ML is properly implemented. Through StormForge’s partnership with Datadog, it is possible to apply ML to collected historical data, Bergstrom explained. “Then, within just hours of us deploying our algorithm into your environment, we have machine learning that's used two to three weeks’ worth of data to train that can then automatically set the correct resources for your application. This is because we know what the application is actually using,” Bergstrom said. “We can predict the patterns and we know what it needs in order to be successful.”
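To make the idea of setting requests and limits from observed usage more concrete, here is a minimal sketch in Python. It is not StormForge's algorithm, which the episode describes only as machine learning trained on two to three weeks of data. It swaps in a much simpler percentile heuristic, and the function names, the 90th-percentile default and the 20% headroom are all assumptions chosen for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of integer samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_resources(cpu_millicores, memory_mib, pct=90, headroom_pct=20):
    """Suggest Kubernetes-style requests and limits from historical usage.

    Hypothetical heuristic (an assumption, not a vendor method):
    requests sit at the chosen usage percentile, and limits sit at the
    observed peak plus a fixed headroom percentage.
    """
    cpu_limit = math.ceil(max(cpu_millicores) * (100 + headroom_pct) / 100)
    mem_limit = math.ceil(max(memory_mib) * (100 + headroom_pct) / 100)
    return {
        "requests": {
            "cpu": f"{percentile(cpu_millicores, pct)}m",
            "memory": f"{percentile(memory_mib, pct)}Mi",
        },
        "limits": {"cpu": f"{cpu_limit}m", "memory": f"{mem_limit}Mi"},
    }


if __name__ == "__main__":
    # Made-up usage samples for a single workload, for illustration only.
    cpu = [100, 120, 110, 400, 130, 115, 125, 105, 135, 140]
    mem = [256] * 9 + [512]
    print(recommend_resources(cpu, mem))
```

Even this simple version shows the trade-off Rajabi describes: a lower percentile saves money but leaves less room for spikes, and the choice has to be repeated for every workload in every namespace.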
Nov 29, 2022 • 27min

What’s the Future of Feature Management?

Feature management isn’t a new idea, but lately it’s a trend that’s picked up speed. Analysts like Forrester and Gartner have cited adoption of the practice as being, respectively, “hot” and “the dominant approach to experimentation in software engineering.” A study released in November, sponsored by LaunchDarkly, the feature management platform, and conducted by Wakefield Research, found that 60% of the 1,000 software and IT professionals surveyed started using feature flags only in the past year. At the heart of feature management are feature flags, which give organizations the ability to turn features on and off without having to redeploy an entire app. Feature flags allow organizations to test new features and to control things like access to premium versions of a customer-facing service. An overall feature management practice that includes feature flags allows organizations “to release progressively any new feature to any segment of users, any environment, any cohort of customers in a controlled manner that really reduces the risk of each release,” said Ravi Tharisayi, senior director of product marketing at LaunchDarkly, in this episode of The New Stack Makers podcast. Tharisayi talked to The New Stack’s features editor, Heather Joslyn, about the future of feature management, on the eve of the company’s latest Trajectory user conference. This episode of Makers was sponsored by LaunchDarkly.
Streamlining Management, Saving Money
The participants in the new survey worked at companies of at least 200 employees, and nearly all of those that use feature flags (98%) said they believe the flags save their organizations money and demonstrate a return on investment. Furthermore, 70% said that their company views feature management as either a mission-critical or a high-priority investment. Fielding the annual survey, Tharisayi said, has offered a window into how organizations are using feature flags.
Fifty-five percent of customers in the 2022 survey said they use feature flags as long-term operational controls: for API rate limiting, for instance, to prioritize certain API calls in high-traffic situations. The second most common use, cited by 47% of users, was for entitlements, “managing access to different types of plans, premium plans versus other plans, for example,” Tharisayi said. “This is really a powerful capability because of this ability to allow product managers or other personas to manage who has access to certain features to certain plans, without having to have developers be involved,” he said. “Previously, that required a lot of developer involvement.”
Experimentation, Metrics, Cultural Shifts
LaunchDarkly, Tharisayi said, has been investing in and improving its platform’s experimentation and measurement capabilities: “At the core of that is this notion that experimentation can be a lot more successful when it's tightly integrated to the developer workflow.” As an example, he pointed to CCP Games, makers of the gaming platform EVE Online, which serves millions of players. “They were recently thinking through how to evolve their recommendation engine, because they wanted this engine to recommend actions for their gamers that will hopefully increase their ultimate North Star metric,” he said, referring to the company’s tracking of how much time gamers spend with its games. By using LaunchDarkly’s platform, CCP was able to run A/B tests and increase gamers’ session lengths and engagement. “So that's the kind of capability that we think is going to be an increasing priority,” Tharisayi said. As feature management matures and standardizes, he pointed to the adoption of DevOps as both a model and a cautionary tale.
“When it comes to cultural shifts, like DevOps or feature management that require teams to work in a different way, oftentimes there can be early success with a small team,” Tharisayi said. “But then there can be some cultural and process barriers as you're trying to standardize to the team level and multi-team level, before figuring out the kinks in deploying it at an organization-wide level.” He added, “That's one of the trends that we observed a little bit in this survey, is that there are some cultural elements to getting success at scale, with something like feature management and the opportunity as an industry to support organizations as they're making that quest to standardize a practice like this, like any other cultural practice.” Check out the full episode for more on the survey and on what’s next for feature management.
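For readers new to the mechanics, the following Python sketch illustrates the three uses of feature flags mentioned in this episode: an on/off switch that needs no redeploy, a progressive rollout to a share of users, and plan-based entitlements. It is a toy illustration, not LaunchDarkly's SDK or API. The `Flag` class, the `bucket` and `is_enabled` functions, and the flag and plan names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Flag:
    """Hypothetical flag definition, stored outside the deployed code."""
    enabled: bool = False          # global on/off switch
    rollout_percent: int = 0       # share of users (0-100) who get the feature
    allowed_plans: set = field(default_factory=set)  # empty set means any plan


def bucket(flag_key, user_id):
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flags, flag_key, user_id, plan):
    """Decide whether this user sees the feature guarded by flag_key."""
    flag = flags.get(flag_key)
    if flag is None or not flag.enabled:
        return False  # unknown or switched-off flags fail closed
    if flag.allowed_plans and plan not in flag.allowed_plans:
        return False  # entitlement check: plan is not permitted
    # Progressive rollout: only users in the low buckets get the feature.
    return bucket(flag_key, user_id) < flag.rollout_percent


if __name__ == "__main__":
    flags = {
        "new-recommendations": Flag(enabled=True, rollout_percent=25),
        "priority-api": Flag(enabled=True, rollout_percent=100,
                             allowed_plans={"premium"}),
    }
    print(is_enabled(flags, "priority-api", "user-1", "premium"))
    print(is_enabled(flags, "priority-api", "user-1", "free"))
```

Hashing the flag key together with the user ID keeps each user's experience stable between requests, while raising `rollout_percent` in the flag store widens the audience without shipping new code.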
