
The New Stack Podcast

Latest episodes

Mar 28, 2024 • 26min

LLM Observability: The Breakdown

LLM observability focuses on maximizing the utility of large language models (LLMs) by monitoring key metrics and signals. Alex Williams, founder and publisher of The New Stack, and Janakiram MSV, principal of Janakiram & Associates and an analyst and writer for The New Stack, discuss the emergence of the LLM stack, which encompasses components such as LLMs, vector databases, embedding models, retrieval systems, reranker models, and more. The objective of LLM observability is to ensure that users can extract the desired outcomes from this complex ecosystem.
Similar to infrastructure observability in DevOps and SRE practices, LLM observability aims to provide insight into the LLM stack's performance. This includes monitoring metrics specific to LLMs, such as GPU/CPU usage, storage, model serving, change agents in applications, hallucinations, span traces, relevance, retrieval models, latency, and user feedback. MSV emphasizes the importance of monitoring resource usage, model catalog synchronization with external providers like Hugging Face, vector database availability, and the inference engine's functionality.
He also mentions companies in the LLM observability space, including Datadog, New Relic, SigNoz, Dynatrace, LangChain (LangSmith), Arize.ai (Phoenix), and TruEra, hinting at a deeper exploration in a future episode of The New Stack Makers.
Learn more from The New Stack about LLMs and observability:
Observability in 2024: More OpenTelemetry, Less Confusion
How AI Can Supercharge Observability
Next-Gen Observability: Monitoring and Analytics in Platform Engineering
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
Mar 21, 2024 • 39min

Why Software Developers Should Be Thinking About the Climate

In a conversation on The New Stack Makers, Alex Williams, TNS founder and publisher, and Charles Humble, an industry expert who has served as a software engineer, architect, and CTO and is now a podcaster, author, and consultant at Conissaunce Ltd., discussed why software developers and engineers should care about their impact on climate change. Humble emphasized that building software sustainably starts with better operations, which also leads to cost savings and improved security. He cited past successes in combating environmental problems such as acid rain and the ozone hole through international agreements and emissions reduction strategies.
Despite modest growth since 2010, data centers remain significant electricity consumers, comparable to countries like Brazil. The power-intensive nature of AI models exacerbates these challenges and may lead to scarcity issues. Humble mentioned the Green Software Foundation's Maturity Matrix, which sets goals for carbon-free data centers and longer device lifespans, and discussed the validity of those goals and the role of regulation in achieving them. Overall, software development's environmental impact, primarily carbon emissions, calls for proactive measures and industry-wide collaboration.
Learn more from The New Stack about sustainability:
What is GreenOps? Putting a Sustainable Focus on FinOps
Unraveling the Costs of Bad Code in Software Development
Can Reducing Cloud Waste Help Save the Planet?
How to Build Open Source Sustainability
Mar 14, 2024 • 40min

Nvidia’s Superchips for AI: ‘Radical,’ but a Work in Progress

This New Stack Makers podcast, co-hosted by Alex Williams, TNS founder and publisher, and Adrian Cockcroft, partner and analyst at OrionX.net, discussed Nvidia's GH200 Grace Hopper superchip. Industry expert Sunil Mallya, co-founder and CTO of Flip AI, weighed in on how it is revolutionizing the hardware industry for AI workloads by centralizing GPU communication, reducing networking overhead, and creating a more efficient system. Mallya noted that despite its innovative design, challenges remain in adoption due to interface issues and the need for software to catch up with hardware advancements. However, optimism persists for the future of AI-focused chips, with Nvidia leading the charge in creating large-scale coherent memory systems. Meanwhile, Flip AI, a DevOps large language model, aims to interpret observability data to troubleshoot incidents effectively across various cloud platforms. In discussing the latest chip innovations and the challenges of training large language models, the episode sheds light on the evolving landscape of AI hardware and software integration.
Learn more from The New Stack about Nvidia and the future of chip design:
Nvidia Wants to Rewrite the Software Development Stack
Nvidia GPU Dominance at a Crossroads
Mar 7, 2024 • 30min

Is GitHub Copilot Dependable? These Demos Aren’t Promising

Joan Westenberg, founder of Joan's Index, discusses GitHub Copilot's potential to streamline tasks but highlights its limitations in reliability during tests. Despite assisting with coding, it struggles with complex projects. Westenberg's demos reveal the strengths and weaknesses of Copilot, emphasizing the need for improvement to fulfill its promise as a versatile work companion.
Feb 28, 2024 • 27min

The New Monitoring for Services That Feed from LLMs

This New Stack Makers podcast, co-hosted by Adrian Cockcroft, analyst at OrionX.net, and Alex Williams, TNS founder and publisher, discusses the importance of monitoring services that use large language models (LLMs) and the emergence of tools like LangChain and LangSmith to address this need. Cockcroft, formerly of Netflix and now working with The New Stack, highlights the significance of monitoring AI apps that use LLMs and the challenges posed by slow and expensive LLM API calls. LangChain acts as middleware, connecting LLMs with services, akin to Java Database Connectivity (JDBC). LangChain's monitoring capabilities led to the development of LangSmith, a monitoring tool. Another tool, LangKit by WhyLabs, offers similar functionality but is less integrated. This reflects the typical evolution of open source projects into commercial products. LangChain recently secured funding, indicating growing interest in such monitoring solutions. Cockcroft emphasizes the importance of enterprise-level support and tooling for integrating these solutions into commercial environments. The discussion underscores the evolving landscape of monitoring for LLM-powered services and the emergence of specialized tools to address the associated challenges.
Learn more from The New Stack about LangChain:
LangChain: The Trendiest Web Framework of 2023, Thanks to AI
How Retool AI Differs from LangChain (Hint: It's Automation)
Feb 7, 2024 • 19min

How Platform Engineering Supports SRE

In this New Stack Makers podcast, Martin Parker, a solutions architect for UST, spoke with TNS editor-in-chief Heather Joslyn about the significance of internal developer platforms (IDPs), emphasizing that their benefits extend beyond frontend developers to backend engineers and site reliability engineers (SREs). Parker highlighted the role of IDPs in automating repetitive tasks, allowing SREs to focus on optimizing application performance. Standardization is key, ensuring that observability and monitoring solutions align with best practices and cater to SRE needs. By providing standardized service level indicators (SLIs) and key performance indicators (KPIs), IDPs enable SREs to maintain reliability efficiently. Parker stressed the importance of avoiding siloed solutions by establishing standardized practices and tools for effective monitoring and incident response. Overall, the deployment of IDPs aims to streamline operations, reduce incidents, and enhance organizational value by empowering SREs to concentrate on system maintenance and improvements.
Learn more from The New Stack about UST:
Cloud Cost-Unit Economics- A Modern Profitability Model
Cloud Native Users Struggle to Achieve Benefits, Report Says
Jan 31, 2024 • 15min

Internal Developer Platforms: Helping Teams Limit Scope

In this New Stack Makers podcast, recorded at KubeCon + CloudNativeCon North America, Ben Wilcock, a senior technical marketing architect for Tanzu, spoke with TNS editor-in-chief Heather Joslyn about the challenges organizations face when building internal developer platforms, particularly the issue of scope. He emphasized how difficult it is for platform engineering teams to select and integrate various Kubernetes projects amid a plethora of options. Wilcock highlighted the complexity of tracking software updates, new features, and dependencies once choices are made. He underscored the advantage of a standardized approach to software deployment, which prevents errors caused by diverse mechanisms. Tanzu aims to simplify the adoption of platform engineering and internal developer platforms, offering a turnkey approach with the Tanzu Application Platform. This platform is designed to be flexible, malleable, and functional out of the box. Additionally, Tanzu has introduced the Tanzu Developer Portal, which gives developers a focal point for sharing information and enables faster progress in platform engineering without the need to integrate numerous open source projects.
Learn more from The New Stack about Tanzu and internal developer platforms:
VMware Unveils a Pile of New Data Services for Its Cloud
VMware Expands Tanzu into a Full Platform Engineering Environment
VMware Targets the Platform Engineer
Jan 23, 2024 • 15min

How the Kubernetes Gateway API Beats Network Ingress

In this New Stack Makers podcast, Mike Stefaniak, senior product manager at NGINX, and Kate Osborn, a software engineer at NGINX, discuss the challenges associated with network ingress in Kubernetes clusters and introduce the Kubernetes Gateway API as a solution. Stefaniak highlights the issues that arise when multiple teams work on the same ingress, leading to friction and incidents. NGINX has also introduced the NGINX Gateway Fabric, which implements the Kubernetes Gateway API as an alternative to network ingress. The Kubernetes Gateway API, proposed four years ago and recently made generally available, offers advantages such as extensibility. It allows referencing policies with custom resource definitions for better validation, avoiding the need for annotations. Each resource has an associated role, enabling clean application of role-based access control policies for enhanced security.
While network ingress is prevalent and mature, the Kubernetes Gateway API is expected to find adoption in greenfield projects initially. It has the potential to unite North-South and East-West traffic, offering a role-oriented API for comprehensive control over cluster traffic. The episode encourages exploring the Kubernetes Gateway API and engaging with the community to contribute to its development.
Learn more from The New Stack about NGINX and the open source Kubernetes Gateway API:
Kubernetes API Gateway 1.0 Goes Live, as Maintainers Plan for The Future
API Gateway, Ingress Controller or Service Mesh: When to Use What and Why
Ingress Controllers or the Kubernetes Gateway API? Which is Right for You?
Jan 17, 2024 • 25min

What You Can Do with Vector Search

TNS publisher Alex Williams spoke with Ben Kramer, co-founder and CTO of Monterey.ai, and Cole Hoffer, senior software engineer at Monterey.ai, about how the company uses vector search to analyze user voices, feedback, reviews, bug reports, and support tickets from various channels to provide product development recommendations. Monterey.ai connects customer feedback to the development process, bridging customer support and leadership to align with user needs. Figma and Comcast are among the companies using this approach. In this interview, Kramer discussed the challenges of building products based on large language models (LLMs) and the importance of diverse skills in AI web companies. He also described how Monterey.ai employs Zilliz for vector search, leveraging Milvus, an open source vector database. Kramer highlighted Zilliz's flexibility, its underlying Milvus technology, and its choice of algorithms for semantic search. The decision to choose Zilliz was influenced by its performance in the company's use case, its privacy and security features, and its ease of integration into the company's private network. The cloud-managed solution and Zilliz's ability to meet their needs were crucial factors for Monterey.ai, given its small team and its preference to avoid managing infrastructure.
Learn more from The New Stack about Zilliz and vector database search:
Improving ChatGPT’s Ability to Understand Ambiguous Prompts
Create a Movie Recommendation Engine with Milvus and Python
Using a Vector Database to Search White House Speeches
Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/
Jan 10, 2024 • 16min

How Ethical Hacking Tricks Can Protect Your APIs and Apps

TNS host Heather Joslyn sits down with Ron Masas to discuss the trade-offs involved in creating fast, secure applications and APIs. He notes that documentation and validation are commonly neglected, which leads to vulnerabilities. Weak authorization is a recurring problem, with instances where changing an invoice ID could expose another user's data.
Masas, an ethical hacker, highlights the risk posed by "zombie" APIs: applications that have fallen into disuse but remain potential targets. He suggests investigating frameworks, checking default configurations, and maintaining robust logging to enhance security. Collaboration between developers and security teams is crucial; "security champions" within development teams and nuanced communication about vulnerabilities from security teams are essential elements of robust cybersecurity.
The podcast also covers case studies involving TikTok and Digital Ocean, Masas's views on AI and development, and anticipated security challenges.
Learn more from The New Stack about Imperva and API security:
What Developers Need to Know about Business Logic Attacks
Why Your APIs Aren’t Safe — and What to Do about It
The Limits of Shift-Left: What’s Next for Developer Security
