AI Engineering Podcast

Latest episodes

Jan 22, 2025 • 1h 3min

Optimize Your AI Applications Automatically With The TensorZero LLM Gateway

Summary
In this episode of the AI Engineering Podcast, Viraj Mehta, CTO and co-founder of TensorZero, talks about the use of LLM gateways for managing interactions between client-side applications and various AI models. He highlights the benefits of using such a gateway, including standardized communication, credential management, and potential features like request-response caching and audit logging. The conversation also explores TensorZero's architecture and functionality in optimizing AI applications by managing structured data inputs and outputs, as well as the challenges and opportunities in automating prompt generation and maintaining interaction history for optimization purposes.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. Visit aiengineeringpodcast.com/cognee to learn more and elevate your AI apps and agents.
- Your host is Tobias Macey and today I'm interviewing Viraj Mehta about the purpose of an LLM gateway and his work on TensorZero

Interview
- Introduction
- How did you get involved in machine learning?
- What is an LLM gateway?
- What purpose does it serve in an AI application architecture?
- What are some of the different features and capabilities that an LLM gateway might be expected to provide?
- Can you describe what TensorZero is and the story behind it?
- What are the core problems that you are trying to address with TensorZero and for whom?
- One of the core features that you are offering is management of interaction history. How does this compare to the "memory" functionality offered by e.g. LangChain, Cognee, Mem0, etc.?
- How does the presence of TensorZero in an application architecture change the ways that an AI engineer might approach the logic and control flows in a chat-based or agent-oriented project?
- Can you describe the workflow of building with TensorZero and some specific examples of how it feeds back into the performance/behavior of an LLM?
- What are some of the ways in which the addition of TensorZero or another LLM gateway might have a negative effect on the design or operation of an AI application?
- What are the most interesting, innovative, or unexpected ways that you have seen TensorZero used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on TensorZero?
- When is TensorZero the wrong choice?
- What do you have planned for the future of TensorZero?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.

Links
- TensorZero
- LLM Gateway
- LiteLLM
- OpenAI
- Google Vertex
- Anthropic
- Reinforcement Learning
- Tokamak Reactor
- Viraj RLHF Paper
- Contextual Dueling Bandits
- Direct Preference Optimization
- Partially Observable Markov Decision Process
- DSPy
- PyTorch
- Cognee
- Mem0
- LangGraph
- Douglas Hofstadter
- OpenAI Gym
- OpenAI o1
- OpenAI o3
- Chain Of Thought

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Dec 16, 2024 • 55min

Harnessing The Engine Of AI

Ron Green, co-founder and CTO of Kung Fu AI, dives into the evolving AI landscape and the complexities of generative AI engines. He discusses the limitations of large language models and the critical need for human oversight and robust data management. Ron highlights innovative methods like Retrieval-Augmented Generation and the significance of targeted, domain-specific AI solutions. He expresses optimism for AI's future, predicting major advancements in the next 20 years that integrate seamlessly into everyday applications.
Dec 1, 2024 • 54min

The Complex World of Generative AI Governance

Jim Olson, CTO of ModelOp, specializes in generative AI governance and regulations. He discusses the importance of monitoring and inventory for compliance in high-risk areas like healthcare. Olson emphasizes the need for technical controls to manage data governance and the continuous monitoring of AI models to detect issues. He addresses the balance between innovation and regulation, particularly in light of evolving EU regulations, and highlights the necessity of building trust through effective governance solutions.
Nov 25, 2024 • 55min

Building Semantic Memory for AI With Cognee

Vasilije Markovich, a data engineer and AI specialist from Montenegro, discusses enhancing large language models with memory. He highlights the challenges of context window limitations and forgetting in LLMs, introducing hierarchical memory to improve performance. Vasilije dives into his creation, Cognee, which manages semantic memory, emphasizing its potential applications and the blend of cognitive science with data engineering. He shares insights from building an AI startup, the importance of user feedback, and future developments in open-source AI technology.
Nov 22, 2024 • 53min

The Impact of Generative AI on Software Development

Tanner Burson, VP of Engineering at Prismatic, dives into the transformative effects of generative AI on software development. He discusses how AI is reshaping developer roles and productivity, fueled by tools like GitHub's Copilot. Tanner outlines both the opportunities and challenges AI presents, emphasizing the crucial need for human oversight to ensure code quality. He also explores the microunits of AI integration in workflows, the growing importance of mentorship, and the balance between innovation and practical engineering skills in an AI-driven future.
Nov 11, 2024 • 1h 16min

ML Infrastructure Without The Ops: Simplifying The ML Developer Experience With Runhouse

Donnie Greenberg, Co-founder and CEO of Runhouse and former product lead for PyTorch at Meta, shares insights on simplifying machine learning infrastructure. He discusses the challenges of traditional MLOps tools and presents Runhouse's serverless approach that reduces complexity in moving from development to production. Greenberg emphasizes the importance of flexible, collaborative environments and innovative fault tolerance in ML workflows. He also touches on the need for integration with existing DevOps practices to meet the evolving demands of AI and ML.
Nov 11, 2024 • 54min

Building AI Systems on Postgres: An Inside Look at pgai Vectorizer

Avthar Sewrathan, Head of AI at Timescale and expert in database infrastructure, shares insights into the innovative pgai Vectorizer toolchain. He reveals how this tool enables seamless management of AI workflows in Postgres, emphasizing the importance of keeping vector data updated. The discussion covers optimizing embedding strategies, the balance between user-friendliness and customization for developers, and the future of AI integration within databases. Avthar also touches on challenges in content moderation and semantic search, highlighting the need for continuous improvement and collaboration in the open-source community.
Oct 28, 2024 • 58min

Running Generative AI Models In Production

Philip Kiely, an AI infrastructure expert at Baseten, dives into the complexities of running generative AI models in production. He shares insights on the importance of selecting the right model based on product requirements and discusses key deployment strategies, including architecture and performance monitoring. He also explores challenges like model quantization and the balance between open-source and proprietary models. Philip highlights future trends such as local inference and emphasizes the need for compliance in sectors like healthcare.
Sep 10, 2024 • 59min

Enhancing AI Retrieval with Knowledge Graphs: A Deep Dive into GraphRAG

Philip Rathle, CTO of Neo4j and an expert in knowledge graphs, dives deep into how GraphRAG revolutionizes AI retrieval systems. He explains how this method blends knowledge graphs with vector similarity for clearer, more accurate AI outputs. Rathle discusses the technical aspects of data modeling and the importance of structured data in addressing traditional retrieval challenges. The conversation also touches on real-world applications of GraphRAG across various industries, highlighting its potential to transform AI interactions.
Sep 2, 2024 • 42min

Harnessing Generative AI for Effective Digital Advertising Campaigns

Summary
In this episode of the AI Engineering Podcast, Praveen Gujar, Director of Product at LinkedIn, talks about the applications of generative AI in digital advertising. He highlights the key areas of digital advertising, including audience targeting, content creation, and ROI measurement, and delves into how generative AI is revolutionizing these aspects. Praveen shares successful case studies of generative AI in digital advertising, including campaigns by Heinz, the Barbie movie, and Maggi, and discusses the potential pitfalls and risks associated with AI-powered tools. He concludes with insights into the future of generative AI in digital advertising, highlighting the importance of cultural transformation and the synergy between human creativity and AI.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Praveen Gujar about the applications of generative AI in digital advertising

Interview
- Introduction
- How did you get involved in machine learning?
- Can you start by defining "digital advertising" for the scope of this conversation?
- What are the key elements/characteristics/goals of digital advertising?
- In the world before generative AI, what did a typical end-to-end advertising campaign workflow look like?
- What are the stages of that workflow where generative AI is proving to be most useful?
- How do the current limitations of generative AI (e.g. hallucinations, non-determinism) impact the ways in which it can be used?
- What are the technological and organizational systems that need to be implemented to effectively apply generative AI in public-facing applications that are so closely tied to brand/company image?
- What are the elements of user education/expectation setting that are necessary when working with marketing/advertising personnel to help avoid damage to the brands?
- What are some examples of applications for generative AI in digital advertising that have gone well?
- Any that have gone wrong?
- What are the most interesting, innovative, or unexpected ways that you have seen generative AI used in digital advertising?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on digital advertising applications of generative AI?
- When is generative AI the wrong choice?
- What are your future predictions for the use of generative AI in digital advertising?

Contact Info
- Website
- LinkedIn

Parting Question
- From your perspective, what is the biggest barrier to adoption of machine learning today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.

Links
- Generative AI
- LLM == Large Language Model
- Dall-E
- RLHF == Reinforcement Learning from Human Feedback

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
