In this podcast, Rohit Agarwal, CEO of Portkey.ai, discusses designing for forward compatibility in Gen AI. He explores advancements in generative AI and the importance of a tooling layer. The podcast also covers the evolving landscape of software engineering tools, the ops layer for Gen AI applications, challenges in using open-source models, the technology stack in app production, the importance of an AI gateway, and the challenges of open-sourcing a product.
Podcast summary created with Snipd AI
Quick takeaways
The importance of an AI gateway in handling error scenarios, load balancing, and routing LLM calls to different models for improved performance and reliability.
The significance of data security and compliance in deploying LLMs, including PII anonymization, data residency, and implementing measures to safeguard against unauthorized access and potential financial loss.
The need for forward compatibility in adopting LLM technology, achieved through the use of an AI gateway to facilitate phased testing, gradual deployment, and seamless transitions between different LLM models and prompts while mitigating the risk of investing in outdated models.
Deep dives
High-Level Overview of the MLOps Community Podcast
The podcast episode features an interview with Rohit Agarwal, the founder of Portkey.ai, discussing their approach to MLOps and the importance of production-grade LLMs. They explore topics like the AI gateway, which is a central component for routing LLM calls, and the four pillars it should have. They also touch on forward compatibility, the challenges of data security and compliance, and the decoupling of the AI gateway from the code. The episode provides insights into building a complete MLOps stack and the potential of open-source contributions to the AI gateway.
Summarization, Q&A, and Generative Use Cases
Rohit explains that the three most common applications of LLMs in production are summarization, Q&A, and generative use cases. Summarization involves converting lengthy texts or videos into concise summaries, while Q&A focuses on using RAG to enable knowledge-based interactions. Generative use cases encompass tasks like article writing, sentiment analysis, and grammar correction. These applications are driving ROI for businesses and providing valuable insights to users.
Addressing Error Rates and Scaling Challenges
One of the main challenges with LLMs is dealing with error rates and scaling issues. Rohit discusses the need for an AI gateway to handle error scenarios, fallbacks, load balancing, and routing LLM calls to different models. He also emphasizes the importance of reducing error rates through evaluation, testing, and retry mechanisms. Furthermore, he acknowledges the scaling challenges faced by enterprises using open source models and the necessity for a robust tooling layer to handle spikes and troughs in traffic.
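The fallback-and-retry behavior described above can be sketched in a few lines. This is a minimal illustration, not Portkey's implementation: the provider names and the `ProviderError` type are hypothetical, and a real gateway would add load balancing, timeouts, and logging.

```python
import time


class ProviderError(Exception):
    """Raised when an LLM provider call fails (rate limit, timeout, 5xx)."""


def call_with_fallback(prompt, providers, max_retries=2, backoff=0.5):
    """Try each provider in order; retry transient errors with backoff.

    `providers` is a list of (name, callable) pairs; each callable takes a
    prompt and returns a completion string, or raises ProviderError.
    """
    last_error = None
    for name, call in providers:
        for attempt in range(max_retries + 1):
            try:
                return name, call(prompt)
            except ProviderError as err:
                last_error = err
                # Exponential backoff before retrying the same provider.
                time.sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"all providers failed: {last_error}")
```

Centralizing this logic in a gateway, rather than scattering retry loops through application code, is what lets the same policy apply to every LLM call.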
Data Anonymization, Data Residency, and DDoS Protection
The discussion delves into the crucial aspects of data security and compliance when deploying LLMs. Rohit highlights the need for PII anonymization to protect sensitive data and ensure compliance. He also explains the importance of data residency, especially when adhering to GDPR regulations. Additionally, he addresses the risks of DDoS attacks and the implementation of measures like Cloudflare solutions and virtual keys to safeguard against unauthorized access and potential financial loss.
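A minimal sketch of the PII anonymization idea: scrub identifiable fields out of a prompt before it leaves your infrastructure, keeping a local mapping so responses can be de-anonymized. The regex patterns here are illustrative only; production systems typically use dedicated PII-detection models or services rather than hand-rolled regexes.

```python
import re

# Illustrative patterns only: real PII detection needs far broader coverage
# (names, addresses, IDs) than two regexes can provide.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def anonymize(text):
    """Return (scrubbed_text, mapping) so redactions can be reversed locally."""
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token)
    return text, mapping
```

The mapping never leaves the caller's environment, so the LLM provider only ever sees placeholder tokens.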
The Concept of Forward Compatibility
Forward compatibility is identified as a major consideration when adopting LLM technology. Rohit describes it as the ability to keep up with evolving models, frameworks, and databases without disrupting existing production systems. He proposes the use of an AI gateway to facilitate phased testing, gradual deployment, and seamless transitions between different LLM models and prompts. The AI gateway acts as a decoupled layer, allowing rapid experimentation, evaluation, and adaptation of the LLM stack while mitigating the risk of investing in potentially outdated models.
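The decoupling Rohit describes can be pictured as model choice living in configuration rather than code: a gradual rollout or a model swap becomes a config change behind the gateway. A rough sketch, with entirely hypothetical model names and weights:

```python
import random

# Routing lives in config, not code: shifting traffic to a new model is a
# one-line weight change, not a redeploy. Model names here are placeholders.
ROUTE_CONFIG = {
    "summarize": [
        {"model": "stable-model-v1", "weight": 0.9},     # current production model
        {"model": "candidate-model-v2", "weight": 0.1},  # canary traffic slice
    ],
}


def pick_model(task, config=ROUTE_CONFIG):
    """Weighted random routing: most traffic stays on the stable model while
    a small slice exercises the candidate for evaluation before full rollout."""
    routes = config[task]
    models = [r["model"] for r in routes]
    weights = [r["weight"] for r in routes]
    return random.choices(models, weights=weights, k=1)[0]
```

Because application code only names the task, not the model, retiring an outdated model is invisible to callers.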
The Challenges of Open Sourcing Decisions
Rohit discusses the challenges and considerations around open sourcing Portkey's components. He explores the balance between providing value to the community and preserving the uniqueness of the product. The decision to open source the AI gateway, Rubeus, is mentioned, along with the potential for collaboration with other security and compliance-focused companies. The episode concludes with a reflection on the dynamic nature of the LLM landscape and the ongoing commitment to finding the best solutions for users and enterprises.
MLOps podcast #189 with Rohit Agarwal, CEO of Portkey.ai, Designing for Forward Compatibility in Gen AI.
// Abstract
For two whole years of working with a large LLM deployment, I always felt uncomfortable. How is my system performing? Are my users liking the outputs? Who needs help? Probabilistic systems can make this really hard to understand. In this talk, we'll discuss practical & implementable items to secure your LLM system and gain confidence while deploying to production.
// Bio
Rohit is the Co-founder and CEO of Portkey.ai, an FMOps stack for monitoring, model management, compliance, and more. Previously, he headed Product & AI at Pepper Content, which has served ~900M generations on LLMs in production.
Having seen large LLM deployments in production, he's always happy to help companies build their infra stacks on FM APIs or Open-source models.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: https://portkey.ai
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Rohit on LinkedIn: https://www.linkedin.com/in/1rohitagarwal/
Timestamps:
[00:00] Rohit's preferred coffee
[00:15] Takeaways
[03:22] Please like, share, and subscribe to our MLOps channels!
[05:16] Rohit's current work
[06:37] The Portkey landscape
[09:13] Compute unit is no longer a Cloud resource, it's a Foundational Model
[11:09] Hang-ups at high-scale models and how to combat them
[15:22] Complexity of the Apps evolving
[19:54] Rohit's working relationships with the agents
[22:52] Fine-tuning reliability
[24:38] Small language models can outperform larger ones
[26:38] Market map at Portkey
[34:37] AI Gateway
[37:59] Worker Bee and Queen Bee
[39:27] Security and Compliance
[43:11] Idea of Data Mesh
[45:57] Forward compatibility
[49:59] Decoupling AI Gateway from the code
[56:05] Hardest design decisions to make since creating Portkey
[58:52] Wrap up