Simba Khadder, the founder and CEO of Featureform and a machine learning expert, dives deep into the evolution of feature stores and their intersection with vector stores. He explains the significance of embeddings for recommender systems and discusses how personalization enhances user experiences with large language models. Simba also addresses the challenges in managing feature pipelines and the trade-offs between system complexity and reliability. Tune in to learn about the latest innovations shaping the MLOps landscape!
Podcast summary created with Snipd AI
Quick takeaways
Embeddings are vital for recommender systems, offering a holistic representation of users and items that helps models navigate sparse data effectively.
Feature stores and vector stores serve distinct functions, with feature stores managing data pipelines and vector stores facilitating real-time nearest neighbor search.
Personalization in large language models relies on injecting user-specific features into prompts, making outputs more relevant and tailored to each user.
Deep dives
Understanding the Role of Embeddings in Recommender Systems
Embeddings are crucial for building effective recommender systems, transforming how items and users are represented in machine learning models. By creating a holistic view of users and items, embeddings let algorithms derive complex relationships between products based on user interactions. For instance, in an e-commerce dataset, a model trained only on user-item purchase data learned flavor-profile similarities, placing Coke near Diet Coke and Cherry Coke near Coke Zero. This kind of learned representation lets algorithms navigate sparse data effectively, ultimately improving recommendation quality.
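The idea above can be sketched in a few lines: once items have embedding vectors, similarity between them is just a distance computation. The vectors below are made up purely for illustration; in a real system they would come from a model trained on user-item interactions.

```python
# Toy sketch: measuring item similarity with cosine similarity over
# (hypothetical) learned embeddings. Values are illustrative only.
from math import sqrt

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

embeddings = {
    "coke":      [0.9, 0.1, 0.8],
    "diet_coke": [0.8, 0.2, 0.9],
    "coke_zero": [0.7, 0.3, 0.9],
    "detergent": [0.1, 0.9, 0.0],
}

def most_similar(item):
    # Rank all other items by cosine similarity to `item`.
    others = (k for k in embeddings if k != item)
    return max(others, key=lambda k: cosine(embeddings[item], embeddings[k]))

print(most_similar("coke"))  # another cola, not detergent
```

With embeddings learned from purchase data, the nearest neighbors of a product end up being its substitutes, which is the behavior the e-commerce example describes.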
The Intersection of Vector Stores and Feature Stores
Vector stores and feature stores are often conflated, but they serve distinct purposes in managing data for machine learning. Vector stores specialize in efficient nearest neighbor lookups over embeddings, making them essential for real-time applications that require speed and accuracy. Feature stores, by contrast, manage and pipeline features derived from raw data, letting data scientists experiment and deploy efficiently while maintaining governance and monitoring. The ongoing confusion stems from the growing recognition that embeddings are themselves features, leading teams to ask how to integrate both types of stores efficiently.
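The distinction can be made concrete with two toy in-memory stand-ins: a vector store answers "what is nearest to this embedding?", while a feature store answers "what is the current value of this named feature for this entity?". All names and data here are illustrative.

```python
# Minimal sketch of the two roles. Both stores are toy dictionaries;
# real systems add indexing, freshness guarantees, and governance.
from math import dist  # Euclidean distance (Python 3.8+)

# Vector store: answers nearest-neighbor queries over embeddings.
vector_index = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}

def nearest(query):
    return min(vector_index, key=lambda k: dist(vector_index[k], query))

# Feature store: serves the latest value of a named feature per entity.
feature_values = {("avg_order_value", "user_42"): 37.5}

def get_feature(name, entity):
    return feature_values[(name, entity)]

print(nearest((0.9, 1.2)))                       # similarity lookup
print(get_feature("avg_order_value", "user_42")) # keyed feature lookup
```

The query shapes differ: the vector store is queried by an embedding, the feature store by a feature name and entity key, which is why the two systems complement rather than replace each other.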
The Personalization Imperative in LLMs
Personalization is becoming increasingly essential in the realm of Large Language Models (LLMs), necessitating the integration of user-specific features into prompts for improved interaction. By treating personalization variables as features, it becomes possible to optimize how LLMs deliver responses tailored to individual users. This shift allows for a more nuanced understanding of context within LLM inputs, enhancing the relevance and accuracy of the generated outputs. As LLM applications mature, the focus will shift towards refining these feature sets, aiming for richer interactions that leverage both user data and model capabilities.
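Treating personalization variables as features can be sketched as a prompt template filled from a per-user feature lookup. The feature names, user data, and template below are hypothetical; in practice the lookup step would hit a feature store.

```python
# Sketch: user-specific features templated into an LLM prompt.
# The feature dictionary and template are illustrative stand-ins.
user_features = {
    "user_42": {"plan": "pro", "favorite_topic": "MLOps"},
}

PROMPT_TEMPLATE = (
    "You are a helpful assistant. The user is on the {plan} plan and "
    "is most interested in {favorite_topic}. Answer accordingly.\n"
    "User question: {question}"
)

def build_prompt(user_id, question):
    feats = user_features[user_id]  # in practice: a feature-store lookup
    return PROMPT_TEMPLATE.format(question=question, **feats)

prompt = build_prompt("user_42", "What should I learn next?")
print(prompt)
```

Framed this way, prompt personalization inherits the same concerns as any other feature pipeline: freshness, lineage, and versioning of the values that land in the prompt.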
Navigating Feature Store Challenges
Feature stores face specific challenges as they bridge the gap between data science and engineering, primarily concerning the effective management of feature pipelines. Data scientists often grapple with organizational hurdles, such as ensuring the accuracy, lineage, and governance of features before they reach production. The introduction of feature stores aims to streamline this process, allowing data scientists to self-serve while providing robust tools for versioning, monitoring, and scaling. Ultimately, feature stores are critical for easing the friction that arises from the collaborative efforts of data science and engineering teams.
Future Directions for Contextual Retrieval in LLMs
As the field of contextual retrieval evolves, the integration of multi-dimensional data and sophisticated prompt engineering will become increasingly essential for optimizing LLM performance. The focus will shift from merely retrieving relevant information based on vector similarities to understanding the rich tapestry of available signals for personalized context. This development could lead to models that integrate various data sources, such as traditional databases alongside vector stores, to build more meaningful and contextually relevant outputs. Thus, the future landscape of LLMs lies in evolving the methodologies for contextual retrieval and embedding utilization in crafting intelligent, user-centric applications.
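The multi-source retrieval described above can be sketched as merging a vector-similarity lookup with a structured lookup from a traditional store into one context block. The documents, user table, and scoring are all illustrative.

```python
# Hedged sketch: combine vector retrieval with a structured DB lookup
# to build LLM context. All data here is made up for illustration.
from math import dist

doc_vectors = {"returns_policy": (0.1, 0.9), "shipping_faq": (0.8, 0.2)}
docs = {
    "returns_policy": "Returns accepted within 30 days.",
    "shipping_faq": "Standard shipping takes 3-5 days.",
}
user_db = {"user_42": {"country": "DE", "tier": "gold"}}  # "traditional" store

def build_context(user_id, query_vec):
    # Signal 1: nearest document by embedding distance (vector store role).
    best_doc = min(doc_vectors, key=lambda k: dist(doc_vectors[k], query_vec))
    # Signal 2: structured user attributes (relational store role).
    profile = user_db[user_id]
    return (f"[doc] {docs[best_doc]} "
            f"[user] country={profile['country']}, tier={profile['tier']}")

print(build_context("user_42", (0.2, 0.8)))
```

Even this toy version shows the shape of the idea: vector similarity picks relevant text, while structured lookups contribute personalization signals that similarity search alone cannot provide.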
Simba Khadder is the Founder & CEO of Featureform. He started his ML career in recommender systems, where he architected a multi-modal personalization engine that powered hundreds of millions of users' experiences.
Unpacking 3 Types of Feature Stores // MLOps Podcast #265 with Simba Khadder, Founder & CEO of Featureform.
// Abstract
Simba dives into how feature stores have evolved and how they now intersect with vector stores, especially in the world of machine learning and LLMs. He breaks down what embeddings are, how they power recommender systems, and why personalization is key to improving LLM prompts. Simba also sheds light on the difference between feature and vector stores, explaining how each plays its part in making ML workflows smoother. Plus, we get into the latest challenges and cool innovations happening in MLOps.
// Bio
Simba Khadder is the Founder & CEO of Featureform. After leaving Google, Simba founded his first company, TritonML. His startup grew quickly, and Simba and his team built ML infrastructure that handled over 100M monthly active users. He brought those learnings to Featureform's virtual feature store, which turns your existing infrastructure into a feature store. He's also an avid surfer, a mixed martial artist, a published astrophysicist for his work on finding Planet 9, and he ran the SF marathon in basketball shoes.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: featureform.com
BigQuery Feature Store // Nicolas Mauti // MLOps Podcast #255: https://www.youtube.com/watch?v=NtDKbGyRHXQ&ab_channel=MLOps.community
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Simba on LinkedIn: https://www.linkedin.com/in/simba-k/
Timestamps:
[00:00] Simba's preferred coffee
[00:08] Takeaways
[02:01] Coining the term 'Embedding'
[07:10] Dual Tower Recommender System
[10:06] Complexity vs Reliability in AI
[12:39] Vector Stores and Feature Stores
[17:56] Value of Data Scientists
[20:27] Scalability vs Quick Solutions
[23:07] MLOps vs LLMOps Debate
[24:12] Feature Stores' current landscape
[32:02] ML lifecycle challenges and tools
[36:16] Feature Stores bundling impact
[42:13] Feature Stores and BigQuery
[47:42] Virtual vs Literal Feature Store
[50:13] Hadoop Community Challenges
[52:46] LLM data lifecycle challenges
[56:30] Personalization in prompting usage
[59:09] Contextualizing company variables
[1:03:10] DSPy framework adoption insights
[1:05:25] Wrap up