The Future of Feature Stores and Platforms // Mike Del Balso & Josh Wills // # 186
Oct 31, 2023
Mike Del Balso, CEO of Tecton, and Josh Wills, Angel Investor, discuss the importance of feature stores and platforms for ML applications. They explore challenges in operationalizing data pipelines, integrating offline and online data stores, and templatizing workflows for fraud detection and recommendation systems. The concept of gamification in feature creation is also explored.
Feature stores play a vital role in ML operations by ensuring efficient data access, reusability, and consistency for multiple ML use cases.
Building and scaling feature platforms can be challenging due to complexities in integrating with existing data sources and defining diverse sets of features, requiring specialized online serving layers and data retrieval systems for optimized performance at scale.
Feature platforms are crucial for real-time decision making, providing capabilities to process streaming data, calculate real-time features, and serve them instantly for fraud detection, personalization, and recommendation systems.
Deep dives
The Importance of Feature Stores in ML Operations
Feature stores play a vital role in ML operations by storing and serving the data needed for model training and inference. They ensure efficient data access, reusability, and consistency for multiple ML use cases.
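As a rough illustration of the consistency point above, the sketch below computes features once from a shared definition and writes them to both an offline log (for training) and an online key-value view (for inference). All names here (`FEATURES`, `ToyFeatureStore`, the `cents` field) are hypothetical and chosen only for this example; this is not Tecton's API or any method described in the episode.

```python
from typing import Callable, Dict, List

# Hypothetical feature definitions: feature name -> function of a raw record.
FEATURES: Dict[str, Callable[[dict], float]] = {
    "order_total_usd": lambda r: r["cents"] / 100,
    "is_large_order": lambda r: float(r["cents"] >= 10_000),
}


def compute_features(record: dict) -> Dict[str, float]:
    """Apply every registered feature definition to one raw record."""
    return {name: fn(record) for name, fn in FEATURES.items()}


class ToyFeatureStore:
    """In-memory stand-in for an offline store plus an online store."""

    def __init__(self) -> None:
        self.offline: List[dict] = []       # full history, used for training
        self.online: Dict[str, dict] = {}   # latest values per entity, used for serving

    def ingest(self, entity_id: str, record: dict) -> None:
        # Both stores receive values from the same computation,
        # so training and serving cannot drift apart.
        row = compute_features(record)
        self.offline.append({"entity_id": entity_id, **row})
        self.online[entity_id] = row

    def get_training_rows(self) -> List[dict]:
        return list(self.offline)

    def get_online_features(self, entity_id: str) -> dict:
        return self.online[entity_id]


store = ToyFeatureStore()
store.ingest("user_1", {"cents": 2_500})
store.ingest("user_2", {"cents": 15_000})
store.ingest("user_1", {"cents": 12_000})

training_rows = store.get_training_rows()
latest_user_1 = store.get_online_features("user_1")
```

In this sketch, `training_rows` keeps every historical row while `latest_user_1` holds only the most recent values for that entity, which mirrors the offline/online split in a very simplified form.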
Challenges in Building and Scaling Feature Platforms
Building and scaling feature platforms can be challenging due to complexities in integrating with existing data sources and defining diverse sets of features. Specialized online serving layers and data retrieval systems are needed for optimized performance at scale.
The Role of Feature Platforms in Real-time Decision Making
Feature platforms are becoming increasingly important in enabling real-time decision making. They provide capabilities to process streaming data, calculate real-time features, and serve them instantaneously for fraud detection, personalization, and recommendation systems.
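To make "calculate real-time features" concrete, here is a minimal sketch of one common kind of streaming feature: a count of events per key over a trailing time window, such as transactions per card in the last 60 seconds for a fraud model. The class name, the `card_123` key, and the 60-second window are assumptions made for illustration; a production system would also handle out-of-order events, persistence, and scale, none of which is shown.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over the trailing `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps, oldest first

    def _evict(self, key: str, now: float) -> None:
        q = self._events[key]
        # Drop timestamps that have fallen out of the window (now - window, now].
        while q and q[0] <= now - self.window_seconds:
            q.popleft()

    def record(self, key: str, timestamp: float) -> None:
        """Ingest one streaming event; assumes timestamps arrive in order per key."""
        self._evict(key, timestamp)
        self._events[key].append(timestamp)

    def get_feature(self, key: str, now: float) -> int:
        """Serve the feature at request time: number of events still in the window."""
        self._evict(key, now)
        return len(self._events[key])


counter = SlidingWindowCounter(window_seconds=60)
for ts in (0, 10, 50, 65):
    counter.record("card_123", ts)

txn_count_at_70 = counter.get_feature("card_123", now=70)
txn_count_at_200 = counter.get_feature("card_123", now=200)
```

At `now=70` only the events at 50 and 65 remain in the window, and by `now=200` the window is empty. The value is computed at read time, so it reflects the moment the prediction is requested and not the time of the last write.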
The Importance of Scaling ML Platform Teams
One key insight from the podcast is the challenge ML platform teams face when scaling to meet customer demand. The episode gives the example of a conversation with an ML platform lead who expresses concern about the team's capacity to handle an increasing workload. The primary issue is a lack of leverage and scalability: each customer requires too much manual work. Because significant headcount growth may not be feasible, the podcast emphasizes finding ways to get more value and leverage from existing platform team members.
The Problem of Fragmented Feature Infrastructure
Another main point discussed in the podcast is fragmented feature infrastructure within companies. Different business units and platform teams develop their own feature platforms, resulting in a multitude of specialized systems that do not work well together. This makes it hard to build and serve ML models in production efficiently and effectively. The podcast highlights the need for unified feature infrastructure and its potential benefits, such as improved training and serving of models, better product recommendation systems, and enhanced fraud detection capabilities.
MLOps podcast #186 with Mike Del Balso, CEO & Co-founder of Tecton, and Josh Wills, Angel Investor: The Future of Feature Stores and Platforms.
// Abstract
Mike and Josh discuss creating templates and working at a detailed level, exploring Tecton's potential for sharing fraud and third-party features. They focus on technical aspects like data handling and optimizing models, emphasizing the significance of quality data for AI systems and the necessity for cohesive feature infrastructure in reaching production stages.
// Bio
Mike Del Balso
Mike is the co-founder of Tecton, where he is focused on building next-generation data infrastructure for Operational ML. Before Tecton, Mike was the PM lead for the Uber Michelangelo ML platform. He was also a product manager at Google where he managed the core ML systems that power Google’s Search Ads business.
Josh Wills
Josh Wills is an angel investor specializing in data and machine learning infrastructure. He was formerly the head of data engineering at Slack, the director of data science at Cloudera, and a software engineer at Google.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mike on LinkedIn: https://www.linkedin.com/in/michaeldelbalso/
Connect with Josh on LinkedIn: https://www.linkedin.com/in/josh-wills-13882b/
Timestamps:
[00:00] Introduction to Mike
[01:45] Takeaways
[03:32] Features of the new paradigm of ML and LLMs
[06:00] D. Sculley's papers
[13:05] The birth of Feature Store
[17:06] Data Pipeline Challenges Addressed
[20:00] Operationalizing
[26:50] Feature Store Challenges
[30:26] Z access
[36:23] Addressing Technical Debt Challenges
[37:27] Real-Time vs. Batch Processing
[47:10] Feature Store Evolution: Apache Iceberg
[49:59] Feature Platform: Dedicated Query Engine
[54:04] The bottleneck
[56:00] LLMs, Feature Stores Overview
[1:00:20] Vector databases
[1:06:15] Workflow Templating Efficiency
[1:08:35] Gamification suggestion for Tecton
[1:10:25] Wrap up