Real-time Feature Generation at Lyft // MLOps Podcast #334 with Rakesh Kumar, Senior Staff Software Engineer at Lyft.
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
// Abstract
This session delves into real-time feature generation at Lyft. Real-time feature generation is critical for Lyft where accurate up-to-the-minute marketplace data is paramount for optimal operational efficiency. We will explore how the infrastructure handles the immense challenge of processing tens of millions of events per minute to generate features that truly reflect current marketplace conditions.
Lyft has built this massive infrastructure over time, evolving from a humble start and a naive pipeline. Through lessons learned and iterative improvements, Lyft has made several trade-offs to achieve low-latency, real-time feature delivery. MLOps plays a critical role in managing the lifecycle of these real-time feature pipelines, including monitoring and deployment. We will discuss the practicalities of building and maintaining high-throughput, low-latency real-time feature generation systems that power Lyft’s dynamic marketplace and business-critical products.
// Bio
Rakesh Kumar is a Senior Staff Software Engineer at Lyft, specializing in building and scaling Machine Learning platforms. Rakesh has expertise in MLOps, including real-time feature generation, experimentation platforms, and deploying ML models at scale. He is passionate about sharing his knowledge and fostering a culture of innovation. This is evident in his contributions to the tech community through blog posts, conference presentations, and reviewing technical publications.
// Related Links
Website: https://englife101.io/
https://eng.lyft.com/search?q=rakesh
https://eng.lyft.com/real-time-spatial-temporal-forecasting-lyft-fa90b3f3ec24
https://eng.lyft.com/evolution-of-streaming-pipelines-in-lyfts-marketplace-74295eaf1eba
Streaming Ecosystem Complexities and Cost Management // Rohit Agrawal // MLOps Podcast #302 - https://youtu.be/0axFbQwHEh8
~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)]
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Rakesh on LinkedIn: /rakeshkumar1007/
Timestamps:
[00:00] Rakesh preferred coffee
[00:24] Real-time machine learning
[04:51] Latency tricks explanation
[09:28] Real-time problem evolution
[15:51] Config management complexity
[18:57] Data contract implementation
[23:36] Feature store
[28:23] Offline vs online workflows
[31:02] Decision-making in tech shifts
[36:54] Cost evaluation frequency
[40:48] Model feature discussion
[49:09] Hot shard tricks
[55:05] Pipeline feature bundling
[57:38] Wrap up