Coffee Sessions #49 with Stefan Krawczyk, Aggressively Helpful Platform Teams.
// Abstract
At Stitch Fix there are 130+ “Full Stack Data Scientists” who, in addition to doing data science work, are also expected to engineer and own data pipelines for their production models. One data science team, the Forecasting, Estimation, and Demand team, was in a bind. Their data generation process was causing iteration and operational frustrations in delivering time-series forecasts for the business. The solution? Hamilton, a novel Python micro-framework, which solved their pain points by changing their working paradigm.
Among the main contributors to Hamilton is the dedicated engineering team called Data Platform. Data Platform builds services, tools, and abstractions that enable data scientists (DS) to operate in a full-stack manner and avoid hand-offs. In the beginning, this meant DS built the web apps that serve model predictions. Now, as layers of abstraction have been built over time, they still dictate what is deployed but write much less code.
// Bio
Stefan loves the stimulus of working at the intersection of design, engineering, and data. He grew up in New Zealand, speaks Polish, and spent formative years at Stanford, LinkedIn, Nextdoor & Idibon. Outside of work, in pre-COVID times, Stefan liked to 🏊, 🌮, 🍺, and ✈.
// Other Links
https://www.youtube.com/watch?v=B5Zp_30Knoo
https://www.slideshare.net/StefanKrawczyk/hamilton-a-micro-framework-for-creating-dataframes
https://www.slideshare.net/StefanKrawczyk/deployment-for-free-removing-the-need-to-write-model-deployment-code-at-stitch-fix
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Stefan on LinkedIn: https://linkedin.com/in/skrawczyk
Timestamps:
[00:00] Introduction to Stefan Krawczyk
[00:37] Why Hamilton?
[01:50] Stefan's background in tech
[04:15] Model Life Cycle Team
[06:48] Managing outcomes generated by data scientists
[09:04] Teams doing the same thing
[12:41] Vision of getting code down to zero
[18:40] Freedom and autonomy went wrong
[21:17] Sub teams
[24:00] Create and deploy models easily
[24:28] Interesting challenge to define
[25:15] Stitch Fix Model productionization to be proud of
[26:23] Hamilton to open-source
[28:45] Model Envelope
[31:45] Deployment for free
[34:53] Use of Model Envelope in Model Artifact
[37:16] Extending API definition in a model envelope for the model
[39:00] Dependencies
[40:08] Monitoring at scale
[43:43] Advice in terms of neat abstraction
[46:19] Envelope vs Container
[47:33] Time frame of Hamilton's development and its benefits