
The Real Python Podcast: Preparing Data Science Projects for Production
Nov 14, 2025
Khuyen Tran, an author and data science practitioner from CodeCut, shares her insights on preparing Python projects for production. She discusses her journey into blogging and the motivation behind her book, "Production Ready Data Science." Key topics include common pitfalls of notebooks and the importance of reproducible workflows. Khuyen advocates for using modular coding practices and the benefits of tools like Polars and marimo notebooks for efficiency. She also emphasizes the significance of version control and proper testing in data science projects.
Modularize Code And Use Config Files
- Extract reusable code from notebooks into Python modules to make it importable and maintainable.
- Put configuration values in config files so tests and code remain clean and adjustable.
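As a rough illustration of the two points above, here is a minimal sketch of logic pulled out of a notebook into importable functions, with its thresholds read from a config file. The function names, the JSON format, and the `outlier_bounds` key are all assumptions made for this example; none of them come from the episode.

```python
import json
from pathlib import Path


def load_config(path):
    """Read settings from a JSON config file (hypothetical layout)."""
    return json.loads(Path(path).read_text())


def filter_outliers(values, lower, upper):
    """Keep only the values that fall within [lower, upper]."""
    return [v for v in values if lower <= v <= upper]


def clean(values, config):
    """Apply the bounds stored in the config instead of hard-coding them."""
    bounds = config["outlier_bounds"]
    return filter_outliers(values, bounds["lower"], bounds["upper"])
```

Because the bounds live in the config rather than in the function body, a test can pass a small dictionary directly to `clean` without touching the real settings file.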
Startup Friction Motivated The Book
- Khuyen described being a data scientist at a startup where engineers rewrote scientists' code for production.
- That experience motivated her to teach cleaner, importable code practices in her book.
Notebooks Aren't Long-Term Source
- Notebooks often become unreadable and unreproducible when cells run out of order or contain ad-hoc state.
- Treat notebooks as prototyping tools and move stable logic into scripts for reliability.
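One way to picture the move from notebook cells to a script is sketched below. It assumes a hypothetical `add_features` step; the point is only that each function receives its inputs explicitly and the script runs top to bottom from a single entry point, so the result does not depend on which cells were executed earlier.

```python
def add_features(rows, scale):
    """Return new rows with a scaled column; the input rows are not mutated."""
    return [{**row, "scaled": row["value"] * scale} for row in rows]


def main():
    # Illustrative in-memory data; a real script would load it from a file.
    rows = [{"value": 1}, {"value": 2}]
    result = add_features(rows, scale=10)
    print(result)
    return result


if __name__ == "__main__":
    main()
```

Running the file always executes the same steps in the same order, which is the reproducibility that out-of-order notebook cells can lose.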


