Talk Python To Me

#511: From Notebooks to Production Data Science Systems

Jun 25, 2025
Catherine Nelson, a self-employed data scientist and author of 'Software Engineering for Data Scientists,' discusses techniques for moving from local data science notebooks to robust production workflows. She shares insights on effective coding practices, the challenges of machine learning integration, and organizing Python projects for scalability. Catherine also highlights the dual nature of notebooks, contrasting their value for project exploration with the different demands of production. Her personal journey reflects an intersection of software engineering principles and data science.
ANECDOTE

From Geologist to Data Scientist

  • Catherine Nelson shared her personal journey from geology and MATLAB to data science.
  • Advice from a software developer teammate inspired her to improve her coding so she could do more data science, faster.
ADVICE

Adopt Software Engineering Mindsets

  • Data scientists should adopt software engineering mindsets like testing, version control, and refactoring.
  • These strategies help create robust, reproducible, and shareable code for production environments.
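
As a rough illustration of the testing habit mentioned above, here is a minimal sketch of a small data-preparation function with two tests. The function `min_max_scale` and the test names are hypothetical and not taken from the episode; the tests are plain assert-based functions in the style a runner such as pytest would collect.

```python
def min_max_scale(values: list[float]) -> list[float]:
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid dividing by zero and map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_min_max_scale_range() -> None:
    # The smallest value maps to 0.0, the largest to 1.0, the midpoint to 0.5.
    assert min_max_scale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_min_max_scale_constant_input() -> None:
    # Edge case that is easy to miss when the logic lives only in a notebook cell.
    assert min_max_scale([3.0, 3.0]) == [0.0, 0.0]
```

Keeping tests like these under version control next to the function makes a later refactor safer, because a change in behavior shows up as a failing test.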
ADVICE

Break Down Notebook Code

  • Break down notebooks into discrete steps before refactoring into functions.
  • Analyze inputs and outputs carefully to ensure proper data flow and function integration.
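
The steps above can be sketched as follows. This is a minimal example under stated assumptions, not the approach described in the episode: the three stages (load, clean, summarize), the column names `site` and `temperature`, and the sample data are all hypothetical. Each former notebook step becomes a function whose inputs and outputs are explicit.

```python
import csv
import io
from statistics import mean


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Step 1: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def clean_rows(rows: list[dict[str, str]]) -> list[dict[str, object]]:
    """Step 2: drop rows with a missing temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = row["temperature"].strip()
        if value:
            cleaned.append({"site": row["site"], "temperature": float(value)})
    return cleaned


def summarize(rows: list[dict[str, object]]) -> dict[str, float]:
    """Step 3: compute the mean temperature per site."""
    by_site: dict[str, list[float]] = {}
    for row in rows:
        by_site.setdefault(row["site"], []).append(row["temperature"])
    return {site: mean(values) for site, values in by_site.items()}


def run_pipeline(csv_text: str) -> dict[str, float]:
    """Chain the steps; each output is the next function's input."""
    return summarize(clean_rows(load_rows(csv_text)))


# Hypothetical sample data standing in for a file read in the notebook.
SAMPLE = "site,temperature\nA,10.0\nA,14.0\nB,\nB,7.5\n"
result = run_pipeline(SAMPLE)
```

Because no function relies on notebook-level global state, each one can be tested and reused on its own.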