
Vanishing Gradients Episode 25: Fully Reproducible ML & AI Workflows
Mar 18, 2024
Omoju Miller, a machine learning expert and CEO of Fimio, shares her vision for transparent and reproducible ML workflows. She discusses the necessity of open tools and data in combating the monopolization of tech by closed-source APIs. Topics include the evolution of developer tools, the importance of data provenance, and the potential of a collaborative open compute ecosystem. Omoju also emphasizes user accessibility in machine learning and envisions a future where everyone can build production-ready applications with ease.
AI Snips
Package Everything For Reproducibility
- Reproducibility requires packaging code, dependencies, machine spec, data and outputs together.
- Make it trivial to rerun a paper's experiment against your dataset before productionizing.
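One way to picture "packaging everything together" is a single run manifest that records a fingerprint of the code, the dataset, the dependency versions, the machine spec, and the outputs. The sketch below is only an illustration under stated assumptions: `build_manifest` and its field names are hypothetical, not something described in the episode, and a real setup would likely hash files or commits rather than in-memory bytes.

```python
import hashlib
import json
import platform
from importlib import metadata


def sha256_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(code: bytes, dataset: bytes, outputs: dict, packages=()) -> dict:
    """Bundle code, data, dependency, machine, and output info in one record.

    Hypothetical helper: the field names are illustrative assumptions.
    """
    deps = {}
    for name in packages:
        try:
            deps[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly instead of silently skipping it.
            deps[name] = None
    return {
        "code_sha256": sha256_bytes(code),
        "data_sha256": sha256_bytes(dataset),
        "dependencies": deps,
        "machine": {
            "python": platform.python_version(),
            "system": platform.system(),
            "arch": platform.machine(),
        },
        "outputs": outputs,
    }


manifest = build_manifest(
    code=b"print('train')",
    dataset=b"x,y\n1,2\n",
    outputs={"accuracy": 0.9},
    packages=["pip"],
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Rerunning an experiment against a different dataset then amounts to building a second manifest with new `dataset` bytes and comparing the two records: only `data_sha256` and `outputs` should differ.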
ML Needs Scientific Rigor
- ML development is fundamentally a scientific workflow that needs methodical, repeatable experiments.
- Treat ML like applied science, not just software hacking.
Version Prompts And Retrieval Pipelines
- Version and record prompts, retrieval context and prompt chains as part of experiment tracking.
- Treat prompt engineering, RAG and retrieval as first-class artifacts to audit and roll back.
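A minimal sketch of what versioning prompts and retrieval context could look like: each combination of template, retrieved context, and chain steps gets a content-derived ID, so any run can be audited or rolled back to an earlier version. The `PromptLog` class and its fields are hypothetical names chosen for illustration, not a tool mentioned in the episode.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(record: dict) -> str:
    """Derive a short, stable ID from the record's canonical JSON form."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


@dataclass
class PromptLog:
    """Hypothetical in-memory log of prompt versions (illustrative only)."""

    versions: dict = field(default_factory=dict)  # version ID -> full record
    history: list = field(default_factory=list)   # IDs in the order they were used

    def record(self, template: str, retrieved_context: list, chain: list) -> str:
        """Store one prompt configuration and return its version ID."""
        entry = {
            "template": template,
            "retrieved_context": retrieved_context,
            "chain": chain,
        }
        version_id = fingerprint(entry)
        self.versions[version_id] = entry
        self.history.append(version_id)
        return version_id

    def rollback(self) -> str:
        """Drop the latest version from the history and return the previous ID."""
        if len(self.history) < 2:
            raise ValueError("no earlier version to roll back to")
        self.history.pop()
        return self.history[-1]


log = PromptLog()
v1 = log.record("Answer the question: {q}", ["doc-1"], ["retrieve", "answer"])
v2 = log.record("Answer briefly: {q}", ["doc-1"], ["retrieve", "answer"])
print(v1 != v2)               # different templates give different IDs
print(log.rollback() == v1)   # rolling back returns the earlier version
```

Because the ID is derived from the content, identical prompt configurations map to the same version, and the stored records remain available for auditing after a rollback.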
