MLOps.community

MLOps Coffee Sessions #14 Conversation with the Creators of Dask // Hugo Bowne-Anderson and Matthew Rocklin

Oct 12, 2020
Hugo Bowne-Anderson and Matthew Rocklin, co-founders of Coiled, discuss Dask, the open-source library for parallel computing in Python that makes it easier to work with large datasets. They cover the challenges of scaling data science, navigating cloud complexity, and the role of data literacy in organizations. They also share insights on community engagement in open source, the evolution of OSS, and the advantages of Dask over tools like Spark, with an eye to its future in distributed computing.
ANECDOTE

Dask's Origin

  • Dask was initially designed as a parallel NumPy at Anaconda to scale Python's data science tools.
  • It evolved to become a general-purpose parallel computing library after other libraries adopted its core engine.
INSIGHT

Data Science's Difficulty

  • Data science is difficult partly because it isn't a single, unified field.
  • The tools, methods, and desired outcomes vary greatly between applications, from distributed machine learning to analytics dashboards.
INSIGHT

Tooling and Best Practices

  • Tooling encodes best practices, implicitly teaching users better approaches.
  • Data scientists may lack expertise in areas like security, so tools can bridge those gaps by applying the relevant practices automatically.