AI Today Podcast

AI Today Podcast: Why Data Storage Matters When it Comes to AI: Interview with Justin Emerson, Pure Storage

May 8, 2023
Justin Emerson, FlashBlade Technical Evangelist at Pure Storage, shares insights on the crucial role of data storage in AI development. He discusses the challenges of managing unstructured data and the need for scalable storage solutions for complex AI models. The conversation explores the trade-offs between cloud and on-prem storage and emphasizes planning ahead to avoid vendor lock-in. Justin highlights the importance of energy-efficient storage and stresses a step-by-step approach to AI projects, advocating for learning from failures to drive success.
INSIGHT

Data Is The Fuel Driving AI

  • AI success depends primarily on data quality, quantity, and accessibility rather than just algorithms.
  • Storage architecture must be designed to industrialize data pipelines for training and analytics.
ADVICE

Design Storage For Pretraining Workflows

  • Plan for the heavy data preparation phase: labeling, transforming, and operational tasks consume significant time and storage.
  • Design storage and performance up front so data is accessible when the dataset enters training frameworks.
ADVICE

Choose Cloud For Agility, Own For Scale

  • Use cloud for agility and bursts, but assess cost for steady, large-scale training workloads.
  • Own on-prem infrastructure when sustained baseline usage makes ownership more economical than renting cloud resources.
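The rent-versus-own advice above comes down to a break-even calculation. The following is a minimal sketch of that arithmetic, not a method described in the episode: the function name `breakeven_months` and every dollar figure are hypothetical, and a real comparison would also need to account for things like hardware refresh cycles, staffing, and cloud data egress fees.

```python
def breakeven_months(purchase_cost, monthly_opex, cloud_monthly_cost):
    """Return the number of months after which owning becomes cheaper
    than renting, or None if owning never pays off.

    All inputs are hypothetical planning figures in the same currency.
    """
    # What owning saves each month compared with the cloud bill.
    monthly_saving = cloud_monthly_cost - monthly_opex
    if monthly_saving <= 0:
        # Running owned hardware costs at least as much as renting.
        return None
    # Months needed for the savings to repay the upfront purchase.
    return purchase_cost / monthly_saving


# Hypothetical steady training workload (illustrative numbers only).
months = breakeven_months(
    purchase_cost=600_000,      # upfront on-prem hardware spend
    monthly_opex=10_000,        # power, space, support for owned gear
    cloud_monthly_cost=35_000,  # equivalent sustained cloud spend
)
print(months)  # 24.0
```

With these assumed figures, ownership pays for itself after 24 months of sustained use. A bursty workload that only needs the capacity for a few months a year would lower the effective cloud cost and push the break-even point out, which is where cloud agility tends to win.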