Machine Learning Street Talk (MLST)

#77 - Vitaliy Chiley (Cerebras)

Jun 16, 2022
Vitaliy Chiley, a Machine Learning Research Engineer at Cerebras Systems, dives into the revolutionary hardware that accelerates deep learning workloads. He discusses the efficiency of Cerebras' architecture compared to traditional GPUs and the importance of memory management. Chiley explores the impact of sparsity in neural networks, debating the trade-offs between weight and activation sparsity. With insights on optimizing deep learning models, he also touches on why starting with dense networks can be beneficial before moving towards sparsity.
INSIGHT

Memory Efficiency

  • Memory management and data movement, not raw FLOPs, are the key constraints for next-generation HPC computation.
  • Cerebras prioritizes on-chip memory for high bandwidth, unlike NVIDIA's off-chip memory approach.
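A back-of-envelope sketch of why data movement, not FLOPs, becomes the bottleneck (numbers are illustrative, not Cerebras or NVIDIA specs): in a matrix-vector product with weights streamed from off-chip memory, each weight is read once but used for only two FLOPs, so bandwidth bounds throughput long before compute does.

```python
# Illustrative arithmetic-intensity estimate for y = W @ x with fp32
# weights streamed from off-chip memory (hypothetical size n).
n = 4096
flops = 2 * n * n              # one multiply and one add per weight
bytes_moved = 4 * n * n        # each fp32 weight (4 bytes) read once
intensity = flops / bytes_moved
print(f"arithmetic intensity: {intensity} FLOPs/byte")  # 0.5
```

At 0.5 FLOPs per byte, a chip with far more compute than memory bandwidth sits idle waiting on data, which is the motivation for keeping weights in high-bandwidth on-chip memory.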
INSIGHT

Sparsity Advantage

  • While GPUs excel at dense matrix multiplication, sparse matrices make much of that computation wasted work on zeros.
  • Cerebras' architecture supports sparsity natively, which becomes increasingly important as neural networks grow.
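A small illustrative sketch of the first point (the matrix size and sparsity level are hypothetical): a dense matmul kernel executes the same number of FLOPs regardless of how many weights are zero, so on a highly sparse matrix only a small fraction of that work is useful.

```python
import numpy as np

# Hypothetical example: count what fraction of a dense kernel's FLOPs
# actually touch non-zero weights when the matrix is ~90% sparse.
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))
W[rng.random(W.shape) < 0.9] = 0.0         # zero out ~90% of the weights

dense_flops = 2 * W.size                   # FLOPs a dense kernel performs per input vector
useful_flops = 2 * np.count_nonzero(W)     # FLOPs involving non-zero weights

print(f"useful fraction of dense FLOPs: {useful_flops / dense_flops:.1%}")
```

On a dense-only accelerator the remaining ~90% of the FLOPs are spent multiplying by zero.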
INSIGHT

Cerebras Sparsity Solution

  • Cerebras addresses sparsity not by compacting memory but by quickly accessing non-zero elements at runtime.
  • This dataflow architecture avoids propagating zero activations, unlike traditional sparse computation.
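The dataflow idea can be sketched in a toy matrix-vector product (this is a conceptual illustration, not Cerebras' actual kernel): only non-zero activations trigger work at runtime, zeros are never propagated, and the weight matrix stays uncompacted in memory.

```python
import numpy as np

def dataflow_matvec(W, x):
    """Toy dataflow-style matvec: visit only non-zero activations,
    leaving W uncompacted and skipping all work for zero entries."""
    y = np.zeros(W.shape[0])
    for j in np.flatnonzero(x):    # only non-zero activations "fire"
        y += W[:, j] * x[j]        # stream that activation's weight column
    return y

W = np.arange(6.0).reshape(2, 3)
x = np.array([1.0, 0.0, 2.0])      # the zero entry triggers no computation
print(dataflow_matvec(W, x))       # matches W @ x
```

Contrast this with classic sparse formats (e.g. CSR), which compact the non-zeros into a packed array ahead of time; the runtime-skipping approach instead exploits dynamic activation sparsity without any repacking step.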