The Bootstrapped Founder

404: The Transcription Challenge: Building Infrastructure That Scales With The World

Jul 18, 2025
This episode covers the challenge of handling an overwhelming volume of audio data while building transcription infrastructure that scales. The speaker describes strategies for keeping transcription quality high despite wide variation in podcast quality and volume, and explains why efficient systems are essential for keeping pace with the growing podcast industry. The discussion is aimed at anyone interested in transcription technology and podcasting.
ANECDOTE

Building for Podcast Scale

  • Arvid built PodScan to transcribe all global podcast episodes, regardless of customer count.
  • He tracks about 3.8 million shows and tens of thousands of daily new episodes.
ADVICE

Use Queues with Priority Levels

  • Treat transcribing podcasts as a queuing system with priority tiers.
  • Prioritize high-impact shows like Joe Rogan's for faster transcription.
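
The queuing idea above can be sketched in a few lines. This is a minimal illustration, not PodScan's actual implementation: the tier names (`HIGH`, `NORMAL`, `LOW`), the `TranscriptionQueue` class, and the example URLs are all hypothetical. It uses Python's standard `heapq` module, with a running counter so that jobs within the same tier are served in arrival order.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical priority tiers: a lower number is transcribed sooner.
HIGH, NORMAL, LOW = 0, 1, 2


@dataclass(order=True)
class Job:
    priority: int
    seq: int  # tie-breaker: preserves arrival order within a tier
    episode_url: str = field(compare=False)


class TranscriptionQueue:
    """In-memory priority queue of episodes waiting for transcription."""

    def __init__(self) -> None:
        self._heap: list[Job] = []
        self._counter = itertools.count()

    def enqueue(self, episode_url: str, priority: int = NORMAL) -> None:
        job = Job(priority, next(self._counter), episode_url)
        heapq.heappush(self._heap, job)

    def dequeue(self) -> Optional[str]:
        """Return the next episode to transcribe, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap).episode_url

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    queue = TranscriptionQueue()
    queue.enqueue("https://example.com/small-show/ep-12.mp3", LOW)
    queue.enqueue("https://example.com/regular-show/ep-88.mp3")
    queue.enqueue("https://example.com/high-impact-show/ep-2000.mp3", HIGH)

    # The high-impact episode is served first even though it arrived last.
    while len(queue):
        print(queue.dequeue())
```

A production system would more likely keep the queue in a database or a message broker so that several workers can pull from it, but the ordering rule stays the same: tier first, arrival time second.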
ANECDOTE

Local Mac Studio Transcription

  • Arvid ran his initial transcription queue locally on his Mac Studio using whisper.cpp.
  • His Mac used the unified memory system to transcribe about 200 words per second.
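
To get a feel for what 200 words per second means against the episode volume mentioned earlier, here is a rough back-of-the-envelope sketch. Only the 200 words per second figure comes from the episode. The average episode length of 9,000 words and the daily volume of 30,000 episodes are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def episodes_per_day(words_per_second: float, avg_words_per_episode: float) -> float:
    """Episodes one machine could transcribe in 24 hours of uninterrupted work."""
    return words_per_second * SECONDS_PER_DAY / avg_words_per_episode


def machines_needed(daily_episodes: int, words_per_second: float,
                    avg_words_per_episode: float) -> int:
    """Whole machines required to keep up with a given daily episode volume."""
    per_machine = episodes_per_day(words_per_second, avg_words_per_episode)
    return math.ceil(daily_episodes / per_machine)


if __name__ == "__main__":
    WORDS_PER_SECOND = 200         # figure quoted in the episode
    AVG_WORDS_PER_EPISODE = 9_000  # assumption: about an hour of speech
    DAILY_EPISODES = 30_000        # assumption: "tens of thousands" per day

    print(episodes_per_day(WORDS_PER_SECOND, AVG_WORDS_PER_EPISODE))  # 1920.0
    print(machines_needed(DAILY_EPISODES, WORDS_PER_SECOND, AVG_WORDS_PER_EPISODE))  # 16
```

Under these assumed inputs a single machine handles roughly 1,900 episodes a day, which suggests why a lone workstation cannot keep up with the full daily volume and why prioritizing the queue matters.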