Practical AI

Speech tech and Common Voice at Mozilla

Sep 9, 2020
Jenny Zhang of Mozilla, who focuses on the Common Voice project, Remy Muhire, who is passionate about voice tech, and Josh Meyer, who champions African language tech, explore how biases in speech data affect language and accent recognition. They describe Mozilla’s inclusive approach to building an open-source voice database, discuss the challenges of gathering diverse datasets for marginalized communities, particularly in Sub-Saharan Africa, and emphasize the need for ethical data practices to support underrepresented languages.
INSIGHT

Common Voice's Origin

  • In 2017, open-source speech data was scarce, English-centric, and lacked diversity.
  • Mozilla Common Voice aimed to democratize speech tech by crowdsourcing diverse voice data.
INSIGHT

Data Needs for Speech Recognition

  • The amount of speech data needed for speech recognition depends on the application's complexity.
  • While simple tasks may require minimal data, robust models need around 2,000 hours of transcribed speech.
ANECDOTE

Community Validation

  • Common Voice uses community validation where volunteers determine if audio matches text.
  • This unorthodox approach prioritizes community involvement and recordings from diverse noise environments.