The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Neural Synthesis of Binaural Speech From Mono Audio with Alexander Richard - #514

Aug 30, 2021
In this discussion, Alexander Richard, a research scientist at Facebook Reality Labs and ICLR Best Paper Award winner, shares insights into his work on binaural audio synthesis. He covers the challenges of audio representation in noisy environments and the complex process of generating realistic spatial audio from mono sources. Richard also highlights the difficulties of dynamic time warping and the need for accurate 3D measurements in virtual reality. He also shares his thoughts on Codec Avatars and on future research directions for sound and presence in virtual spaces.
ANECDOTE

Accidental AI Career

  • Alexander Richard stumbled into machine learning while studying in Germany.
  • He was fascinated by a speech recognition demo and pursued that field, eventually landing at Facebook Reality Labs.
INSIGHT

Facebook Reality Labs Mission

  • Facebook Reality Labs emerged from Oculus Research and focuses on social telepresence in AR/VR.
  • Their mission is to enable realistic, 3D conversations in virtual reality.
INSIGHT

Audio's Power in AR/VR

  • Audio provides cues that visual data often lacks, especially in VR/AR settings.
  • Audio can help fill in gaps in visual information, like lip movements obscured by beards or headsets.