
Me, myself and AI

Google DeepMind: The Podcast

CHAPTER

How to Co-Train Voices Together

When DeepMind launched WaveNet in 2016, you needed about four hours' worth of audio samples from a person to model how their voice sounds. But now you can do it with just a few minutes' worth of audio. Google has built an enormous dataset with professional voice actors reading out the same text. The model learns from all of these samples how particular words are pronounced. Now the third and final part is the acoustic modelling. Acoustic modelling focuses on who it sounds like. If I pretend to sound like my brother on the phone, it still sounds like me. My friend will be able to tell. If I say the sentence with a different tone of voice, you

