
MLCommons’ David Kanter, NVIDIA’s Daniel Galvez on Publicly Accessible Datasets - Ep. 167

NVIDIA AI Podcast

The Multilingual Spoken Words Corpus

The Multilingual Spoken Words Corpus is in 50 different languages. It's got more than 20 million examples with hundreds of thousands of keywords. For many of these languages, it is the first such dataset that exists. You know, I think about Ukrainian, which is spoken by, you know, tens of millions of people, and there was no such dataset before.
