
MLCommons’ David Kanter, NVIDIA’s Daniel Galvez on Publicly Accessible Datasets - Ep. 167
NVIDIA AI Podcast
The Multilingual Spoken Words Corpus
The Multilingual Spoken Words Corpus is in 50 different languages. It's got 23 million examples with hundreds of thousands of keywords. For many of these languages, it is the first such dataset that exists. You know, I think about Ukrainian, which is spoken by tens of millions of people, and there was no such dataset before.