

NLP research by & for local communities
10 snips Jan 3, 2023
Just Zwennicker, a data engineer who created a machine translation system for Sranan Tongo, and Bonaventure Dossou, a Ph.D. student focusing on low-resource languages, explore the importance of NLP for local language communities. They discuss the challenges of data scarcity and the need for representation in technology, particularly for African languages. The conversation highlights grassroots initiatives like Masakane, which empowers native speakers to enhance linguistic diversity in NLP systems, ensuring their languages thrive in the digital age.
AI Snips
Chapters
Transcript
Episode notes
Data Scarcity and Collaboration
- NLP models often get trained on high-resource languages, neglecting others.
- Direct collaboration with linguists and native speakers is crucial, especially for under-resourced languages.
Language Structure in NLP
- Consider unique language structures when building NLP models, not just high-resource languages.
- Collaborate with linguists to understand the nuances of each language.
Creole Languages and Stigmatization
- Creole languages emerged from diverse linguistic backgrounds, often due to historical events like colonization.
- They often face stigmatization, impacting their documentation and resource availability.