
Data Rights, Quantification and Governance for Ethical AI with Margaret Mitchell - #572
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
00:00
Exploring Natural Language Metrics through Zipfian Distribution
This chapter explores what 'naturalness' means when collecting data for natural language processing, and why the term is hard to define. It introduces the Zipfian distribution as a way to quantify how closely a text resembles natural language, and emphasizes that models need authentic datasets if training is to improve their understanding of how people actually communicate.
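As a rough illustration of what quantifying text with a Zipfian distribution can look like: Zipf's law says a word's frequency is roughly inversely proportional to its frequency rank, so on a log-log plot the rank-frequency curve of natural text is close to a straight line with slope near -1. The sketch below estimates that slope for a piece of text. The chapter summary does not say how the measurement is actually done, so the function names `rank_frequencies` and `zipf_exponent`, the simple regex tokenizer, and the least-squares fit are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts sorted from most to least frequent.

    Tokenization is a deliberately simple assumption: lowercase runs of
    letters and apostrophes count as words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(frequencies):
    """Estimate s in frequency ~ C / rank**s.

    Fits a straight line to log(frequency) against log(rank) by ordinary
    least squares and returns the negated slope. This is a hypothetical
    helper, not a method described in the episode.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    counts = rank_frequencies(sample)
    print(counts, zipf_exponent(counts))
```

An exponent near 1 on a large corpus is consistent with Zipf-like behavior, while a value far from 1 may suggest the text is synthetic, heavily templated, or too small to judge. A log-log line fit is a crude estimator, so the result is best read as a rough signal.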