
[46] Yulia Tsvetkov - Linguistic Knowledge in Data-Driven NLP
The Thesis Review
The Problem With Large Language Models
The problem is that probably there is not enough data, even for huge language models, to represent the richness of linguistic diversities. So there will be always some domains and areas which are extremely low resource. That's why no matter how big retraining corpus is, at the time when you have these static models that only use retraining data, we will encounter that language models break on just how the language has changed.
00:00
Transcript
Play full episode
Remember Everything You Learn from Podcasts
Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.