[46] Yulia Tsvetkov - Linguistic Knowledge in Data-Driven NLP

The Thesis Review

The Problem With Large Language Models

The problem is that there is probably not enough data, even for huge language models, to represent the richness of linguistic diversity. So there will always be some domains and areas that are extremely low-resource. That's why, no matter how big the pretraining corpus is, as long as we have these static models that only use pretraining data, we will find that language models break simply because the language has changed.

