
138 - Compositional Generalization in Neural Networks, with Najoung Kim
NLP Highlights
The Average Embedding Idea Didn't Work, Right?
The performance was degraded on this IID novel-words dataset compared to the IID test set, where they had something like 99% accuracy. So there was degradation, but not down to the level of 5% generalization accuracy. But it's unclear whether it's because the models were not able to use the new embeddings, or because these embeddings are too different from the existing ones. It seems tricky to get this to work, right? How do you introduce new embeddings for these words while also making sure that they're not too different? Those seem like conflicting constraints. Yeah. Let me... maybe I'll mention another experiment in the paper
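(As a rough illustration of the averaging idea being discussed, here is a minimal sketch of initializing a novel word's embedding as the mean of the existing embedding rows; the sizes, indices, and setup are hypothetical and not taken from the paper.)

```python
import torch
import torch.nn as nn

# Hypothetical setup: a small embedding table standing in for a trained model's
# input embeddings. The idea under discussion is to give a novel word an
# embedding initialized as the average of the existing (trained) rows, so that
# it sits "in distribution" relative to the vocabulary it joins.

vocab_size, dim = 1000, 64
embedding = nn.Embedding(vocab_size + 1, dim)  # reserve one extra row for the novel word

with torch.no_grad():
    trained_rows = embedding.weight[:vocab_size]           # stand-in for trained embeddings
    novel_id = vocab_size                                   # index of the newly added word
    embedding.weight[novel_id] = trained_rows.mean(dim=0)   # average-embedding initialization

# The open question raised in the conversation: an averaged vector may still be
# "too different" from any real embedding (for instance, it typically has a
# smaller norm), so the model may not treat the novel word like words it saw in
# training.
```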