
138 - Compositional Generalization in Neural Networks, with Najoung Kim

NLP Highlights

CHAPTER

Sequence to Sequence Models Without Structural Priors Don't Do Well on Structural Generalization

Sequence to sequence models without structural priors don't really do well on structural generalization. Structural generalization is only 20% of the original COGS generalization set, so this fact makes it seem like models are doing pretty well overall. But what you'll notice if you look at the literature a little bit is that most of the pure sequence-to-sequence approaches tend to saturate around eighty-something percent. My speculation is that this is because they're getting almost 0% on the structural cases while getting nearly all of the lexical cases right, although they don't often report this lexical/structural breakdown.
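For intuition, here is a minimal back-of-the-envelope sketch in Python of the decomposition being described. The 80/20 lexical/structural split comes from the discussion; the per-category accuracies are the speaker's speculated values, not reported results.

```python
# Back-of-the-envelope check (assumed numbers from the discussion):
# the COGS generalization set is roughly 80% lexical and 20% structural cases.
# If a seq2seq model solves nearly all lexical cases but almost none of the
# structural ones, its aggregate accuracy lands around 80%, which matches
# the saturation point seen in the literature.

lexical_share, structural_share = 0.80, 0.20   # split of the generalization set
lexical_acc, structural_acc = 1.00, 0.00       # speculated per-category accuracy

overall_acc = lexical_share * lexical_acc + structural_share * structural_acc
print(f"Overall generalization accuracy: {overall_acc:.0%}")  # -> 80%
```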

