
StrategyQA and Big Bench
Data Skeptic
AI2 Leaderboard
The best model reaches almost 70, maybe 69 percent accuracy. The challenge is that for StrategyQA, you actually have to do multiple steps to come up with the answer. And so using these structured models, the ones on the leaderboard, just gives some advantage in this case.