#221 - OpenAI Codex, Gemini in Chrome, K2-Think, SB 53

Last Week in AI

LocoBench: Long-Context Software Engineering Benchmark

Andrey describes LocoBench's eight long-context software engineering tasks and new metrics; Michelle emphasizes realistic benchmarks for code tasks.
