A new paper introduces the concept of Level 3 AGI, or expert AGI, which targets performance comparable to skilled adults on tasks across many industries. To track progress toward expert AGI, a new benchmark called MMMU has been proposed, with 11,500 carefully selected multimodal questions covering 30 diverse subjects and 183 subfields. The benchmark tests both breadth and depth of reasoning by requiring models to solve complex problems that combine text with images. Because it extends beyond language to heterogeneous image types, MMMU may become a standard evaluation tool for multimodal AGI models.
