Defining the Next Level of Artificial General Intelligence (AGI)

A new paper discusses Level 3, or expert, AGI: systems whose performance is comparable to that of skilled adults on tasks across many industries. To measure progress toward expert AGI, the paper proposes a benchmark called MMMU, made up of 11,500 carefully selected multimodal questions spanning 30 subjects and 183 subfields. The benchmark tests both breadth and depth of reasoning by requiring models to solve complex problems that combine text with images. Because it goes beyond language to cover heterogeneous image types, MMMU may become a standard evaluation tool for multimodal models.

