
#183 - OpenAI o1, Adobe vid gen, Reflection 70B, DeepMind AlphaProteo
Last Week in AI
Maximize Performance with Right Tools
OpenAI released two versions of the model: the larger o1-preview and the smaller o1-mini. The larger model ranks in the 89th percentile on competitive programming tasks, meaning it outperforms roughly 89% of human participants. It also exceeds human PhD-level accuracy on graduate-level questions in physics, biology, and chemistry. In addition, it scores 78.2% on the challenging multimodal MMMU benchmark, despite reportedly being trained only on text. Together, these results show how far AI has advanced on complex reasoning tasks.