#146 - ChatGPT’s 1 year anniversary, DeepMind GNoME, Extraction of Training Data from LLMs, AnyDream

Last Week in AI

Defining the Next Level of Artificial General Intelligence (AGI)

A new paper introduces the concept of Level 3 AGI, or expert AGI, defined as performance comparable to that of skilled adults on a wide range of tasks across industries. To measure progress toward expert AGI, a new benchmark called MMMU has been proposed, with 11,500 carefully selected multimodal questions covering 30 diverse subjects and 183 subfields. The benchmark assesses both breadth and depth of reasoning by requiring models to solve complex problems that combine text with images. Because it extends beyond language to include heterogeneous image types, it may become a standard evaluation tool for multimodal models in the future.

Our 146th episode with a summary and discussion of last week's big AI news!

Note: this one is coming out a bit late, sorry! We'll have a new ep with coverage of the big news about Gemini and the EU AI Act out soon though.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

Email us your questions and feedback at contact@lastweekin.ai

Timestamps + links:

(00:00:00) Intro/Banter
Tools & Apps
- (00:02:03) ChatGPT’s 1-year anniversary: how it changed the world
- (00:06:15) Perplexity AI Introduces New Online LLMs for Real-Time Information Access
- (00:11:45) Intuit Adds Generative AI-Powered Tax Prep to TurboTax
- (00:12:57) Microsoft Paint’s DALL-E 3 integration is rolling out on Windows 11
- (00:13:56) Mastercard launches Shopping Muse, an AI to help consumers find the perfect gift
- (00:14:54) Voicemod will now let you create and share your own AI voices
- (00:16:15) Amazon finally releases its own AI-powered image generator
Applications & Business
Projects & Open Source
- (00:39:31) ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
- (00:43:50) China Open Sources DeepSeek LLM, Outperforms Llama 2 and Claude-2
Research & Advancements
Policy & Safety
Synthetic Media & Art
- (01:22:34) AnyDream: Secretive AI Platform Broke Stripe Rules to Rake in Money from Nonconsensual Pornographic Deepfakes
(01:25:02) Outro
