Artificial Analysis: The Independent LLM Analysis House — with George Cameron and Micah Hill-Smith

Latent Space: The AI Engineer Podcast

GDPval-AA: Agentic White-Collar Tasks

Micah describes adapting OpenAI's GDPval into an agentic harness (Stirrup) and using Gemini 3 Pro as the evaluator model.
