Ep 11 - Technical alignment overview w/ Thomas Larsen (Director of Strategy, Center for AI Policy)

Artificial General Intelligence (AGI) Show with Soroush Pour

Interpreting AI and Model Evaluation

This chapter covers two broad areas of technical AI alignment work: interpreting AI and model evaluation. It discusses how dangerous capabilities emerge in AI models, the problem of deceptive alignment, and the challenges of using proxies as reward signals.
