Ep 11 - Technical alignment overview w/ Thomas Larsen (Director of Strategy, Center for AI Policy)

Artificial General Intelligence (AGI) Show with Soroush Pour

CHAPTER

Interpreting AI and Model Evaluation

This chapter covers two broad areas of work in the field of AI: interpreting AI and model evaluation. It discusses how dangerous capabilities emerge in AI models, the problem of deceptive alignment, and the challenges of using proxies as reward signals.
