
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
AXRP - the AI X-risk Research Podcast
00:00
Uncovering Hidden Reasoning and AI Challenges
Exploring the manual prompting process to reveal hidden reasoning in AI models, along with the challenges of fine-tuning models for specific tasks and addressing safety filter issues.