
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
AXRP - the AI X-risk Research Podcast
Uncovering Hidden Reasoning and AI Challenges
This segment explores the manual prompting process used to reveal hidden reasoning in AI models, the challenges of fine-tuning models for specific tasks, and issues with safety filters.
Transcript


