
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
AXRP - the AI X-risk Research Podcast
Exploring the challenges of inserting backdoors into code and evading detection
This chapter explores the complexities of inserting backdoors into code. It highlights how hard malicious actions are to detect and covers strategies such as changing code behavior with if statements. It also touches on how effective various versions are at identifying backdoors and on GPT-4's role in recognizing such intent.