Ensuring AI Model Integrity in Programming Solutions
This chapter covers protocols for preventing backdoored code in AI-generated programming solutions, outlining methods such as editing solutions and untrusted monitoring with multiple GPT-4 instances. It describes a toy scenario in which trusted and untrusted models are combined with limited human labor, and the evaluation methodology used to test whether these protocols safeguard against catastrophic outcomes.