Addressing Risks in Transformative AI Systems through Factored Cognition

This chapter explores Christiano's approach to mitigating risks from AI systems through factored cognition, in which the desired goal is decomposed into small steps that human overseers can validate one at a time. How effective the approach is remains debated, given potential vulnerabilities in the underlying computer system and the limits of formal proof as a model of general reasoning.
