Automated Feedback and Alignment in AI
This chapter explores automated feedback in AI systems, where a system evaluates its own outputs, or is evaluated by another AI system, to generate feedback for training. It covers the challenges of aligning such systems and the risks that arise when AI models are trained on this kind of feedback, argues that better evaluations are central to AI safety, and presents iterated distillation and amplification as a potential solution.
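As a rough sketch of the automated-feedback loop described above (the names `generate_response`, `critic_score`, and `collect_ai_feedback` are hypothetical placeholders, not from the episode): a generator model samples candidate outputs, a critic model scores them, and the best-scored candidates are kept as training pairs.

```python
import random

# Hypothetical generator: in practice this would be a language model call.
def generate_response(prompt: str) -> str:
    return f"answer to '{prompt}' (variant {random.randint(0, 999)})"

# Hypothetical critic: another model (or the same model) scoring an output.
# A random stand-in here; a real critic would be a learned evaluator.
def critic_score(prompt: str, response: str) -> float:
    return random.random()

def collect_ai_feedback(prompts: list[str], samples_per_prompt: int = 4) -> list[tuple[str, str]]:
    """For each prompt, sample several responses, let the critic score them,
    and keep the highest-scored one as a (prompt, response) training pair."""
    training_pairs = []
    for prompt in prompts:
        candidates = [generate_response(prompt) for _ in range(samples_per_prompt)]
        best = max(candidates, key=lambda r: critic_score(prompt, r))
        training_pairs.append((prompt, best))
    return training_pairs

if __name__ == "__main__":
    pairs = collect_ai_feedback(["Summarise the alignment problem", "Explain reward hacking"])
    for prompt, response in pairs:
        print(prompt, "->", response)
```

The alignment concern the chapter raises maps directly onto this loop: if the critic's scores are a flawed proxy for what we actually want, training on them can amplify those flaws rather than correct them.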