Guardrails, RL training, and real-world risks

Jeffrey discusses RLHF, constitutional AI, the limitations of guardrails, and the rising hacking capabilities of advanced models.

Play episode from 37:23

