Astral Codex Ten Podcast cover image

Perhaps It Is A Bad Thing That The World's Leading AI Companies Cannot Control Their AIs

Astral Codex Ten Podcast

00:00

I Love Racism - I Love It

Open AI created a chatbot that would never say it loved racism. The main strategy was the same one Redwood used for their AI, our LHF, reinforcement learning by human feedback. Getting punished for I love racism will also make them less likely to say, I love sexism.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app