
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky

The Future of Alignment Research

Holden nominates the following as a thing Nate should update on: we get AIs doing some really useful alignment-related research, despite some combination of the three points here. The AIs being trained using methods basically similar to what's going on now, and/or a distinct lack of signs that they are doing much to, for example, "reflect" or "reconcile conflicting goals." What does "really useful" mean? I'd think the bar should clearly be cleared by something like an AI, or a pretty-easy-to-point-at set of AIs, being in some sense comparable, maybe roughly in the sense described in the article "Bad at Arithmetic, Promising at Math" by Colin McCauley. That's the end of Nate's note.
