
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
LessWrong (Curated & Popular)
The Future of Alignment Research
Holden nominates the following as a thing Nate should update on: we get AIs doing some really useful alignment-related research, despite some combination of the points here — the AIs being trained using methods basically similar to what's going on now, and/or a distinct lack of signs that they are doing much to, for example, "reflect" or reconcile conflicting goals. What does "really useful" mean? I'd think the bar should clearly be cleared by something like an AI (or a pretty easy to point out set of AIs) being, in some sense, maybe roughly comparable to the sense described in the article "Bad at Arithmetic, Promising at Math" by Colin McCauley. (That's the end of Nate's note.)


