There's some amount by which you incentivize manipulation or deception. For example, it's pretty likely that you ask the AI questions to try to figure out how good a job it did. And you might be incentivizing it to hide from you some mistakes it made so that you think it did a better job. Right now the state-of-the-art way to get AI systems to do things for humans is human feedback. The fundamental limit to human feedback is that you can only give it the thumbs down when it does bad things if you can tell that it's doing bad things.
The U.S. surgeon general, Dr. Vivek Murthy, says social media poses a “profound risk of harm” to young people. Why do some in the tech industry disagree?
Then, Ajeya Cotra, an A.I. researcher, on how A.I. could lead to a doomsday scenario.
Plus: Pass the hat. Kevin and Casey play a game they call HatGPT.
On today’s episode:
Ajeya Cotra is a senior research analyst at Open Philanthropy.
Additional reading:
The surgeon general issued an advisory about the risks of social media for young people.
Ajeya Cotra has researched the existential risks that A.I. poses unless countermeasures are taken.