The fundamental problem here is not so much as whether these systems understand what we want. It's if they care. If you give it a jailbreak prompt and it's all out the damn window, like it'll do who knows what. And this is get back to the block box problem of not understanding what the insights are. This will not be acceptable when we're dealing with powerful agentics systems.
Read the full transcript here.
Does AI pose a near-term existential risk? Why might existential risks from AI manifest sooner rather than later? Can't we just turn off any AI that gets out of control? Exactly how much do we understand about what's going on inside neural networks? What is AutoGPT? How feasible is it to build an AI system that's exactly as intelligent as a human but no smarter? What is the "CoEm" AI safety proposal? What steps can the average person take to help mitigate risks from AI?
Connor Leahy is CEO and co-founder of Conjecture, an AI alignment company focused on making AI systems boundable and corrigible. Connor founded and led EleutherAI, the largest online community dedicated to LLMs, which acted as a gateway for people interested in ML to upskill and learn about alignment. With capabilities increasing at breakneck speed, and our ability to control AI systems lagging far behind, Connor moved on from the volunteer, open-source Eleuther model to a full-time, closed-source model working to solve alignment via Conjecture.
Staff
Music
Affiliates