Shahar Avin–Intelligence Rising, AI Governance

The Inside View

00:00

How to Red Team AI Systems

Red teaming is an adversarial relationship where you pretend to be the adversary and break whatever system or process you have in place. I think there's more and more adversarial examples when building self-driving cars. With the rising of language models, people can just test models like HP3 and find ways in which to make these behaviors. You might worry that when you get to very advanced systems, this method no longer works and you need different kinds of solutions.

Transcript

chevron_right

Play full episode

chevron_right

Transcript

Episode notes

Shahar is a senior researcher at the Center for the Study of Existential Risk in Cambridge. In his past life, he was a Google Engineer, though right now he spends most of your time thinking about how to prevent the risks that occur if companies like Google end up deploying powerful AI systems, by organizing AI Governance role-playing workshops.

In this episode, we talk about a broad variety of topics, including how we could apply the lessons from running AI Governance workshops to governing transformative AI, AI Strategy, AI Governance, Trustworthy AI Development and end up answering some Twitter questions.

Youtube: https://youtu.be/3T7Gpwhtc6Q

Transcript: https://theinsideview.ai/shahar

Host: https://twitter.com/MichaelTrazzi

Shahar: https://www.shaharavin.com

Outline

(00:00) Highlights

(01:20) Intelligence Rising

(06:07) Measuring Transformative AI By The Scale Of Its Impact

(08:09) Comprehensive AI Services

(11:38) Automating CEOs Through AI Services

(14:21) Towards A "Tech Company Singularity"

(15:58) Predicting AI Is Like Predicting The Industrial Revolution

(19:57) 50% Chance Of Human-brain Performance By 2038

(22:25) AI Alignment Is About Steering Powerful Systems Towards Valuable Worlds

(23:51) You Should Still Worry About Less Agential Systems

(28:07) AI Strategy Needs To Be Tested In The Real World To Not Become Theoretical Physics

(31:37) Playing War Games For Real-time Partial-information Advesarial Thinking

(34:50) Towards World Leaders Playing The Game Because It’s Useful

(39:31) Open Game, Cybersecurity, Government Spending, Hard And Soft Power

(45:21) How Cybersecurity, Hard-power Or Soft-power Could Lead To A Strategic Advantage

(48:58) Cybersecurity In A World Of Advanced AI Systems

(52:50) Allocating AI Talent For Positive R&D ROI

(57:25) Players Learn To Cooperate And Defect

(01:00:10) Can You Actually Tax Tech Companies?

(01:02:10) The Emergence Of Bilateral Agreements And Technology Bans

(01:03:22) AI Labs Might Not Be Showing All Of Their Cards

(01:06:34) Why Publish AI Research

(01:09:21) Should You Expect Actors To Build Safety Features Before Crunch Time

(01:12:39) Why Tech Companies And Governments Will Be The Decisive Players

(01:14:29) Regulations Need To Happen Before The Explosion, Not After

(01:16:55) Early Regulation Could Become Locked In

(01:20:00) What Incentives Do Companies Have To Regulate?

(01:23:06) Why Shahar Is Terrified Of AI DAOs

(01:27:33) Concrete Mechanisms To Tell Apart Who We Should Trust With Building Advanced AI Systems

(01:31:19) Increasing Privacy To Build Trust

(01:33:37) Sensibilizing To Privacy Through Federated Learning

(01:35:23) How To Motivate AI Regulations

(01:37:44) How Governments Could Start Caring About AI risk

(01:39:12) Attempts To Regulate Autonomous Weapons Have Not Resulted In A Ban

(01:40:58) We Should Start By Convincing The Department Of Defense

(01:42:08) Medical Device Regulations Might Be A Good Model Audits

(01:46:09) Alignment Red Tape And Misalignment Fines

(01:46:53) Red Teaming AI systems

(01:49:12) Red Teaming May Not Extend To Advanced AI Systems

(01:51:26) What Climate change Teaches Us About AI Strategy

(01:55:16) Can We Actually Regulate Compute

(01:57:01) How Feasible Are Shutdown Swi

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app

Home Top podcasts Popular guests Top books