One open question is whether building AI models gradually and incrementally, rather than in large discrete jumps, could result in safer AI. This approach involves releasing smaller updates to the model, allowing for smoother transitions and reducing the likelihood of sudden new exploits appearing in the next version.
Why do we leave so much low-hanging fruit unharvested in so many parts of life? In what contexts is it better to do a thing than to do a symbolic representation of the thing, and vice versa? How can we know when to try to fix a problem that hasn't yet been fixed? In a society, what's the ideal balance of explorers and exploiters? What are the four simulacra levels? What is a moral "maze"? In the context of AI, can solutions for the problems of generation vs. evaluation also provide solutions for the problems of alignment and safety? Could we solve AI safety issues by financially incentivizing people to find exploits (à la cryptocurrencies)?
Zvi Mowshowitz is the author of Don't Worry About the Vase, a wide-ranging Substack that tries to help us think about, model, and improve the world. He is a rationalist thinker with experience as a professional trader, game designer and competitor, and startup founder. His blog spans diverse topics and currently focuses on extensive weekly AI updates. Read his writings at thezvi.substack.com, or follow him on Twitter / X at @TheZvi.