'Simulators' by Janus

LessWrong (Curated & Popular)

GPT Simulations for the Alignment Problem

A partially observed simulation is more efficient to compute, but it requires the simulator to model semantics. A couple of pages of text severely under-determines the real-world process that generated it, so GPT's simulations are likewise under-determined. Conditioning can be controlled to an impressive extent by prompt programming, but it is not clear how out-of-distribution conditions will be interpreted by powerful simulators. How do we expect pre-trained simulators to diverge from the simulation objective? What kinds of conditional distributions could be used as training data for a simulator? We might also control generative models by conditioning on latents.

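A minimal sketch of prompt programming as conditioning, assuming the Hugging Face transformers library and the small gpt2 checkpoint as a stand-in for a more powerful simulator: the prompt fixes the condition, and sampled continuations are rollouts of simulations consistent with it. The prompt text here is illustrative, not from the episode.

```python
# Sketch: prompt programming as conditioning a GPT-style simulator.
# Assumes the Hugging Face `transformers` library (with PyTorch installed)
# and the small `gpt2` checkpoint as a stand-in simulator.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The prompt is the observed prefix. It under-determines the process that
# generated it, so different samples correspond to different simulations
# consistent with the same condition.
prompt = "The alignment researcher opened the lab notebook and wrote:"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample several continuations (rollouts) from the conditional distribution.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=40,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for i, seq in enumerate(outputs):
    print(f"--- rollout {i} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```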