People increasingly using GPD4 to study the text that it itself generates. Last, we have training diffusion models with reinforced learning. Instead of having just this matching likelihood objective, you can do this multi-step denoising operation. They make a reward via another language model.