InstructGPT and RLHF for improving conversation

Filipe asks how to align models and explains the three stages: pre-training, supervised fine-tuning, and RLHF (reinforcement learning from human feedback).

