
Episode 27: Noam Brown, FAIR, on achieving human-level performance in poker and Diplomacy, and the power of spending compute at inference time
Generally Intelligent
Is Chain of Thought a Good Idea?
Chain of thought is very rudimentary relative to other forms of planning. Monte Carlo tree search says, let me imagine what I would do in the future and get a better estimate of what I should be doing right now. You have this really nice value function in these recreational games that you don't have with all natural language generation tasks. Code generation is one example where you could have a value function, at least in theory.
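To make the "simulate the future to judge the present" idea concrete, here is a minimal sketch. It is not full Monte Carlo tree search and not anything described in the episode: it is flat Monte Carlo search (random playouts per candidate move, no tree) on a toy game chosen purely for illustration, where players alternately take 1 to 3 stones and whoever takes the last stone wins. The win or loss at the end of each playout stands in for the clean value signal that games provide. All function names are hypothetical.

```python
import random

def legal_moves(stones):
    """Moves available in the toy game: take 1, 2, or 3 stones."""
    return [m for m in (1, 2, 3) if m <= stones]

def random_playout(stones, my_turn, rng):
    """Play uniformly random moves to the end of the game.

    Returns True if "I" take the last stone (a win), False otherwise.
    Assumes stones > 0 on entry.
    """
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn

def estimate_move_values(stones, playouts=2000, seed=0):
    """Estimate each move's win rate by simulating random futures."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            values[move] = 1.0  # taking the last stone wins outright
            continue
        # After my move it is the opponent's turn, so my_turn=False.
        wins = sum(random_playout(remaining, False, rng) for _ in range(playouts))
        values[move] = wins / playouts
    return values

def best_move(stones, playouts=2000, seed=0):
    """Pick the move whose simulated futures look best right now."""
    values = estimate_move_values(stones, playouts, seed)
    return max(values, key=values.get)

if __name__ == "__main__":
    print(estimate_move_values(5))
    print(best_move(5))
```

The estimates here assume a uniformly random opponent, which is the weakness a real tree search addresses by reusing statistics to deepen the promising lines. The point of the sketch is only that a terminal win/loss signal makes each candidate move easy to score, which is what open-ended text generation usually lacks.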