
Training superhuman coding models at Cursor
Cursor
Using Production Inference for Training
The speakers consider reusing production user inference as training data for RL, and the trade-offs that arise when rollouts are real user actions.
Transcript
