
Training superhuman coding models at Cursor
Cursor
Tests as Rewards and Their Limits
Discussion of using tests for RL rewards, their sparsity, and risks of gaming weak test suites.
Transcript
