TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture

Deep Papers

Clarifying metrics: accuracy vs. coverage

Yongchao defines accuracy as the average score across agents, and coverage as an upper bound: a question counts as covered if any agent answers it correctly.
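
The two metrics can be made concrete with a small sketch. It assumes each agent's result on each question is recorded as a boolean; the function names and the sample `results` grid are hypothetical and are not taken from the episode or the TUMIX paper.

```python
from typing import Sequence


def accuracy(correct: Sequence[Sequence[bool]]) -> float:
    """Average agent score: the mean over every (agent, question) outcome."""
    total = sum(len(row) for row in correct)
    return sum(sum(row) for row in correct) / total


def coverage(correct: Sequence[Sequence[bool]]) -> float:
    """Upper bound: the fraction of questions that at least one agent gets right."""
    per_question = list(zip(*correct))  # transpose to one tuple per question
    return sum(any(col) for col in per_question) / len(per_question)


# Hypothetical outcomes: rows are agents, columns are questions.
results = [
    [True, False, True, False],
    [True, True, False, False],
    [False, False, True, False],
]

print(f"accuracy = {accuracy(results):.3f}")
print(f"coverage = {coverage(results):.3f}")
```

Coverage can never fall below accuracy here, because a question that any agent solves counts in full toward coverage but only in proportion to the number of solving agents toward accuracy.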

Play episode from 15:56