Testing Memorization vs Intelligence in Model Training
Training models on internet text full of random facts produces strong results on tests of memorization rather than evidence of intelligence or creativity. Standard benchmarks like MMLU and BIG-bench rely on multiple-choice questions, so a high score shows that a model has memorized a lot of random facts, not that it is intelligent or creative. Even for advanced models like Google's Gemini Ultra, performance on these benchmarks seems to be plateauing. Common benchmarks also do not measure long-horizon task performance, such as the ability to complete a job over a month while learning from only a limited number of data points.