HumanEval: A Benchmark for Open Source
The team at LLM Packers is developing a bespoke Python model. They are benchmarking it against the benchmark from the Codex paper, which OpenAI published in 2021. It's called HumanEval, and they're hiring people to work on these problems. "We ran HumanEval out of curiosity," says Cevallos. "For us, it was about: what are users getting out of it?"