
August 4th, 2023 | (next Rich)

Hacker News Recap


The Nondeterminism of GPU Calculations in AI-ML

The nondeterminism observed in GPT-4, and potentially GPT-3.5 Turbo, may not be caused by a floating-point calculation bug; it may instead be related to the sparse mixture-of-experts architecture. In models such as GPT-4, tokens within the same batch may compete for available spots in expert buffers. A script that counts unique completions from different chat and completion models confirmed the higher nondeterminism in GPT-4. Opinions differed on the quality of code in the ML/AI/DS fields.
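To make the expert-buffer idea concrete, here is a toy sketch of capacity-limited routing. It is not how GPT-4 is implemented (those details are not public); the function `route_batch`, the router scores, the first-come-first-served order, and the overflow-to-next-expert policy are all assumptions chosen for illustration. It only shows that when experts have limited room, the expert a token ends up with can depend on which other tokens share its batch.

```python
from typing import List, Optional


def route_batch(scores: List[List[float]], capacity: int) -> List[Optional[int]]:
    """Assign each token to its highest-scoring expert that still has room.

    scores[t][e] is a hypothetical router score of token t for expert e.
    Tokens are served in batch order (an assumption); a token whose
    experts are all full is dropped and gets None.
    """
    num_experts = len(scores[0])
    load = [0] * num_experts  # tokens currently held by each expert buffer
    assignment: List[Optional[int]] = []
    for token_scores in scores:
        # Experts ordered from most to least preferred for this token.
        preferred = sorted(range(num_experts), key=lambda e: -token_scores[e])
        chosen = None
        for e in preferred:
            if load[e] < capacity:
                load[e] += 1
                chosen = e
                break
        assignment.append(chosen)
    return assignment


my_token = [0.9, 0.1]  # this token prefers expert 0

# Batch A: the other token prefers expert 1, so there is no competition.
batch_a = [[0.2, 0.8], my_token]
# Batch B: the other token also prefers expert 0 and takes the only slot.
batch_b = [[0.7, 0.3], my_token]

route_a = route_batch(batch_a, capacity=1)
route_b = route_batch(batch_b, capacity=1)

# The same token lands on a different expert depending on its batch mates.
print(route_a[-1], route_b[-1])
```

In this sketch the same input is processed by a different expert in each batch, so its output could differ even though nothing about the input changed. Because a caller does not control which other requests are batched alongside theirs, this kind of effect would look like randomness from the outside.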
