The Limits of Gradient Descent
Hunter: It could well be true that you only need gradient descent to find the optimal. But how good is that optimal, right? Like we know language models are good, but what's the peak of that?
Logan: I do expect different architectures to pop out from like automating in my research.