
LLMs on Azure - ML 123
Adventures in Machine Learning
The Future of Altruism
The Stanford NLP lab is working on the next generation, the successor to transformers. They're trying to build medium-term storage memory into an accessible expert model that's 1,000 times smaller than current transformer models. So they're looking at: how do we use current, relatively cheap GPU hardware, and how do we offload the model weights so that we're doing some sort of intelligent fetch, or querying a vector DB service, instead of having to read files into memory?
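The episode doesn't spell out an implementation, but the offloading idea being described can be sketched as on-demand weight fetching with a small in-memory cache. The class and function names below are hypothetical illustrations, and the dictionary stands in for whatever remote store (vector DB, blob storage) would actually hold the weights:

```python
from collections import OrderedDict

class LazyWeightStore:
    """Fetch model weights on demand instead of reading all files into memory.

    `fetch_fn` stands in for the remote lookup (e.g. a vector DB query or
    object-store read); it is a hypothetical interface, not from the episode.
    """

    def __init__(self, fetch_fn, cache_size=2):
        self.fetch_fn = fetch_fn      # callable: layer name -> weights
        self.cache = OrderedDict()    # small in-memory LRU cache
        self.cache_size = cache_size

    def get(self, layer_name):
        if layer_name in self.cache:
            self.cache.move_to_end(layer_name)  # mark as recently used
            return self.cache[layer_name]
        weights = self.fetch_fn(layer_name)     # the "intelligent fetch" step
        self.cache[layer_name] = weights
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)      # evict least recently used
        return weights

# Simulated remote store; a real system would hit a DB or object store here.
remote = {f"layer{i}": [float(i)] * 4 for i in range(5)}
store = LazyWeightStore(remote.__getitem__, cache_size=2)
print(store.get("layer0"))  # fetched remotely, then cached
print(store.get("layer0"))  # served from the cache
```

The point of the sketch is that only the layers actually in use occupy memory; everything else stays behind the fetch interface until requested.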