
LLMs on Azure - ML 123
Adventures in Machine Learning
The Future of Altruism
The Stanford NLP lab is working on the next generation, the successor to transformers. They're trying to build medium-term storage memory into an accessible expert model that's 1,000 times smaller than current transformer models. So they're looking at: how do we use current, relatively cheap GPU hardware, and how do we offload the model weights so that we're doing some sort of intelligent fetch, or querying a vector DB service, instead of having to read files into memory?
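The episode doesn't spell out an implementation, but the offloading idea being described can be sketched as on-demand weight fetching with a small in-memory cache. The class and function names below are hypothetical illustrations, and the dictionary stands in for whatever remote store (vector DB, blob storage) would actually hold the weights:

```python
from collections import OrderedDict

class LazyWeightStore:
    """Fetch model weights on demand instead of reading all files into memory.

    `fetch_fn` stands in for the remote lookup (e.g. a vector DB query or
    object-store read); it is a hypothetical interface, not from the episode.
    """

    def __init__(self, fetch_fn, cache_size=2):
        self.fetch_fn = fetch_fn      # callable: layer name -> weights
        self.cache = OrderedDict()    # small in-memory LRU cache
        self.cache_size = cache_size

    def get(self, layer_name):
        if layer_name in self.cache:
            self.cache.move_to_end(layer_name)  # mark as recently used
            return self.cache[layer_name]
        weights = self.fetch_fn(layer_name)     # the "intelligent fetch" step
        self.cache[layer_name] = weights
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)      # evict least recently used
        return weights

# Simulated remote store; a real system would hit a DB or object store here.
remote = {f"layer{i}": [float(i)] * 4 for i in range(5)}
store = LazyWeightStore(remote.__getitem__, cache_size=2)
print(store.get("layer0"))  # fetched remotely, then cached
print(store.get("layer0"))  # served from the cache
```

The point of the sketch is that only the layers actually in use occupy memory; everything else stays behind the fetch interface until requested.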