Megatron for Large Language Models
I'm imagining that what you have done with Megatron for large language models, you will do for other models. You might want to do that across different types of models and different types of infrastructure. And I'm wondering if there's some abstracted learning from this exercise, in this project, that would guide someone who wants to do something similar for another type of model or another type of infrastructure. Yes. NVIDIA thinks of our work as accelerated computing, which means optimizing the full stack, from the application to the framework, to the libraries and compilers, to the GPUs, the interconnects, the systems; it all goes in.
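One concrete learning from Megatron that transfers to other models is tensor parallelism: splitting a layer's weight matrix across devices so each computes only its slice. Below is a minimal sketch of the column-parallel case using NumPy; the function name and shapes are illustrative, not Megatron's actual API, and real implementations shard across GPUs with collective communication rather than a local loop.

```python
import numpy as np

# Sketch of Megatron-style column-parallel tensor parallelism (illustrative,
# not Megatron's actual API). The weight matrix A of Y = X @ A is split
# column-wise across "devices"; each shard computes its slice independently,
# and concatenating the partial outputs recovers the full result.

def column_parallel_linear(x: np.ndarray, a: np.ndarray, num_shards: int) -> np.ndarray:
    """Compute X @ A by sharding A's columns across num_shards workers."""
    shards = np.split(a, num_shards, axis=1)           # one weight slice per device
    partial_outputs = [x @ shard for shard in shards]  # independent per-device matmuls
    return np.concatenate(partial_outputs, axis=1)     # gather along the output dim

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of activations
a = rng.standard_normal((8, 16))  # weight matrix, 16 output features

# The sharded computation matches the single-device result.
assert np.allclose(column_parallel_linear(x, a, num_shards=4), x @ a)
```

The design point this illustrates is that the per-shard computation needs no communication until results are combined, which is what makes the technique reusable across model types and hardware interconnects.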