Michael Gschwind, Director/Principal Engineer for PyTorch at Meta Platforms, shares his insights on AI advancements. He discusses the evolution from gaming hardware to modern AI, highlighting the pivotal role of community collaboration. The conversation covers the development of torchchat for large language models, energy-efficient optimization techniques, and the exciting shift toward on-device AI solutions. Gschwind also emphasizes strategic optimization to avoid premature pitfalls in technology development.
ANECDOTE
Gaming Console Accelerators
Michael Gschwind built the accelerator chip for the PlayStation 3.
He also worked on the Xbox 360.
INSIGHT
From Gaming to AI
AI's demand for performance aligned with Michael Gschwind's accelerator expertise.
This led him from gaming consoles to supercomputers and AI.
ADVICE
Research is a Team Sport
Collaboration is essential for research and development.
Embrace open-source contributions and diverse perspectives to build better products.
// MLOps Podcast #274 with Michael Gschwind, Software Engineer, Software Executive at Meta Platforms.
// Abstract
Explore PyTorch's role in boosting model performance, on-device AI processing, and collaborations with tech giants like ARM and Apple. Michael shares his journey from gaming console accelerators to AI, emphasizing the power of community and innovation in driving advancements.
// Bio
Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU inference for production services. He led the development of MultiRay and TextRay, the first deployment of LLMs at scale, serving over a trillion queries per day shortly after rollout. He created the strategy and led the implementation of PyTorch inference optimization with Better Transformer and Accelerated Transformers, bringing Flash Attention, PT2 compilation, and ExecuTorch into the mainstream for LLMs and GenAI models. Most recently, he led the enablement of on-device AI for large language models on mobile and edge devices.
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: https://en.m.wikipedia.org/wiki/Michael_Gschwind
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Michael on LinkedIn: https://www.linkedin.com/in/michael-gschwind-3704222/?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app
Timestamps:
[00:00] Michael's preferred coffee
[00:21] Takeaways
[01:59] Please like, share, leave a review, and subscribe to our MLOps channels!
[02:10] Gaming to AI Accelerators
[11:34] torchchat goals
[18:53] PyTorch benchmarking and competitiveness
[21:28] Optimizing MLOps models
[24:52] GPU optimization tips
[29:36] Cloud vs On-device AI
[38:22] Abstraction across devices
[42:29] PyTorch developer experience
[45:33] AI and MLOps-related antipatterns
[48:33] When to optimize
[53:26] Efficient edge AI models
[56:57] Wrap up