MLOps.community

AWS Trainium and Inferentia // Kamran Khan and Matthew McClean // #238

Inference vs. Training on Accelerators

Exploring the distinctions between inference and training on accelerators like Inferentia and Trainium, covering hardware requirements, network connectivity, and the deployment of large language models. The discussion also touches on using EC2 instances for both inference and training, along with Kubernetes, Neuron device configuration, the Slurm interface, collaborations with partners like Ray, and support for platforms like Metaflow.
