
Managing Research Needs at the University of Michigan using Kubernetes w/ Bob Killen - #344
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Managing Long-Running Workloads in Machine Learning and High-Performance Computing
This chapter explores the complexities involved in managing long-running jobs within machine learning and high-performance computing settings. It discusses institutional support for research needs, implications of job duration limits, and the impact of increasing cloud-native application usage on AI workloads.