Efficient Deployment of Models at the Edge // Krishna Sridhar // #284
Jan 17, 2025
In this engaging discussion, Krishna Sridhar, an engineering leader at Qualcomm and former co-founder of Tetra AI, dives into the efficient deployment of AI models at the edge. He shares insights on using Qualcomm AI Hub to optimize models for on-device performance, highlighting its application in real-time sports tracking and mobile photography. Krishna also explores the balance between hardware and software optimization in modern devices. Plus, he reveals how innovations in edge computing are transforming everyday AI applications while ensuring user privacy.
Qualcomm's AI Hub simplifies model deployment on edge devices, enabling developers to efficiently optimize and run machine learning applications.
Real-time applications like AI-enhanced cricket tracking illustrate how edge computing empowers advanced analytics for everyday users with cost-effective solutions.
Deep dives
Innovations in AI on the Edge
Qualcomm is focusing on enhancing AI deployment on the edge by simplifying the process for developers. Their Qualcomm AI Hub streamlines how machine learning models can be brought into the platform and efficiently run on Qualcomm chips, significantly reducing complexity. The platform allows developers to upload their trained models, identify the device they're targeting, and receive optimized code for quick deployment. This approach encourages rapid innovation and accessibility in using AI across various devices, from smartphones to IoT applications.
Real-Time Applications and Use Cases
The podcast highlights real-time applications of AI on edge devices, in particular a mobile app built to track cricket matches. The app uses a smartphone camera to provide real-time ball tracking and video highlights, functions previously limited to expensive systems like Hawk-Eye. This lets amateur players and enthusiasts use advanced analytics at a fraction of the cost, and it illustrates how on-device AI can bring robust functionality to everyday recreational activities.
Balancing Computational Power and Efficiency
The conversation underscores a critical challenge in deploying AI models: delivering enough computational power while keeping energy and memory use low. With the growth of large language models and other sophisticated AI techniques, memory has emerged as a significant performance bottleneck, sparking discussions on better memory management strategies. Qualcomm's systems are designed to take advantage of heterogeneous computing, allowing tasks to be processed by different types of processors to balance speed against energy consumption. This careful optimization is vital for integrating AI technologies into resource-constrained environments.
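As a rough illustration of why memory becomes the bottleneck for large models, the memory needed for the weights alone is approximately the parameter count times the bytes stored per parameter. The sketch below works through that arithmetic. The 7-billion-parameter model size and the list of precisions are illustrative assumptions, not figures from the episode, and the estimate ignores activations and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# The model size and precision list are illustrative assumptions,
# not figures quoted in the episode.

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: int, precision: str) -> float:
    """Return the memory needed to hold the weights alone, in GB (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = 7_000_000_000  # hypothetical model size
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(params, precision):.1f} GB")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is one reason memory use gets so much attention on devices with only a few gigabytes of RAM.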
Future Innovations and Market Trends
Looking ahead, Qualcomm is aligning its AI technology with emerging trends such as generative AI and language models that can operate locally on devices, enhancing user experience without sacrificing privacy. The company is also exploring partnerships with various model creators to facilitate deploying pre-trained models directly onto Qualcomm devices, broadening the scope of applications. Insights also indicate that security and home automation are increasingly becoming hotspots for innovation, propelled by edge computing capabilities. This evolution points toward a future where intelligent devices can perform complex tasks seamlessly, reshaping everyday interactions.
Krishna Sridhar is an experienced engineering leader passionate about building wonderful products powered by machine learning.
Efficient Deployment of Models at the Edge // MLOps Podcast #284 with Krishna Sridhar, Vice President of Qualcomm.
Big shout out to Qualcomm for sponsoring this episode!
// Abstract
Qualcomm® AI Hub helps to optimize, validate, and deploy machine learning models on-device for vision, audio, and speech use cases.
With Qualcomm® AI Hub, you can:
Convert trained models from frameworks like PyTorch and ONNX for optimized on-device performance on Qualcomm® devices.
Profile models on-device to obtain detailed metrics including runtime, load time, and compute unit utilization.
Verify numerical correctness by performing on-device inference.
Easily deploy models using Qualcomm® AI Engine Direct, TensorFlow Lite, or ONNX Runtime.
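The numerical-correctness step in the list above amounts to comparing the outputs of the original model with the outputs produced on the device. The following is a minimal, library-free sketch of such a comparison. The function names and the choice of metrics (maximum absolute error and peak signal-to-noise ratio) are assumptions made for illustration, not Qualcomm® AI Hub's actual API or validation method.

```python
import math
from typing import Sequence


def _check_shapes(reference: Sequence[float], candidate: Sequence[float]) -> None:
    """Reject empty or mismatched output vectors before comparing them."""
    if len(reference) == 0 or len(reference) != len(candidate):
        raise ValueError("outputs must be non-empty and of equal length")


def max_abs_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between the two outputs."""
    _check_shapes(reference, candidate)
    return max(abs(r - c) for r, c in zip(reference, candidate))


def psnr_db(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Peak signal-to-noise ratio in dB, using the reference's peak magnitude."""
    _check_shapes(reference, candidate)
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical outputs
    peak = max(abs(r) for r in reference)
    if peak == 0.0:
        return -math.inf  # all-zero reference but non-zero error
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical outputs: the framework's result vs. the on-device result.
    framework_out = [0.10, 0.70, 0.20]
    device_out = [0.11, 0.69, 0.20]
    print("max abs error:", max_abs_error(framework_out, device_out))
    print("PSNR (dB):", psnr_db(framework_out, device_out))
```

What counts as an acceptable error depends on the model and the task, so any pass/fail threshold would need to be chosen per use case.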
The Qualcomm® AI Hub Models repository contains a collection of example models that use Qualcomm® AI Hub to optimize, validate, and deploy models on Qualcomm® devices.
Qualcomm® AI Hub automatically translates models from the source framework to the device runtime, applies hardware-aware optimizations, and validates performance and numerical accuracy on physical devices. The system automatically provisions devices in the cloud for on-device profiling and inference. The following image shows the steps taken to analyze a model using Qualcomm® AI Hub.
// Bio
Krishna Sridhar leads engineering for Qualcomm™ AI Hub, a system used by more than 10,000 AI developers spanning 1,000 companies to run more than 100,000 models on Qualcomm platforms.
Prior to joining Qualcomm, he was Co-founder and CEO of Tetra AI, which made it easy to efficiently deploy ML models on mobile/edge hardware.
Prior to Tetra AI, Krishna helped design Apple's Core ML, a software system that was mission-critical to running several experiences at Apple, including Camera, Photos, Siri, FaceTime, Watch, and many more, across all major Apple device operating systems and all hardware and IP blocks.
He has a Ph.D. in computer science from the University of Wisconsin-Madison, and a bachelor’s degree in computer science from Birla Institute of Technology and Science, Pilani, India.
// MLOps Swag/Merch
https://shop.mlops.community/
// Related Links
Website: https://www.linkedin.com/in/srikris/
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Krishna on LinkedIn: https://www.linkedin.com/in/srikris/