The Challenges of Running a Machine Learning Workload on a Device
The key challenges in running deep learning workloads on device are the large model size and the inference latency.
Model size grew significantly, from under 100 million parameters to 1.1 billion, adding complexity to the workload.
The denoising steps and the default stage further contribute to the complexity of running the workload on device.
The automation and robustness of the software stack played a crucial role in handling the increased workload.
The second challenge was reducing inference latency, which required research and optimization across the entire stack, from models and system algorithms down to hardware.
Quantization research, specifically the AdaRound technique, allowed the model to run efficiently on device without retraining.
AdaRound improved the signal-to-noise ratio, resulting in higher-quality pixel generation.
AdaRound takes a more holistic approach to quantization, considering both a per-channel and a per-tensor basis.
AdaRound effectively adds one more bit of precision, yielding a significant increase in signal-to-noise ratio.
AdaRound enables an efficient data format without the need for model retraining.
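To make the AdaRound idea concrete, here is a minimal toy sketch (not the actual AdaRound algorithm, which learns rounding choices via an optimization with a regularizer). It quantizes a tiny layer to a 4-bit grid and, instead of always rounding each weight to the nearest grid point, brute-forces the per-weight choice of rounding up or down so that the layer's *output* error on calibration data is minimized. The weight shapes, scales, and data are made up for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=6)             # tiny layer: 6 weights
x = rng.normal(size=(256, 6))      # hypothetical calibration activations
scale = np.abs(w).max() / 7        # symmetric 4-bit quantization grid

lo = np.floor(w / scale)           # each weight can round down...
hi = lo + 1                        # ...or up

def output_err(q):
    """Squared error of the quantized layer's output vs. the float layer."""
    return np.sum((x @ w - x @ (q * scale)) ** 2)

def snr_db(q):
    """Signal-to-noise ratio of the layer output, in dB."""
    y = x @ w
    e = y - x @ (q * scale)
    return 10 * np.log10(np.sum(y ** 2) / np.sum(e ** 2))

# Baseline: round each weight to the nearest grid point.
nearest = np.round(w / scale)

# Toy "adaptive rounding": try all 2^6 up/down combinations and keep the
# one that minimizes the layer-output error (AdaRound solves this with a
# continuous relaxation instead of brute force).
best = min(
    (np.where(bits, hi, lo) for bits in itertools.product([0, 1], repeat=6)),
    key=output_err,
)

print(f"nearest rounding SNR: {snr_db(nearest):.2f} dB")
print(f"adaptive rounding SNR: {snr_db(best):.2f} dB")
```

Round-to-nearest is always one of the searched combinations, so the adaptive choice can only match or improve the output SNR; in practice the gain is what lets a lower-bit format reach the quality of a higher-bit one.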