AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Monitoring and Ensuring Compliance in Large-Scale Machine Learning Training
This work focuses on a mechanism to monitor computing hardware used for large-scale machine learning training to ensure compliance with agreed rules. The system aims to provide high confidence to governments that no-actor users are not using specialized ML chips in violation of regulations, while still allowing the use of consumer computing devices and safeguarding the privacy of ML practitioners' models, data, and hyperparameters. The system involves interventions at three stages: using on-chip firmware to save snapshots of neural network weights, saving information about each training run to prove details to inspectors, and monitoring the chip supply chain to prevent evasion of discovery through untracked chips.