The Economics of Model Pre-Training
This chapter explores the financial and strategic challenges of large-scale model pre-training, examining cost drivers such as batch size choices and hardware requirements like H100 GPUs. It highlights the importance of data quality over sheer quantity and introduces concepts such as multi-stage training and Margin Ranking Loss (MRL). The discussion emphasizes the trade-off between model parameters, training duration, and performance across different languages, advocating a nuanced approach to resource allocation and optimization in model training.
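To make the cost discussion concrete, here is a back-of-the-envelope sketch of pre-training cost, not taken from the chapter: it assumes the common ~6·N·D FLOPs rule for dense transformer training, an assumed ~989 TFLOP/s peak BF16 throughput per H100, and illustrative utilisation and rental-price figures.

```python
# Rough pre-training cost estimate (all constants below are assumptions,
# not figures from the chapter).

def training_cost_estimate(
    n_params: float,                 # model parameters, e.g. 7e9
    n_tokens: float,                 # training tokens, e.g. 2e12
    peak_flops: float = 989e12,      # assumed H100 BF16 dense peak, FLOP/s
    mfu: float = 0.4,                # assumed model FLOPs utilisation
    usd_per_gpu_hour: float = 3.0,   # illustrative rental price
):
    total_flops = 6 * n_params * n_tokens          # forward + backward estimate
    gpu_seconds = total_flops / (peak_flops * mfu) # effective throughput
    gpu_hours = gpu_seconds / 3600
    return gpu_hours, gpu_hours * usd_per_gpu_hour

# Example: a 7B-parameter model trained on 2T tokens.
hours, dollars = training_cost_estimate(7e9, 2e12)
print(f"~{hours:,.0f} H100-hours, ~${dollars:,.0f} at $3/GPU-hour")
```

Under these assumptions a 7B-parameter model on 2T tokens lands around 60K H100-hours; the point is only to show how parameter count, token count, and hardware efficiency combine into a dollar figure.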
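If MRL here denotes the standard margin ranking loss (as implemented by PyTorch's nn.MarginRankingLoss), a minimal sketch with made-up scores and margin looks like this:

```python
import torch
import torch.nn as nn

# Margin ranking loss: max(0, -y * (s1 - s2) + margin), where y = +1 means
# s1 should rank above s2 by at least the margin, and y = -1 the opposite.
loss_fn = nn.MarginRankingLoss(margin=0.5)

s1 = torch.tensor([0.8, 0.2, 0.6])   # scores of the items that should rank higher
s2 = torch.tensor([0.3, 0.4, 0.9])   # scores of the items that should rank lower
y = torch.ones(3)                    # +1 targets: prefer s1 over s2

print(loss_fn(s1, s2, y))            # non-zero wherever s1 - s2 < margin
```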