
Arctic Embed with Luke Merrick, Puxuan Yu, and Charles Pierse - Weaviate Podcast #110!
Weaviate Podcast
00:00
The Economics of Model Pre-Training
This chapter explores the financial and strategic challenges of large-scale model pre-training, examining cost factors such as batch size and hardware resources like H100 GPUs. It highlights the importance of data quality over quantity and introduces concepts such as multi-stage training and Matryoshka Representation Learning (MRL). The discussion weighs model parameter count, training duration, and performance across different languages, and argues for a deliberate approach to allocating compute in model training.