
The True Cost of Compute
a16z Podcast
The estimated cost of training GPT-3 and other considerations
GPT-3 has 175 billion parameters, which requires a massive amount of computational power: training the model involves roughly 3×10^23 floating point operations. Renting A100 cards and running them at full capacity for that many operations would cost around half a million dollars. However, this analysis is simplistic: it doesn't account for optimization work, memory bandwidth limitations, network limitations, or the need for multiple test runs. In reality, training a large language model costs millions of dollars, even tens of millions in industry. The need for reserve capacity pushes the bill higher still, often adding a zero to the training cost.
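As a rough illustration of the arithmetic behind that half-million-dollar figure, the sketch below divides the quoted 3×10^23 FLOPs by an assumed A100 peak throughput and multiplies by an assumed cloud rental rate. The throughput, price, and perfect-utilization figures are assumptions for illustration, not numbers from the episode.

```python
# Back-of-envelope estimate of GPT-3 training cost on rented A100s.
# Only the total FLOP count (3e23) comes from the episode; the peak
# throughput and hourly rate are assumed, and 100% utilization is the
# same idealization the episode calls simplistic.

TOTAL_FLOPS = 3e23          # training FLOPs quoted in the episode
A100_PEAK_FLOPS = 312e12    # assumed A100 BF16 tensor-core peak, FLOP/s
PRICE_PER_GPU_HOUR = 1.80   # assumed cloud rental rate, USD

gpu_seconds = TOTAL_FLOPS / A100_PEAK_FLOPS   # runtime at full utilization
gpu_hours = gpu_seconds / 3600
cost = gpu_hours * PRICE_PER_GPU_HOUR

print(f"GPU-hours: {gpu_hours:,.0f}")         # roughly 267,000 GPU-hours
print(f"Estimated cost: ${cost:,.0f}")        # roughly $480,000
```

In practice, real training runs achieve far lower utilization because of memory bandwidth and network limits, and they require multiple test runs, which is why the actual cost lands in the millions rather than at this idealized floor.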