
Sara Hooker: Cohere For AI, the Hardware Lottery, and DL Tradeoffs
The Gradient: Perspectives on AI
00:00
Is There a Long Tail in the Model?
The approach of learning these mappings for a small subset of examples, where our model is spending an enormous amount of capacity basically forming a lookup table, could be better solved in the data pipeline. It's almost like expending so much capacity just to memorize these examples. Clearly the rate of learning is very low on these examples. And cleaning up the data, say, if an example is very atypical, can we do something to upweight it or to think differently about those examples? Sure.
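A minimal sketch of the idea being described, not a method from the episode: track per-example loss over training, flag examples the model is still failing on late in training (a low rate of learning), and upweight them or route them to data cleanup. The quantile threshold and the `reweight` policy here are illustrative assumptions.

```python
import numpy as np

def flag_long_tail(loss_history, threshold=0.9):
    """Flag examples whose loss stays high late in training.

    loss_history: array of shape (epochs, n_examples) of per-example losses.
    Returns a boolean mask of candidate long-tail / memorization-prone
    examples, defined here (an assumed criterion) as examples whose
    final-epoch loss sits above the given quantile of all final losses.
    """
    final_losses = loss_history[-1]
    cutoff = np.quantile(final_losses, threshold)
    return final_losses >= cutoff

def reweight(mask, base_weight=1.0, tail_weight=2.0):
    """Upweight flagged examples; one possible 'think differently' policy."""
    return np.where(mask, tail_weight, base_weight)
```

In practice the flagged set could instead be sent for label auditing or removal rather than upweighting; the mask is the useful artifact either way.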