How to Apply BERT to a Problem?
BERT is like a hundred-million-parameter model, which, unless you have a lot of data, is not really justified. Most of the interesting stuff was happening in just those surface layers. I mean, you could really reduce the model down, take away the 700-odd hidden layers and still get the same level of accuracy. We suspect this has to do with the physical properties of the amino acids themselves.
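To put a rough number on the "hundred million parameter" remark, here is a minimal sketch that counts the weights of a BERT-style encoder and shows how the total shrinks when only the first few layers are kept. The speaker does not say which BERT variant was used, so the default sizes below (12 layers, hidden size 768, vocabulary of 30522) are an assumption taken from the published BERT-base uncased configuration, and `bert_param_count` is a hypothetical helper name, not part of any library.

```python
def bert_param_count(num_layers=12, hidden=768, intermediate=3072,
                     vocab=30522, max_positions=512, type_vocab=2):
    """Count trainable parameters of a BERT-style encoder.

    Defaults are assumed from the BERT-base uncased configuration;
    the episode does not state which variant was used.
    """
    # Each LayerNorm has a scale and a bias vector of size `hidden`.
    layer_norm = 2 * hidden

    # Token, position and segment embedding tables, plus one LayerNorm.
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm

    # Self-attention: query, key, value and output projections
    # (weights + biases), followed by a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm

    # Feed-forward block: expand to `intermediate`, project back to
    # `hidden`, followed by a LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)

    # Pooler: a single dense layer applied to the first token.
    pooler = hidden * hidden + hidden

    return embeddings + num_layers * (attention + feed_forward) + pooler


if __name__ == "__main__":
    full = bert_param_count()
    shallow = bert_param_count(num_layers=2)  # keep only two layers
    print(f"full model:    {full:,} parameters")
    print(f"2-layer model: {shallow:,} parameters")
    print(f"reduction:     {1 - shallow / full:.0%}")
```

Under these assumptions, most of the saving from keeping only the surface layers comes from the repeated encoder blocks. The embedding tables stay the same size no matter how many layers are dropped.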