Why You Can't Train a Prude Network?

In late 20 17, people hadn't beaten that dead horse yet. And i thought, what if we just were scientists about it? What are all the other hypotheses for why you couldn't train a prude network? And one of them was, maybe it's about initialization. So i tried taking a network that had the minimum capacity necessary to learn xor and gave it a reasonableInitialization regime in said, i don't know if those were good hyper brameters but it seemed reasonable. I logged the number of times it found the right decision boundary to get a hard zero or one output on these four data points for exclusive one. It was about 20 % of the time it got

Play episode from 12:24

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app