Defining a Second Progress Measure for the Trigonometric Algorithm - Excluded Loss

OK, so at this point, now that we understand the paper a bit better, there's also a second progress measure, which we called excluded loss. This is where you delete the ten key directions, the directions in input token space that this trigonometric algorithm depends on, and look at the loss on the training data. During memorization it mostly tracks train loss, but then over the course of circuit formation it diverges and gets worse and worse. So I'm not fiddling and then running things through the model, which is a kind of weird thing to do and reasonable to take issue with. We're trying to get
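As a rough illustration of the excluded-loss idea described above, here is a minimal NumPy sketch. It is not the paper's code. It assumes the task is addition modulo a prime `p`, that the "ten key directions" are the cosine and sine vectors of five key frequencies taken over the `p` output classes, and that deleting them means projecting them out of the logits. The modulus, the frequency list, the toy logits, and every function name are hypothetical choices made for this example.

```python
import numpy as np

def fourier_directions(p, key_freqs):
    """Unit-norm cos/sin vectors over the p classes, two per frequency.

    Assumes p is odd and the frequencies are distinct integers in
    1..(p - 1) // 2, so the returned rows are mutually orthogonal.
    """
    c = np.arange(p)
    dirs = []
    for k in key_freqs:
        for fn in (np.cos, np.sin):
            v = fn(2 * np.pi * k * c / p)
            dirs.append(v / np.linalg.norm(v))
    return np.stack(dirs)  # shape (2 * len(key_freqs), p)

def cross_entropy(logits, labels):
    """Mean cross-entropy computed with a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def excluded_loss(logits, labels, key_freqs):
    """Loss after projecting the key directions out of the logits."""
    dirs = fourier_directions(logits.shape[1], key_freqs)
    ablated = logits - (logits @ dirs.T) @ dirs
    return cross_entropy(ablated, labels)

# Toy "fully generalizing" model: its logits are built only from the key
# frequencies, so excluding them should destroy its performance.
p = 113                           # hypothetical modulus
key_freqs = [14, 35, 41, 42, 52]  # hypothetical: 5 frequencies, 10 directions
rng = np.random.default_rng(0)
a = rng.integers(0, p, size=200)  # stand-in for the training inputs
b = rng.integers(0, p, size=200)
labels = (a + b) % p

c = np.arange(p)
logits = np.zeros((len(a), p))
for k in key_freqs:
    logits += 5.0 * np.cos(2 * np.pi * k * ((a + b)[:, None] - c[None, :]) / p)

full = cross_entropy(logits, labels)
excl = excluded_loss(logits, labels, key_freqs)
print(f"train loss: {full:.4f}, excluded loss: {excl:.4f}")
```

In this toy setup the excluded loss falls back to chance level, `log(p)`, because nothing is left in the logits once the key directions are removed. A model that had only memorized the training data would not rely on those directions, so its excluded loss would stay close to its train loss, which is the divergence the excerpt describes.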

Play episode from 03:09:46
