The Jim Rutt Show cover image

Currents 087: Shivanshu Purohit on Open-Source Generative AI

The Jim Rutt Show

00:00

The Effect of Order of Training on Model Memorization

The order of training actually doesn't affect how the models memorize something. We found out that the model still like memorize a lot of stuff, especially if the data is very sparse. If you don't pre process your text good enough and say for some reason someone's credit card information ended up in your data set, even if it was just one instance  the model can more or less verbatim memorize the information. And if you open source it or like give the checkpoint to someone, the other person can just like extract the information if they know what they're doing.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app