Language Model Training Data

There is some work towards saying, like, maybe we don't want lots of pornography in our language model training data. And so there's this practice, I'm sorry I don't have the guy's name, but there's this list of 400-some-odd very bad words that were up there on GitHub. [unclear] I've got a couple of really nice examples of that for you.
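The practice described above, dropping documents that contain any word from a fixed list, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual filter anyone used: the names `BLOCKLIST`, `contains_blocked_word`, and `filter_documents` are hypothetical, the two list entries are placeholders standing in for the real list, and whole-word, case-insensitive matching is an assumed design choice.

```python
import re

# Hypothetical placeholder entries; the real word list mentioned in the
# talk is not reproduced here.
BLOCKLIST = {"badword", "worseword"}


def contains_blocked_word(text, blocklist=BLOCKLIST):
    """Return True if any whole word in `text` (case-insensitive) is in `blocklist`."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in blocklist for word in words)


def filter_documents(docs, blocklist=BLOCKLIST):
    """Keep only the documents that contain no blocked word."""
    return [doc for doc in docs if not contains_blocked_word(doc, blocklist)]


docs = ["A harmless sentence.", "This one has a BadWord in it."]
kept = filter_documents(docs)
```

Note that a filter like this discards a whole document on a single match, regardless of the context in which the word appears.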

