Language Model Training Data
There is some work towards saying, like, maybe we don't want lots of pornography in our language model training data. And so there's this practice, I'm sorry I don't have the guy's name, but there's this list of 400-some-odd very bad words that was up there on GitHub. [inaudible] I've got a couple of really nice examples of that for you.
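The excerpt does not spell out how the word list is applied. One common approach, assumed here purely for illustration, is to drop any document that contains a listed word. The sketch below uses a hypothetical two-entry `BLOCKLIST` as a stand-in for the real list, and the function names are made up for this example.

```python
import re

# Hypothetical stand-in for the word list mentioned in the talk;
# the real list is not reproduced here.
BLOCKLIST = {"badword", "worseword"}


def contains_blocked_word(text, blocklist):
    """Return True if any whole word in `text` appears in `blocklist`."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in blocklist for token in tokens)


def filter_documents(docs, blocklist):
    """Keep only the documents that contain no blocked word (assumed policy)."""
    return [doc for doc in docs if not contains_blocked_word(doc, blocklist)]


docs = [
    "A clean sentence.",
    "This has a BadWord in it.",
    "badwords is a different token",
]
kept = filter_documents(docs, BLOCKLIST)
```

Under this assumed policy a single match removes the whole document, whatever the context the word appears in, so the filter cannot tell pornography apart from, say, a medical or educational text that uses the same word.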