
Feature Processing for Text Analytics

Linear Digressions


Getting Rid of Stop Words in Text Processing

So if you want to represent your entire data set, what you're going to have is all of these thousand-dimensional vectors times however many documents you have in your corpus. So it'll be a thousand by 20, or by 50, or by a hundred, or by a million, however many documents you have. You're going to just make a matrix where you stack all those vectors on top of each other and stick them into your machine learning algorithm. And the zeros go back to nothing. If you want to go the other way, you can't necessarily reconstruct the entire document, but you can at the very least reconstruct, of all the possible words, which ones are in which documents - that
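As a rough illustration of the stacking described above, here is a minimal Python sketch. The four-word vocabulary, the three toy documents, and the helper names `to_vector` and `to_words` are all hypothetical, and the sketch assumes each vector entry is a simple 1/0 flag for whether a word appears (the excerpt does not say whether the episode uses flags or counts). A real vocabulary would have something like a thousand entries rather than four.

```python
# Hypothetical toy vocabulary and corpus, for illustration only.
vocabulary = ["cat", "dog", "fish", "bird"]
documents = [
    "the cat chased the dog",
    "a bird watched the fish",
    "the dog barked",
]


def to_vector(text, vocab):
    """Return a 0/1 vector: 1 if the vocabulary word occurs in the text."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocab]


def to_words(vector, vocab):
    """Go the other way: recover which vocabulary words were present."""
    return [word for word, flag in zip(vocab, vector) if flag]


# Stack one vector per document; each row has one entry per vocabulary word.
matrix = [to_vector(doc, vocabulary) for doc in documents]

for row in matrix:
    # Word order and repeats are lost, so only word membership comes back.
    print(row, to_words(row, vocabulary))
```

Here each row is one document and each column is one vocabulary word. Going backwards recovers only which words occurred, not the original sentence, which matches the point that the full document cannot necessarily be reconstructed.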
