Feature Processing for Text Analytics

Linear Digressions

Using Bigrams and Trigrams in Machine Learning

Bag of words is unaware of anything about order or relationships between the words, except to say that maybe they occur in documents together more often. So one of the other things that you can do when you're representing text data is you can form what are called n-grams. And n-grams, where n can be one, so one-grams, I guess those are unigrams, that's like a single word. Bigrams are pairs of words that occur next to each other. Trigrams are three words in a row, and so on. Usually you don't go higher than maybe four- or five-grams in practice. But the idea there is now, instead of encoding words,
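
To make the idea of unigrams, bigrams, and trigrams concrete, here is a minimal sketch in plain Python. It is an illustration and not something from the episode: the helper names `ngrams` and `ngram_counts`, the whitespace tokenization, and the example sentence are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=3):
    """Count all n-grams from n=1 up to max_n in a piece of text.

    Tokenization here is an assumption: lowercase, then split on whitespace.
    Real pipelines usually also handle punctuation and stop words.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Example: unigrams and bigrams for a short sentence.
counts = ngram_counts("the cat sat on the mat", max_n=2)
for gram, count in counts.most_common(3):
    print(" ".join(gram), count)
```

Each n-gram becomes a feature in the same way a single word does in bag of words, so a bigram such as "the cat" keeps a little of the word order that a unigram count throws away. The number of distinct features grows quickly as n increases, which is one reason n rarely goes past four or five in practice.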

Play episode from 06:43
