k-means clustering

Data Skeptic

How Many Clusters Should I Have?

K-means clustering. If you select a larger k and therefore use more clusters, those clusters will fit the data more closely. But at some point they begin to overfit the data. To decide how many clusters to use, most people will recommend the elbow method. It's based on two variables: the number of clusters, and the average distance of all points in a cluster from their associated centroid. We assume that mathematical value will be very large. And as a result, we have a nice arithmetic way that we can score ourselves between minus one and one.
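As a rough illustration of the elbow method mentioned above, here is a minimal sketch in plain Python. Nothing in it comes from the episode: the synthetic three-blob data, the helper names (`mean_point`, `kmeans`, `average_distance`), the fixed random seed, and the range of k values are all assumptions made for the example. It runs a basic k-means for several values of k and records the average distance from each point to its assigned centroid.

```python
import math
import random


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def kmeans(points, k, rng, max_iter=100):
    """Basic Lloyd-style k-means; returns (centroids, labels)."""
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels


def average_distance(points, centroids, labels):
    """Average distance from each point to its assigned centroid."""
    total = sum(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return total / len(points)


# Hypothetical data: three well-separated blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(30)
]

# Record the average distance for each candidate number of clusters.
scores = {}
for k in range(1, 7):
    centroids, labels = kmeans(points, k, rng)
    scores[k] = average_distance(points, centroids, labels)
    print(k, round(scores[k], 3))
```

Plotting `scores` against k and looking for the point where the curve stops dropping sharply is the "elbow". On data like this one would expect the bend near k = 3, although a single random start can land in a poor local optimum, so practical implementations usually repeat the run several times per k.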
