
k-means clustering
Data Skeptic
How Many Clusters Should I Have?
K-means clustering. If you select a larger k and therefore use more clusters, those clusters will fit the data better, but at some point they begin to overfit it. To decide how many clusters to use, most people will recommend the elbow method. It's based on two variables: the number of clusters k and the average distance of all points in a cluster from their associated centroid. We assume that mathematical value will be very large. And as a result, we have a nice arithmetic way that we can score ourselves between minus one and one.
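As a rough illustration of the elbow method described above, here is a minimal pure-Python sketch. It is not taken from the episode. The synthetic three-blob data, the helper names (`assign`, `farthest_first`, `kmeans`, `average_distance`, `make_blobs`), the deterministic farthest-first seeding, and the "largest change in slope" rule for locating the bend are all assumptions made for this example. In practice the elbow is often just read off a plot by eye.

```python
import math
import random


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def average_distance(points, centroids, labels):
    """Average distance of every point to its associated centroid."""
    return sum(math.dist(p, centroids[lab])
               for p, lab in zip(points, labels)) / len(points)


def make_blobs(centers, n_per_blob=50, spread=0.5, seed=1):
    """Hypothetical test data: Gaussian blobs around the given 2-D centers."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_blob)]


points = make_blobs([(0, 0), (10, 0), (5, 9)])
ks = list(range(1, 7))
curve = {k: average_distance(points, *kmeans(points, k)) for k in ks}

for k in ks:
    print(k, round(curve[k], 3))

# Crude elbow rule (an assumption): the k where the improvement from adding
# a cluster falls off most sharply, i.e. the largest change in slope.
elbow = max(ks[1:-1],
            key=lambda k: (curve[k - 1] - curve[k]) - (curve[k] - curve[k + 1]))
print("elbow at k =", elbow)
```

The average distance keeps shrinking as k grows, which is the "more clusters fit better" effect. The elbow is where the shrinking stops paying off. With three well-separated blobs, the bend should land at k = 3.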