What is K-Means?
K-Means is an unsupervised clustering algorithm that partitions data into K groups by iteratively assigning points to the nearest centroid and updating centroids to the mean of their assigned points.
How to Use
- Click the canvas to add data points
- Choose a dataset preset to explore patterns
- Set K to the desired number of clusters
- Press Play to animate K-Means step by step
- Press Step to advance one iteration at a time
- Toggle Voronoi to see cluster regions
- Toggle History to see centroid movement trails
Custom Mode
Select the "Custom" dataset to draw your own data points. Switch between Add and Delete modes to build or edit your dataset.
K-Means Steps
- Initialize: Randomly place K centroids in the data space
- Assign: Assign each point to the nearest centroid (Euclidean distance)
- Update: Move each centroid to the mean position of its assigned points
- Repeat: Alternate between the assign and update steps until the centroids stop moving (convergence) or the maximum number of iterations is reached
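The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this page: the function name `kmeans`, the parameters `max_iters`, `tol`, and `seed`, the restriction to 2D points, and the choice to leave an empty cluster's centroid where it was are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iters=100, tol=1e-3, seed=0):
    """Cluster a list of (x, y) tuples; returns (centroids, labels, iterations)."""
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Initialize: K random positions inside the bounding box of the data.
    centroids = [(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
                 for _ in range(k)]
    labels = [0] * len(points)
    iteration = 0
    for iteration in range(1, max_iters + 1):
        # Assign: each point goes to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Repeat: stop once the largest centroid movement is below the tolerance.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels, iteration

data = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (9.0, 9.0), (10.0, 9.5), (9.5, 10.0)]
centroids, labels, iterations = kmeans(data, k=2)
print(centroids, labels, iterations)
```

Changing `seed` changes the starting centroids, so the printed result can differ from run to run in the same way the animation does.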
Initialization
Centroids are initialized by selecting K random positions within the data range. Different initializations can lead to different final clusters.
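As a rough sketch of this kind of initialization, the snippet below draws K positions uniformly from the bounding box of the data. The helper name `init_centroids` and its `seed` parameter are hypothetical, chosen only to make the effect of different random starts easy to reproduce.

```python
import random

def init_centroids(points, k, seed=None):
    """Return k random (x, y) positions inside the bounding box of the points."""
    rng = random.Random(seed)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
            for _ in range(k)]

data = [(1.0, 2.0), (4.0, 8.0), (6.0, 3.0), (9.0, 7.0)]
# Two different seeds give two different starting configurations,
# which is why repeated runs can end in different final clusters.
print(init_centroids(data, k=3, seed=1))
print(init_centroids(data, k=3, seed=2))
```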
Convergence
The algorithm is considered converged when every centroid moves less than a small threshold between consecutive iterations (typically 0.001).
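One way to express this check is to compare the centroids before and after an update. The function name `has_converged` and the default `tol=1e-3` are assumptions for illustration; the threshold is measured in the same units as the point coordinates.

```python
import math

def has_converged(old_centroids, new_centroids, tol=1e-3):
    """True when every centroid moved less than tol since the previous iteration."""
    return all(math.dist(old, new) < tol
               for old, new in zip(old_centroids, new_centroids))

before = [(1.0, 1.0), (5.0, 5.0)]
barely_moved = [(1.0, 1.0004), (5.0, 5.0)]
still_moving = [(1.0, 1.0), (5.2, 5.0)]
print(has_converged(before, barely_moved))  # True
print(has_converged(before, still_moving))  # False
```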
Objective Function
$$J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2$$
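J is the within-cluster sum of squares (WCSS), also called inertia. K-Means minimizes this quantity: neither the assign step nor the update step can increase J, so the algorithm settles at a local minimum, which is not guaranteed to be the global one.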
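A small worked example may help make the objective concrete. The sketch below computes the sum of squared distances from each point to its assigned centroid; the function name `inertia` and the sample data are made up for illustration.

```python
def inertia(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum((px - centroids[lab][0]) ** 2 + (py - centroids[lab][1]) ** 2
               for (px, py), lab in zip(points, labels))

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]

# Centroids at the cluster means: every point is 1 unit away, so J = 1 + 1 + 1 + 1.
print(inertia(points, [(1.0, 0.0), (11.0, 0.0)], labels))  # 4.0

# Moving the first centroid off its mean raises J to 0 + 4 + 1 + 1.
print(inertia(points, [(0.0, 0.0), (11.0, 0.0)], labels))  # 6.0
```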
Assignment Rule
$$C_k = \{\, x_i : \lVert x_i - \mu_k \rVert \le \lVert x_i - \mu_j \rVert \;\; \forall j \,\}$$
Each point is assigned to the cluster with the nearest centroid.
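The assignment rule can be illustrated with a short function that returns, for each point, the index of its nearest centroid. The name `assign` is hypothetical. Note one assumption: when a point is equally close to two centroids, this sketch picks the lower cluster index, because `min` returns the first minimum it finds.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of the nearest centroid (ties go to the lower index)."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]

points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (2.0, 0.0)]
centroids = [(0.0, 0.0), (4.0, 0.0)]
# The last point is 2 units from both centroids, so it lands in cluster 0.
print(assign(points, centroids))  # [0, 0, 1, 0]
```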
Update Rule
$$\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i$$
Each centroid moves to the mean of its assigned points.
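The update rule amounts to a per-cluster average, as in the sketch below. The function name `update` is hypothetical, and the handling of a cluster with no points (where the mean is undefined) is an assumption: here the centroid simply stays where it was.

```python
def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            mean_x = sum(x for x, _ in members) / len(members)
            mean_y = sum(y for _, y in members) / len(members)
            new_centroids.append((mean_x, mean_y))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 4.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]
old_centroids = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]
# Cluster 2 has no points, so its centroid is unchanged.
print(update(points, labels, old_centroids))  # [(1.0, 0.0), (11.0, 2.0), (7.0, 7.0)]
```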
Cluster Metrics
| Points | 0 |
| K | 3 |
| Iteration | - |
| Inertia (WCSS) | - |
| Converged | - |
| Status | Add points to begin |