Clustering Space

What is K-Means?

K-Means is an unsupervised clustering algorithm that partitions data into K groups by iteratively assigning points to the nearest centroid and updating centroids to the mean of their assigned points.

How to Use

  • Click the canvas to add data points
  • Choose a dataset preset to explore patterns
  • Set K to the desired number of clusters
  • Press Play to animate K-Means step by step
  • Press Step to advance one iteration at a time
  • Toggle Voronoi to see cluster regions
  • Toggle History to see centroid movement trails

Custom Mode

Select the "Custom" dataset to draw your own data points. Switch between Add and Delete modes to place or remove points as you build the dataset.

K-Means Steps

  1. Initialize: Randomly place K centroids in the data space
  2. Assign: Assign each point to the nearest centroid (Euclidean distance)
  3. Update: Move each centroid to the mean position of its assigned points
  4. Repeat: Alternate between the assign and update steps until the centroids stop moving (convergence) or the maximum number of iterations is reached
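
The four steps above can be sketched as a single loop. This is a minimal illustration in plain Python, not the code behind this page. The function name kmeans, the parameters max_iters, tol, and seed, and the choice to leave an empty cluster's centroid where it was are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iters=100, tol=1e-3, seed=None):
    """Cluster a list of (x, y) tuples; returns (centroids, labels, iterations)."""
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # 1. Initialize: K random positions inside the bounding box of the data.
    centroids = [
        (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        for _ in range(k)
    ]
    labels = [0] * len(points)
    iteration = 0
    for iteration in range(1, max_iters + 1):
        # 2. Assign: each point goes to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # 3. Update: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                n = len(members)
                new_centroids.append(
                    (sum(m[0] for m in members) / n, sum(m[1] for m in members) / n)
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # 4. Repeat: stop once the largest centroid movement falls below tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels, iteration
```

Calling kmeans(points, 3, seed=0) twice gives the same result because the seed fixes the random start. Changing the seed may give a different clustering of the same data.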

Initialization

Centroids are initialized by selecting K random positions within the data range. Because K-Means only converges to a local minimum of its objective, different initializations can lead to different final clusters.
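
As a rough sketch of this kind of initialization, the snippet below draws K positions uniformly from the bounding box of the data. The helper name init_centroids and its seed parameter are hypothetical. The page does not say whether it samples from the bounding box or in some other way.

```python
import random


def init_centroids(points, k, seed=None):
    """Pick k random (x, y) positions inside the bounding box of the data."""
    rng = random.Random(seed)  # a fixed seed makes the start reproducible
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [
        (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        for _ in range(k)
    ]


points = [(1.0, 2.0), (4.0, 6.0), (9.0, 3.0)]
# Two different seeds give two different starting configurations.
print(init_centroids(points, 3, seed=0))
print(init_centroids(points, 3, seed=1))
```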

Convergence

The algorithm is considered converged when no centroid moves more than a small threshold between consecutive iterations (typically < 0.001).
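
One way to express this check, assuming centroids are stored as (x, y) tuples and movement is measured as Euclidean distance, is shown below. The name has_converged and the default tol value are illustrative choices.

```python
import math


def has_converged(old_centroids, new_centroids, tol=1e-3):
    """True when every centroid moved less than tol since the last iteration."""
    return all(
        math.dist(old, new) < tol
        for old, new in zip(old_centroids, new_centroids)
    )


print(has_converged([(0.0, 0.0)], [(0.0005, 0.0)]))  # True
print(has_converged([(0.0, 0.0)], [(0.01, 0.0)]))    # False
```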

Objective Function

J = Σ_{k=1}^{K} Σ_{x_i ∈ C_k} ||x_i − μ_k||²

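J is the within-cluster sum of squares (WCSS), also called inertia. Each assign and update step can only lower J or leave it unchanged, so K-Means converges to a local minimum of this quantity, which is not necessarily the global minimum.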
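
To make the objective concrete, here is a small sketch that computes it for 2D points. It assumes labels[i] holds the index of the centroid that point i is assigned to. The function name inertia is chosen to match the metric label and is not taken from the page's source.

```python
def inertia(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for (x, y), lab in zip(points, labels):
        cx, cy = centroids[lab]
        total += (x - cx) ** 2 + (y - cy) ** 2
    return total


points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
centroids = [(1.0, 0.0), (11.0, 0.0)]
labels = [0, 0, 1, 1]
# Every point is exactly 1 unit from its centroid, so J = 4 * 1² = 4.
print(inertia(points, centroids, labels))  # 4.0
```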

Assignment Rule

C_k = {x_i : ||x_i − μ_k|| ≤ ||x_i − μ_j|| for all j}

Each point is assigned to the cluster with the nearest centroid.
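
A short sketch of the assignment rule follows. The function name assign is illustrative. The rule as written does not say how to break ties, so the example assumes that a point equidistant from two centroids goes to the one with the lower index.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        # min() keeps the first minimum, so ties go to the lower index.
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


points = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (5.0, 5.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
# (5, 5) is equally far from both centroids and lands in cluster 0.
print(assign(points, centroids))  # [0, 0, 1, 0]
```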

Update Rule

μ_k = (1/|C_k|) Σ_{x_i ∈ C_k} x_i

Each centroid moves to the mean of its assigned points.
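
The update rule can be sketched as below. The formula is undefined when a cluster has no points (|C_k| = 0), so this example assumes such a centroid simply stays where it was. Other implementations re-seed it instead. The function name update is hypothetical.

```python
def update(points, labels, old_centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(old_centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        n = len(members)
        new_centroids.append(
            (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)
        )
    return new_centroids


points = [(1.0, 1.0), (3.0, 1.0), (8.0, 9.0)]
labels = [0, 0, 1]
old = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]  # the third cluster is empty
print(update(points, labels, old))  # [(2.0, 1.0), (8.0, 9.0), (5.0, 5.0)]
```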
