[Interactive visualization: Distribution Comparison, MNIST Samples, and Training Timeline panels; loss readout L = 2.302 at Epoch 0/25]

The Formula

L = -Σᵢ yᵢ · log(pᵢ)

For one-hot labels (single true class k):

L = -log(pₖ)
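
As a concrete reference, here is a minimal NumPy sketch of both forms. The function and variable names (cross_entropy, p, y, k) are illustrative and not taken from the demo's code.

    import numpy as np

    def cross_entropy(y, p, eps=1e-12):
        # General form: L = -sum_i y_i * log(p_i); eps guards against log(0)
        p = np.clip(p, eps, 1.0)
        return -np.sum(y * np.log(p))

    def cross_entropy_onehot(p, k, eps=1e-12):
        # One-hot shortcut: only the true class k contributes, so L = -log(p_k)
        return -np.log(max(p[k], eps))

    # Example: a 10-class prediction with true class k = 3
    p = np.array([0.02, 0.03, 0.05, 0.70, 0.05, 0.05, 0.03, 0.03, 0.02, 0.02])
    y = np.eye(10)[3]                      # one-hot label for class 3
    print(cross_entropy(y, p))             # ~0.357
    print(cross_entropy_onehot(p, 3))      # same value: -log(0.70)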

Intuition

  • pₖ = 1: L = 0 (perfect prediction)
  • pₖ = 0.5: L ≈ 0.69
  • pₖ = 0.1: L ≈ 2.30
  • pₖ → 0: L → ∞
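
These reference points are just -log(pₖ) evaluated directly; a quick check in plain Python (illustrative, not part of the demo):

    import math

    # p_k = 1 gives exactly L = 0; for smaller probabilities the loss grows:
    for p_k in (0.5, 0.1, 0.01, 1e-9):
        print(f"p_k = {p_k:g}  ->  L = {-math.log(p_k):.3f}")
    # 0.5  -> 0.693
    # 0.1  -> 2.303
    # 0.01 -> 4.605
    # 1e-9 -> 20.723  (L keeps growing toward infinity as p_k -> 0)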


Cross-Entropy Loss

Cross-entropy measures the difference between two probability distributions: the expected distribution (the true labels) and the predicted distribution (the model's output).
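
During training (as in the epoch timeline above), the displayed loss is presumably the average cross-entropy over a batch of examples; that is an assumption about the demo, but the standard batched form looks like this sketch:

    import numpy as np

    def batch_cross_entropy(probs, labels, eps=1e-12):
        # probs:  (batch, n_classes) predicted probabilities, rows sum to 1
        # labels: (batch,) integer class indices
        p_true = probs[np.arange(len(labels)), labels]   # p_k for each example
        return -np.mean(np.log(np.clip(p_true, eps, 1.0)))

    # Two 10-class examples: one confident & right, one confident & wrong
    probs = np.array([
        [0.01, 0.01, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01],
        [0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01],
    ])
    labels = np.array([2, 2])                  # true class is 2 for both
    print(batch_cross_entropy(probs, labels))  # ~2.36 = (0.105 + 4.605) / 2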

Key Insights

  • Lower is better - a loss of 0 means a perfect prediction
  • Confident & wrong - assigning near-zero probability to the true class gives the highest loss
  • Uniform prediction - loss = log(n) for n classes, as verified below
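
The last point is easy to verify: a uniform prediction assigns pₖ = 1/n to the true class, so L = -log(1/n) = log(n). For the 10 MNIST digits that is log(10) ≈ 2.30, which is roughly where an untrained network starts. A quick check (illustrative code, not the demo's):

    import numpy as np

    n = 10                              # number of classes (MNIST digits)
    uniform = np.full(n, 1.0 / n)       # p_i = 1/n for every class
    print(-np.log(uniform[0]))          # ~2.3026
    print(np.log(n))                    # same value: log(10)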

How to Use

  • Training mode: watch predictions evolve during training
  • Click a digit to see a different sample
  • The play button animates through the epochs
  • Manual mode: drag the probability bars to explore the loss