[Interactive demo panels: Distribution Comparison (current loss, true label, correct/incorrect coloring, predicted class), MNIST Samples, Training Timeline (Epoch 0/25).]
The Formula
L = -Σ_i y_i · log(p_i)
where y_i is the true probability of class i, p_i is the model's predicted probability, and log is the natural logarithm.
For one-hot labels (single true class k), only the k-th term survives:
L = -log(p_k)
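As a concrete sketch of the formula (assuming NumPy; the function name `cross_entropy` and the example vectors are illustrative, not taken from the demo's code):

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    """L = -sum_i y_i * log(p_i); eps guards against log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(y * np.log(p))

# One-hot label: the true class is digit 3.
y = np.zeros(10)
y[3] = 1.0

# A prediction that puts probability 0.5 on the true class
# and spreads the rest evenly over the other nine digits.
p = np.full(10, 0.5 / 9)
p[3] = 0.5

print(cross_entropy(y, p))  # ~0.69
print(-np.log(p[3]))        # same value: only the true-class term contributes
```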
Intuition
- p_k = 1: L = 0 (perfect)
- p_k = 0.5: L = 0.69
- p_k = 0.1: L = 2.30
- p_k → 0: L → ∞
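These values are easy to verify numerically (plain Python; a sketch, not the demo's code):

```python
import math

# p_k is the probability the model assigns to the true class.
# Note: -log(p_k) == log(1/p_k).
for p_k in [1.0, 0.5, 0.1, 1e-6]:
    print(p_k, round(math.log(1 / p_k), 2))
# 1.0 0.0
# 0.5 0.69
# 0.1 2.3
# 1e-06 13.82  <- grows without bound as p_k approaches 0
```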
Cross-Entropy Loss
Cross-entropy measures the difference between two probability distributions: the true distribution (the labels) and the predicted distribution (the model's output).
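In practice, deep-learning frameworks provide this loss directly. A minimal sketch using PyTorch (assuming it is installed; the demo itself may compute the loss differently). Note that nn.CrossEntropyLoss takes raw logits, not probabilities, and applies log-softmax internally:

```python
import torch
import torch.nn as nn

# One sample, 10 classes (the MNIST digits). Raw, unnormalized scores.
logits = torch.tensor([[1.2, 0.3, -0.8, 2.5, 0.1, -1.0, 0.4, 0.0, -0.5, 0.7]])
target = torch.tensor([3])  # the true class is digit 3

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)

# Equivalent manual computation: L = -log(p_k)
probs = torch.softmax(logits, dim=1)
manual = -torch.log(probs[0, 3])
print(loss.item(), manual.item())  # the two values agree
```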
Key Insights
- Lower is better - Loss of 0 means perfect prediction
- Confident & wrong - Highest loss scenario
- Uniform prediction - Loss = log(n) for n classes
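A quick numeric check of the last two points (plain Python; n = 10 corresponds to MNIST's ten digit classes, which is why an untrained 10-class classifier typically starts near a loss of about 2.3):

```python
import math

n = 10  # number of classes

# Uniform prediction: every class gets probability 1/n.
print(math.log(n))       # ≈ 2.302 = -log(1/10)

# Confident & wrong: nearly all probability mass on an incorrect class,
# leaving only 0.001 for the true one.
print(-math.log(0.001))  # ≈ 6.91, far higher than the uniform case
```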
How to Use
- Training mode: Watch predictions evolve during training
- Click digits to see different samples
- Play button animates through epochs
- Manual mode: Drag bars to explore loss