What is Naive Bayes?

Naive Bayes is a probabilistic classifier based on Bayes' theorem. It predicts the class of an input by computing the posterior probability of each class given the observed features.

Why "Naive"?

The algorithm is called "naive" because it assumes that all features (words) are conditionally independent given the class. While this assumption is rarely true in practice, the classifier often performs surprisingly well.

How to Use

  • Type a message and click Classify to see the prediction
  • Examine the word probability table to see per-word contributions
  • Adjust α to see the effect of Laplace smoothing
  • Add training data to improve the classifier
  • Toggle log probabilities to see the math behind the scenes

Training Phase

  1. Count the number of spam and ham messages
  2. Compute prior probabilities: P(Spam), P(Ham)
  3. Tokenize each message into words
  4. Count word frequencies per class
  5. Compute P(word|class) with Laplace smoothing (see the code sketch below)
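
A minimal sketch of these training steps in Python, assuming a bare whitespace tokenizer and hypothetical names (tokenize, train); the demo's actual implementation may differ:

  from collections import Counter

  def tokenize(message):
      # Deliberately simple tokenizer: lowercase and split on whitespace
      return message.lower().split()

  def train(messages, labels, alpha=1.0):
      # Steps 1-2: count messages per class and compute prior probabilities
      class_counts = Counter(labels)
      total_messages = sum(class_counts.values())
      priors = {c: n / total_messages for c, n in class_counts.items()}

      # Steps 3-4: tokenize each message and count word frequencies per class
      word_counts = {c: Counter() for c in class_counts}
      for message, label in zip(messages, labels):
          word_counts[label].update(tokenize(message))

      # Step 5: the smoothed P(word|class) is evaluated at classification time
      # from these counts, the shared vocabulary, and alpha
      vocab = {w for counts in word_counts.values() for w in counts}
      return priors, word_counts, vocab, alpha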

Classification Phase

  1. Tokenize the input message
  2. For each class, compute log P(class) + Σ log P(word|class)
  3. The class with the highest score wins
  4. Convert log-scores to probabilities via softmax (see the sketch below)
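
Continuing the hypothetical sketch above (it reuses tokenize and the values returned by train), the classification phase might look like this:

  import math

  def classify(message, priors, word_counts, vocab, alpha=1.0):
      words = tokenize(message)                     # step 1: tokenize the input
      log_scores = {}
      for c, prior in priors.items():
          total_c = sum(word_counts[c].values())    # total word tokens in class c
          score = math.log(prior)                   # step 2: log P(class)
          for w in words:
              # Laplace-smoothed P(word|class); unseen words still get probability mass
              p = (word_counts[c][w] + alpha) / (total_c + alpha * len(vocab))
              score += math.log(p)
          log_scores[c] = score

      # Step 3: the class with the highest log-score wins
      prediction = max(log_scores, key=log_scores.get)

      # Step 4: convert log-scores to probabilities via a numerically stable softmax
      m = max(log_scores.values())
      exps = {c: math.exp(s - m) for c, s in log_scores.items()}
      z = sum(exps.values())
      probs = {c: v / z for c, v in exps.items()}
      return prediction, probs

  # Example usage with two tiny made-up training messages
  priors, counts, vocab, alpha = train(
      ["win free money now", "meeting at noon tomorrow"], ["spam", "ham"])
  print(classify("free money", priors, counts, vocab, alpha))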

Bayes' Theorem

P(C|x) = P(x|C) · P(C) / P(x)

Where C is the class and x is the feature vector (words).
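
As a purely illustrative example with made-up numbers: if P(Spam) = 0.4, P(Ham) = 0.6, P(x|Spam) = 0.01, and P(x|Ham) = 0.002, then P(x) = 0.4·0.01 + 0.6·0.002 = 0.0052, so P(Spam|x) = 0.004 / 0.0052 ≈ 0.77.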

Naive Independence Assumption

P(x₁,x₂,...,xₙ|C) = Π P(xᵢ|C)

Each word is assumed independent given the class.
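
For instance, for a two-word message the assumption gives P("free money"|Spam) = P(free|Spam) · P(money|Spam).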

Laplace Smoothing

P(w|C) = (count(w,C) + α) / (total(C) + α·|V|)

α prevents zero probabilities for unseen words. Here count(w,C) is the number of times word w appears in messages of class C, total(C) is the total number of word tokens in class C, and |V| is the vocabulary size. When α = 1 this is called add-one smoothing.
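
As a worked example with hypothetical counts: if "free" appears 3 times in spam, spam messages contain 100 word tokens in total, |V| = 50, and α = 1, then P(free|Spam) = (3 + 1) / (100 + 1·50) = 4/150 ≈ 0.027, while a word never seen in spam still receives (0 + 1)/150 ≈ 0.0067 rather than zero.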

Log-Space Computation

log P(C|x) = log P(C) + Σ log P(xᵢ|C) − log P(x)

We work in log-space to avoid floating-point underflow from multiplying many small probabilities. Since log P(x) is the same for every class, it can be dropped when comparing classes, which gives the score used in the classification phase.
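
A small self-contained sketch of the underflow problem, using made-up per-word probabilities:

  import math

  # Hypothetical per-word probabilities for a 500-word message
  word_probs = [1e-4] * 500

  product = 1.0
  for p in word_probs:
      product *= p                                   # underflows to 0.0 partway through

  log_score = sum(math.log(p) for p in word_probs)   # stays finite

  print(product)      # 0.0
  print(log_score)    # roughly -4605.2, i.e. 500 * log(1e-4)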
