Markov Babbler
A Markov babbler generates text by learning word-level transition probabilities from a corpus. Given the last n words (the context), it randomly selects the next word weighted by how often that transition appeared in the training text.
Low-order models (bigram) produce chaotic but diverse text. Higher-order models (trigram, 4-gram) produce more coherent phrases but tend to reproduce the source verbatim and hit dead ends more often.
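As a toy illustration (a minimal Python sketch, not the demo's own code), here is the order-1 transition table such a model would learn from a six-word corpus:

from collections import Counter, defaultdict

tokens = "the cat sat on the mat".split()

transitions = defaultdict(Counter)          # context (1 word) -> next-word counts
for prev, nxt in zip(tokens, tokens[1:]):
    transitions[(prev,)][nxt] += 1

print(dict(transitions))
# After "the", the next word is "cat" or "mat", each with probability 1/2;
# every other context has exactly one possible continuation.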
How to Use
- Select a corpus or paste your own text
- Adjust n-gram order to see how context length affects output
- Generate to watch text build word by word
- Step to advance one word at a time
- The context graph shows candidate next words and their probabilities
- When the model hits a dead end, it backs off to a shorter context
N-gram Text Generation
- Tokenize the corpus into words
- Build a table mapping each n-word context to observed next words
- Pick a starting context
- Sample the next word from the context's distribution
- Slide the context window forward and repeat
function buildModel(corpus, order):
    tokens = tokenize(corpus)
    model = {}                              // context -> {next word -> count}
    for i = 0 to len(tokens) - order - 1:
        context = tokens[i .. i+order]      // the `order` words starting at i
        next = tokens[i + order]
        model[context][next] += 1           // create the entry at 0 if missing
    return model
function generate(model, order, maxWords):
    context = randomStart(model)            // a random context seen in training
    output = [...context]
    for i = 1 to maxWords:
        candidates = model[context]
        if candidates is empty:             // dead end
            context = backoff(context)      // fall back to a shorter context
            candidates = model[context]     // re-query with the shorter context
        next = sampleWeighted(candidates)
        output.append(next)
        context = last `order` words of output
    return output
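For concreteness, here is a small runnable Python sketch of the same pipeline. The helper names (build_model, sample_weighted, generate) and the whitespace tokenization are this sketch's own assumptions, and a dead end here simply restarts from a random context; the backoff strategy the demo uses is sketched in the next section.

import random
from collections import defaultdict, Counter

def build_model(corpus, order):
    """Map each tuple of `order` consecutive words to a Counter of next words."""
    tokens = corpus.split()
    model = defaultdict(Counter)
    for i in range(len(tokens) - order):
        model[tuple(tokens[i:i + order])][tokens[i + order]] += 1
    return model

def sample_weighted(counter):
    """Draw one word with probability proportional to its observed count."""
    words, counts = zip(*counter.items())
    return random.choices(words, weights=counts, k=1)[0]

def generate(model, order, max_words):
    context = random.choice(list(model.keys()))
    output = list(context)
    while len(output) < max_words:
        candidates = model.get(tuple(output[-order:]))
        if not candidates:                          # dead end: restart at random
            output.extend(random.choice(list(model.keys())))
            continue
        output.append(sample_weighted(candidates))
    return " ".join(output)

print(generate(build_model("the cat sat on the mat and the cat slept", 1), 1, 12))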
Backoff
When no continuation is found for the full context, the model backs off to a shorter context (e.g., from trigram to bigram). This prevents dead ends while producing less constrained output.
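A minimal sketch of that idea, assuming one transition table per order has been built with the build_model helper from the sketch above (the function names here are illustrative, not the demo's API):

def build_all_orders(corpus, max_order):
    """One transition table per order, from 1 up to max_order."""
    return {n: build_model(corpus, n) for n in range(1, max_order + 1)}

def candidates_with_backoff(models, context):
    """Try the full context first, then progressively shorter suffixes of it."""
    for n in range(len(context), 0, -1):
        counter = models[n].get(tuple(context[-n:]))
        if counter:
            return counter
    return None    # no continuation found even for a single-word context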
Conditional Probability
P(w_n \mid w_{n-k} \ldots w_{n-1}) = \frac{\mathrm{count}(w_{n-k} \ldots w_n)}{\mathrm{count}(w_{n-k} \ldots w_{n-1})}
The probability of the next word given the context is estimated by maximum likelihood: the ratio of n-gram counts to (n-1)-gram counts.
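In code this estimate falls straight out of the stored counts; a sketch against the defaultdict-of-Counter model above, where the total of a context's Counter stands in for the (n-1)-gram count:

def conditional_prob(model, context, word):
    """MLE estimate: count(context followed by word) / count(context)."""
    counter = model.get(tuple(context))
    if not counter:
        return 0.0
    return counter[word] / sum(counter.values())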
Entropy
H = -\sum_{w} P(w \mid \mathrm{ctx}) \log_2 P(w \mid \mathrm{ctx})
Entropy measures the uncertainty of the next word. High entropy means many roughly equally likely candidates; low entropy means the model is very confident about what comes next.
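A sketch of the per-context entropy, computed over the same Counter of candidate continuations:

import math

def entropy(counter):
    """Shannon entropy, in bits, of the next-word distribution for one context."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())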
Perplexity
PP = 2^{H}
Perplexity is 2 raised to the entropy, roughly the "effective number of choices" at each step. A perplexity of 10 means the model is as uncertain as if it were choosing uniformly among 10 words.
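Perplexity is then a one-liner on top of the entropy sketch; the uniform ten-candidate case below reproduces the interpretation given above:

from collections import Counter

def perplexity(counter):
    """Effective number of choices: 2 raised to the entropy."""
    return 2 ** entropy(counter)

uniform = Counter({f"word{i}": 1 for i in range(10)})    # 10 equally likely candidates
print(perplexity(uniform))                               # ~10.0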
Model Metrics
| Metric | Value | Metric | Value |
| --- | --- | --- | --- |
| Corpus | - | Vocabulary | - |
| N-gram Order | 2 | Context | - |
| Candidates | - | Entropy | - |
| Generated | 0 | Status | Ready |
Next Word Candidates
Generate text to see candidate probabilities.