Frequency Analysis
Paste any text to see how often each letter appears, compared side by side with the frequencies of written English. Read the bigram and trigram counts, check the Index of Coincidence to tell a monoalphabetic cipher from a polyalphabetic one, and export the table. Everything runs in your browser.
How to use Frequency Analysis
1. Paste your text
Copy the text or ciphertext you want to study and paste it into the box. Letters are counted without regard to case, and spaces, numbers and punctuation are ignored.
2. Read the summary
Check the character and letter counts, how many distinct letters appear, the most frequent letter, and the Index of Coincidence, which hints at whether one alphabet or several were used.
3. Study the letter-frequency chart
Compare each letter's bar against its English marker. Switch to 'By frequency' to rank the letters and see the overall shape: lumpy for a monoalphabetic substitution cipher, flat for a polyalphabetic one.
4. Scan the bigrams and trigrams
Look at the most common pairs and triples. In a monoalphabetic cipher, the top trigram is often a disguised THE, which hands you three letters at once.
5. Export or share
Download the frequency table as a CSV for your notes or spreadsheet, or copy a shareable link that reopens the tool with your exact text. Everything stays in your browser.
Letter frequency analysis, explained
What is frequency analysis?
Frequency analysis is the study of how often each letter, pair of letters, or triple of letters appears in a piece of text. Because the letters of a language are not used equally (E and T are everywhere in English while Q and Z are rare), the pattern of frequencies acts like a fingerprint. Counting that pattern is one of the oldest techniques in classical cryptanalysis and still the first one to reach for; the earliest known written description is by the Arab scholar al-Kindi in the ninth century.
This tool counts the letters in whatever you paste, shows each one as a bar next to the expected English frequency, lists the most common bigrams and trigrams, and reports the Index of Coincidence. Together these numbers tell you whether the text is ordinary writing, a simple substitution cipher, or something that uses several alphabets at once — without you having to count a single letter by hand.
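The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own code: the function name `letter_frequencies` is hypothetical, and it assumes that only the 26 unaccented letters A to Z count as letters.

```python
from collections import Counter


def letter_frequencies(text: str) -> dict[str, float]:
    """Return each letter's share of all letters, as a percentage."""
    # Fold case and keep only A-Z; spaces, digits and punctuation are dropped.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    # most_common() orders the result from most to least frequent.
    return {letter: 100 * n / total for letter, n in counts.most_common()}


if __name__ == "__main__":
    for letter, pct in letter_frequencies("Hello, World! 123").items():
        print(f"{letter} {pct:5.1f}%")
```

Each percentage is the length of one bar in the chart; putting it next to the English figure for the same letter gives the comparison described above.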
Reading the letter-frequency chart
Each row is one letter of the alphabet. The filled bar shows how often that letter appears in your text as a percentage of all the letters, and the thin vertical marker shows the frequency of the same letter in typical English. When a bar reaches well past its marker, that letter is over-represented; when it falls short, the letter is rarer than usual. Switch the sort order to rank the letters from most to least frequent, which makes the shape of the distribution obvious at a glance.
In normal English the tallest bars are E, T, A, O, I and N, and the chart looks lumpy and uneven. A monoalphabetic cipher keeps that lumpy shape but moves the peaks to different letters, because each letter is simply swapped for another. A polyalphabetic cipher flattens the chart toward bars of similar height, because the same plaintext letter is enciphered differently depending on its position. Recognising those two shapes is the single most useful skill in breaking classical ciphers.
The Index of Coincidence
The Index of Coincidence, or IoC, measures the probability that two letters drawn at random from the text are identical. Ordinary English sits around 0.067 because its frequencies are so uneven, while text in which all 26 letters are equally likely approaches 1/26, or about 0.038. A single number therefore captures how lumpy or flat the distribution is.
This makes the IoC the quickest test for telling cipher families apart. Caesar, Atbash and keyword substitution ciphers only relabel letters, so the uneven English profile survives and the IoC stays high, near 0.067. Vigenère and other polyalphabetic ciphers blend several alphabets, flattening the frequencies and dragging the IoC down toward 0.038, more so the longer the key. The tool prints the value with a short hint, so a high reading points you at a monoalphabetic substitution cipher (or at unenciphered English) and a low one points you at a polyalphabetic cipher.
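For readers who want to check the number by hand, here is a minimal sketch of the calculation. It assumes the common definition in which two distinct letters are drawn without replacement; the tool may compute it slightly differently, and the function name `index_of_coincidence` is illustrative only.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two distinct letters picked from the text match."""
    # Same letter filter as the frequency count: A-Z only, case folded.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0  # Fewer than two letters: no pair to compare.
    counts = Counter(letters)
    # A letter seen c times yields c * (c - 1) ordered matching pairs,
    # out of n * (n - 1) ordered pairs overall.
    matching_pairs = sum(c * (c - 1) for c in counts.values())
    return matching_pairs / (n * (n - 1))


if __name__ == "__main__":
    print(round(index_of_coincidence("AABB"), 4))
```

A text made of one repeated letter scores 1.0 and a text with no repeated letter scores 0.0; real messages fall between the two reference values quoted above.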
Bigrams, trigrams and contact patterns
Single letters are only the start. English also has strongly preferred letter pairs and triples: TH, HE, IN, ER and AN are the commonest bigrams, and THE, AND, ING and ENT dominate the trigrams. The tool lists the most frequent pairs and triples in your text, counting them only inside words so that a space never joins two unrelated letters into a false pair.
These contact patterns are invaluable when a simple letter count is not enough. In a monoalphabetic substitution cipher the disguised version of THE often shows up as the most common trigram, giving you three letters at once. Repeated sequences, ideally trigrams or longer, can betray the length of a Vigenère key through the Kasiski method, because the distances between the repeats tend to be multiples of the key length. Even the absence of doubled letters, or a suspicious run of rare pairs, is a clue about which cipher you are facing.
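Counting pairs and triples only inside words might look like the sketch below. It is an illustration under assumptions: a "word" is taken to be any unbroken run of the letters A to Z, so punctuation and digits split words just as spaces do, and the name `top_ngrams` is hypothetical.

```python
import re
from collections import Counter


def top_ngrams(text: str, n: int, limit: int = 10) -> list[tuple[str, int]]:
    """Return the most common n-letter sequences, counted within words."""
    counts: Counter[str] = Counter()
    # Each match is one run of letters, so no n-gram can span a gap.
    for word in re.findall(r"[A-Z]+", text.upper()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    print(top_ngrams("the then other", 3, limit=3))
```

Calling it with `n=2` gives the bigram list and with `n=3` the trigram list; "at to" produces AT and TO but never the false pair TT.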
Breaking ciphers with frequency analysis
To attack a monoalphabetic substitution cipher, sort the chart by frequency and line it up against English. The most common cipher letter is probably E, the next probably T, and the top trigram is probably THE. Pencil in those guesses, then use the bigram and trigram lists to extend them — once you know E and T, the pair TH and the word THE fall into place quickly, and the rest of the message unravels from there.
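The first round of pencilled-in guesses can be sketched as a simple rank match. This is only a starting point under stated assumptions: it pairs the six most common cipher letters with E, T, A, O, I and N in that order, the name `first_guesses` is hypothetical, and on real messages several of the pairings will be wrong and need correcting from the bigram and trigram evidence.

```python
from collections import Counter

# The six letters named above as the tallest bars in normal English.
ENGLISH_TOP = "ETAOIN"


def first_guesses(ciphertext: str) -> dict[str, str]:
    """Map the most frequent cipher letters to likely plaintext letters."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    ranked = [letter for letter, _ in
              Counter(letters).most_common(len(ENGLISH_TOP))]
    # zip() stops early if the text has fewer than six distinct letters.
    return dict(zip(ranked, ENGLISH_TOP))


if __name__ == "__main__":
    print(first_guesses("AAAAAABBBBBCCCCDDDEEF"))
```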
For a Caesar cipher the same logic is even simpler, because every letter moves by the same amount: find the shift that lines the cipher's peak up with English's E and you have a likely key. If the result is not readable, try lining the peak up with T or A instead. For a Vigenère cipher, frequency analysis still works, but only after you split the text into columns by the key length, since each column is then a separate Caesar cipher you can solve on its own. Knowing the Index of Coincidence first tells you whether this column trick is even necessary.
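Both ideas fit in a short sketch. The function names are hypothetical, the shift guess assumes the most common cipher letter really does stand for E (which short texts often violate), and `split_columns` assumes you already know or have guessed the key length.

```python
from collections import Counter

A = ord("A")


def only_letters(text: str) -> list[str]:
    """Fold case and keep only the letters A-Z."""
    return [ch for ch in text.upper() if "A" <= ch <= "Z"]


def guess_caesar_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the commonest cipher letter is E."""
    letters = only_letters(ciphertext)
    if not letters:
        return 0
    peak, _ = Counter(letters).most_common(1)[0]
    return (ord(peak) - ord("E")) % 26


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Move every letter back by `shift`; other characters are kept."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - A - shift) % 26 + A))
        else:
            out.append(ch)
    return "".join(out)


def split_columns(ciphertext: str, key_length: int) -> list[str]:
    """Group letters by position modulo the key length (Vigenere columns)."""
    letters = only_letters(ciphertext)
    return ["".join(letters[i::key_length]) for i in range(key_length)]


if __name__ == "__main__":
    cipher = "PHHW PH QHDU WKH HDVWHUQ JDWH DW VHYHQ"
    shift = guess_caesar_shift(cipher)
    print(shift, caesar_decrypt(cipher, shift))
```

For a Vigenère text, each string returned by `split_columns` can be passed to `guess_caesar_shift` separately, although short columns give unreliable peaks.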
Monoalphabetic versus polyalphabetic at a glance
If you remember only one thing, make it this. A high Index of Coincidence and a lumpy chart with clear tall bars mean a monoalphabetic cipher, where each plaintext letter always maps to the same cipher letter: Caesar, Atbash, affine, or a keyword substitution. These yield to frequency analysis directly, because the statistics of the plaintext shine straight through.
A low Index of Coincidence and a flat chart where every bar is about the same height mean a polyalphabetic cipher, where one plaintext letter can become many different cipher letters — Vigenère, Beaufort, Gronsfeld or Porta. These hide the raw letter frequencies, so you must first recover the key length and then analyse each position separately. The chart and the IoC tell you which of these two worlds you are in before you spend any effort.
Limits and good practice
Frequency analysis is statistical, so it needs enough text to be trustworthy. A short message of a dozen letters can show wildly misleading frequencies simply by chance, while a full paragraph settles close to the expected pattern. When a sample looks ambiguous, the usual cause is that it is too short rather than that the method has failed.
Keep in mind that the English baseline shown here is for ordinary prose. Specialised text — a list of names, a chunk of source code, or writing in another language — has its own profile and will not match. The tool ignores spaces, digits and punctuation and folds upper and lower case together, which is exactly what you want for classical ciphers, but it means it analyses letters only, not the structure of an encoding like Base64 or Morse. For those, identify the encoding first and decode it, then run frequency analysis on the letters underneath.
Frequently asked questions
What is frequency analysis?
How do I use frequency analysis to break a cipher?
What is the Index of Coincidence?
What is the difference between monoalphabetic and polyalphabetic?
Why does the tool show bigrams and trigrams?
What do the bars and the vertical marker mean?
How much text do I need for reliable results?
Does it work for languages other than English?
Can I analyze Base64, Morse or binary?
Is my text uploaded to a server?
Can I export the frequency table?