Word Frequency Analyzer — Count Words, Check Keyword Density & Analyze N-Grams
Paste any text into the word frequency analyzer and instantly see every word ranked by how often it appears, complete with counts, percentages, and a visual density bar. The tool works entirely in your browser — your text never leaves your device — and handles articles, essays, code documentation, or manuscripts up to 100,000 words with no signup or installation required.
Beyond simple word counting, you can choose between unigrams (single words), bigrams (two-word phrases), and trigrams (three-word sequences) to get a complete picture of how language distributes across your content. Results export to CSV or JSON for further analysis in Excel, Python, or R.
How to Use the Word Frequency Analyzer
- Paste your text — drop any content into the input area: a blog post, article, essay, or legal document
- Set your filters — toggle stop word filtering, pick n-gram size (words, bigrams, trigrams), and set a minimum word length
- Read the frequency table — every term appears ranked by count with its density percentage and visual bar
- Export your data — download as CSV or JSON for deeper analysis in any tool
Key Metrics Explained
Word frequency analysis counts how many times each word or phrase appears and expresses it as a percentage of all words. This simple calculation underpins decisions in SEO, editorial review, and computational linguistics research.
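As a rough illustration of that calculation, here is a minimal Python sketch. The `tokenize` rule (lowercase, then runs of letters, digits, and apostrophes) is an assumption made for this example; the tool's own tokenization rules are not documented here.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumed rule: lowercase, then keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def frequency_table(text):
    """Return (term, count, percent_of_all_words) rows, highest count first."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return []
    counts = Counter(tokens)
    # most_common() sorts by count, descending; ties keep first-seen order.
    return [(term, n, round(n * 100 / total, 2)) for term, n in counts.most_common()]

sample = "The cat sat on the mat. The mat was flat."
for term, n, pct in frequency_table(sample):
    print(f"{term:<5} {n:>2} {pct:>6.2f}%")
```

In this sample, "the" appears 3 times out of 10 words, so it heads the table at 30%.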
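Keyword density: (count ÷ total words) × 100. A common rule of thumb among SEO practitioners puts primary keywords at 1–2.5%. Below 0.5% may signal weak topical coverage; above 3% starts to read as keyword stuffing. These figures are guidelines, not published limits: Google's spam policies describe keyword stuffing without giving a numeric threshold.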
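The density formula and the rule-of-thumb ranges above can be sketched in a few lines of Python. The function names and the band labels ("low", "target", "borderline", "high") are made up for this example, and the cutoffs simply mirror the figures quoted above rather than any search engine's rules.

```python
def keyword_density(count, total_words):
    """Keyword density as a percentage: (count / total words) * 100."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return count * 100 / total_words

def density_band(density):
    # Cutoffs mirror the rule-of-thumb figures in the text; they are not official thresholds.
    if density < 0.5:
        return "low"
    if density > 3.0:
        return "high"
    if 1.0 <= density <= 2.5:
        return "target"
    return "borderline"

d = keyword_density(18, 1200)  # keyword used 18 times in a 1,200-word article
print(f"{d:.2f}% -> {density_band(d)}")  # prints "1.50% -> target"
```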
Type-Token Ratio (TTR): (unique words ÷ total words) × 100. A TTR of 60% means the text contains 60 distinct words for every 100 words written. TTR naturally falls as a text gets longer, so compare texts of similar length. Below 40% in a longer text often points to repetitive wording.
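A minimal sketch of the TTR calculation, assuming a simple lowercase tokenizer (the `tokenize` helper is hypothetical and may not match how the tool splits words):

```python
import re

def tokenize(text):
    # Assumed rule: lowercase, then keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def type_token_ratio(text):
    """TTR as a percentage: (unique words / total words) * 100."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) * 100 / len(tokens)

sample = "the cat sat on the mat the mat was flat"
print(type_token_ratio(sample))  # 7 unique words out of 10 -> 70.0
```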
N-grams: unigrams (single words), bigrams (two-word sequences such as "content marketing"), and trigrams (three-word sequences such as "search engine optimization"). Bigram and trigram analysis reveals which multi-word phrases dominate your content. That is often more useful than unigram counts, because search queries are typically phrases rather than isolated words.
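One way to picture n-gram extraction is a sliding window over the word list. The sketch below is an assumption about the general technique, not the tool's actual code; in particular, it lets phrases run across sentence boundaries, which a real analyzer may or may not do.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumed rule: lowercase, then keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a space-joined phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "Content marketing works when content marketing is consistent."
tokens = tokenize(text)
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))  # [('content marketing', 2)]
```

Passing `n=1` gives back the plain word list, and `n=3` gives trigrams, so the same helper covers all three modes.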
Stop words — common function words (the, a, is, of) that add little semantic meaning. Filtering them surfaces the vocabulary that truly defines your content's topic.
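Stop word filtering, together with a minimum word length, amounts to dropping tokens before counting. In this sketch the `STOP_WORDS` set is a tiny illustrative list and `content_words` is a hypothetical helper; the tool's real stop word list is not shown here and is likely much longer.

```python
import re
from collections import Counter

# Illustrative only: a real stop word list is usually far longer.
STOP_WORDS = {"the", "a", "an", "is", "of", "in", "that", "and", "to", "on"}

def tokenize(text):
    # Assumed rule: lowercase, then keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def content_words(text, stop_words=STOP_WORDS, min_length=1):
    """Keep tokens that are not stop words and meet the minimum length."""
    return [t for t in tokenize(text) if t not in stop_words and len(t) >= min_length]

text = "The density of a keyword is the share of the text that the keyword takes."
print(Counter(content_words(text)).most_common(2))  # [('keyword', 2), ('density', 1)]
```

Without the filter, "the" would top this list with four occurrences.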
Common Use Cases
- SEO content auditing: Verify your primary keyword hits 1–2.5% density. Switch to bigrams to confirm it appears as a natural phrase, not just scattered words. Compare your top n-grams against top-ranking competitor pages to close coverage gaps.
- Catching overused words: A word appearing 40+ times in 1,000 words is a red flag for editors. Use the frequency list to substitute synonyms and improve prose variety. The TTR score quantifies vocabulary range before submission.
- Academic text analysis: Identify dominant themes in interview transcripts or survey responses. Export to CSV for statistical processing in Python (pandas) or R.
- Essay review: Check that key argument terms appear with appropriate frequency and that vocabulary diversity meets academic writing standards.
- Technical documentation audits: Find over-repeated jargon in API docs, README files, or user manuals that might signal unclear or redundant content.
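The competitor comparison mentioned under SEO content auditing can be approximated offline. The sketch below assumes you have both texts as plain strings; `coverage_gaps` is a hypothetical helper that lists a competitor's most frequent bigrams that never appear in your own text.

```python
import re
from collections import Counter

def bigram_counts(text):
    # Assumed tokenization: lowercase runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1))

def coverage_gaps(own_text, competitor_text, k=10):
    """Competitor's top-k bigrams that do not occur anywhere in own_text."""
    own = bigram_counts(own_text)
    top = bigram_counts(competitor_text).most_common(k)
    return [phrase for phrase, _ in top if phrase not in own]

own = "Keyword density matters for search."
competitor = ("Keyword density and search intent matter. "
              "Search intent guides keyword density.")
print(coverage_gaps(own, competitor, k=2))  # ['search intent']
```

In practice you would filter stop words first, so that phrases like "of the" do not crowd out topical ones.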
Frequently Asked Questions
What is a good keyword density for SEO?
Most SEO practitioners recommend 1–2.5% as a healthy range for primary keywords. Below 0.5% may signal insufficient topical coverage, weakening relevance signals. Above 3% risks being treated as keyword stuffing, which Google's spam policies list as a violation, although Google publishes no numeric threshold. More important than a specific number is that the keyword appears naturally in the title, H1, first 100 words, subheadings, and conclusion, rather than being crammed into sentences where it reads awkwardly.
What are stop words and when should I filter them?
Stop words are extremely common function words (the, a, is, of, in, that, and) that carry minimal semantic meaning on their own. Filtering them reveals the content-bearing vocabulary that defines your text's topic. For SEO analysis, turn stop word filtering on: without it, your top results will be dominated by "the", "and", and "of", which tells you little about the topic. For linguistic research where function words matter (stylometry, authorship analysis), turn filtering off so they stay in the results.
What is the difference between unigrams, bigrams, and trigrams?
Unigrams are individual words. Bigrams are two-word sequences like "content marketing" or "keyword density". Trigrams are three-word sequences like "search engine optimization". For SEO, bigram and trigram analysis is often more actionable than unigrams alone — search queries are typically phrases, and confirming that your target phrase appears as a natural unit (not just its individual words scattered through the text) gives a more accurate picture of topical alignment.
What is Type-Token Ratio (TTR) and why does it matter?
TTR = (unique words ÷ total words) × 100. A text with 600 unique words out of 1,000 total has a TTR of 60%. A higher TTR indicates greater vocabulary diversity, which tends to go with more natural, varied writing. Because TTR drops as texts get longer, compare it only across texts of similar length. For long-form content (1,000+ words), a TTR below 40% often signals repetitive language that can hurt readability.
Is my text sent to any server?
No. All analysis runs entirely in your browser using JavaScript. Your text — including private drafts, client content, or confidential documents — is never transmitted to any server or stored anywhere. You can even use this tool offline once the page has loaded.
How do I export the results?
Click CSV to download a spreadsheet-compatible file with rank, word, count, and density for every term. Click JSON for structured data compatible with Python, JavaScript, R, and any data processing pipeline. Both formats include all terms in the full analysis — not just the ones visible on screen based on your display limit.
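Once exported, the CSV can be read with Python's standard `csv` module. This sketch assumes the header row uses the names `rank`, `word`, `count`, and `density` and that density is written as a plain number; check your own export, since the exact header names and number format are assumptions here. The sample data is invented for illustration.

```python
import csv
import io

# Hypothetical export contents; header names are assumed from the column list above.
exported = """rank,word,count,density
1,keyword,12,2.4
2,density,9,1.8
3,analysis,2,0.4
"""

def load_rows(csv_file):
    """Parse export rows into dicts with numeric fields converted."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            "rank": int(row["rank"]),
            "word": row["word"],
            "count": int(row["count"]),
            "density": float(row["density"]),
        })
    return rows

rows = load_rows(io.StringIO(exported))
over_one_percent = [r["word"] for r in rows if r["density"] >= 1.0]
print(over_one_percent)  # ['keyword', 'density']
```

To read a downloaded file instead of the inline sample, pass an open file object, for example `open(path, newline="")` with your own file path.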
Resources
- Zipf's Law — Wikipedia — the mathematical principle behind word frequency distribution: the most common word appears roughly twice as often as the second most common, three times as often as the third, and so on across any large text.
- Google Search Central — Spam Policies — Google's official definition of keyword stuffing and how it affects search rankings.
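The Zipf's Law relationship described in the first resource above can be written as a one-line estimate: the term at rank r is expected to appear about 1/r as often as the top term. This is an idealized sketch; real texts follow the pattern only approximately, and `zipf_expected` is a name made up for this example.

```python
def zipf_expected(top_count, rank):
    """Idealized Zipf estimate: the rank-r term appears about top_count / r times."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return top_count / rank

# If the most common word appears 1,000 times, the idealized counts for ranks 1-4 are:
for rank in range(1, 5):
    print(rank, round(zipf_expected(1000, rank)))
```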