Content Quality: Readability Measurement and Analysis
Probeo measures content readability using established formulas and aggregate scoring. This page describes the scope of automated readability analysis, its limitations, and what the resulting scores do and do not tell you.
Last updated 02/08/2026
Probeo's content quality analysis measures readability using established formulas that quantify structural text properties: sentence length, syllable density, word familiarity, and character count. These formulas produce scores that correlate with reading difficulty under controlled conditions. They do not measure whether content is clear, accurate, well-organized, or appropriate for its audience. Readability scores are observability data, not quality verdicts.
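To make the structural nature of these inputs concrete, here is a minimal sketch of one such formula, the Flesch-Kincaid Grade Level, computed from raw counts. The counting helpers are simplified illustrations, not Probeo's implementation; the syllable estimator in particular is a rough heuristic.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups. Real tools use dictionaries."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level from word, sentence, and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Published FK coefficients: the grade estimate rises with words per
    # sentence and syllables per word -- both purely structural quantities.
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words)) - 15.59)

print(round(flesch_kincaid_grade(
    "Readability formulas quantify structure. They do not judge meaning."), 1))
```

Note that nothing in the calculation inspects meaning: swapping every word for a different word with the same length and syllable count leaves the score unchanged.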
What Probeo measures
Probeo applies 7 readability formulas to the visible text content of each page: Flesch-Kincaid Grade Level, Flesch Reading Ease, Gunning Fog Index, SMOG Grade, Automated Readability Index, Coleman-Liau Index, and Dale-Chall Readability Score. Each formula quantifies a different structural dimension of text. Syllable density, sentence length, word length, and vocabulary familiarity are the primary inputs. The outputs are numeric scores or grade-level estimates. Probeo also calculates 4 aggregate scores that combine individual formula outputs into composite metrics, reducing the bias of relying on any single formula.
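For illustration, the same seven formulas are available in the open-source Python package textstat (no affiliation with Probeo implied). The sketch below applies all seven to a text and computes one plausible aggregate: the median of the five formulas that output grade-level estimates. Flesch Reading Ease and Dale-Chall use their own scales, so they are reported separately rather than averaged in; how Probeo actually combines its four aggregates is not specified here.

```python
# Sketch using the open-source `textstat` package (pip install textstat);
# this illustrates the seven formulas, not Probeo's internal pipeline.
from statistics import median
import textstat

text = "The quick brown fox jumps over the lazy dog near the riverbank."

# Five formulas that emit U.S. grade-level estimates.
grade_scores = {
    "flesch_kincaid": textstat.flesch_kincaid_grade(text),
    "gunning_fog": textstat.gunning_fog(text),
    "smog": textstat.smog_index(text),
    "ari": textstat.automated_readability_index(text),
    "coleman_liau": textstat.coleman_liau_index(text),
}

# Two formulas on their own scales: 0-100 reading ease, Dale-Chall raw score.
other_scores = {
    "flesch_reading_ease": textstat.flesch_reading_ease(text),
    "dale_chall": textstat.dale_chall_readability_score(text),
}

# One plausible aggregate: the median grade-level estimate,
# which dampens any single formula's bias.
print("median grade estimate:", median(grade_scores.values()))
print({**grade_scores, **other_scores})
```

The median is used here because the five grade-level formulas share a scale; mixing in the reading-ease or Dale-Chall scores without normalization would produce a meaningless composite.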
Measurement, not judgment
Readability formulas were designed for specific contexts: military training materials, health literacy screening, educational publishing. Applying them to web content stretches each formula beyond its original design. A product page, a legal disclosure, an API reference, and a blog post will produce different scores because they serve different purposes and audiences. None of those scores indicate a problem on their own. Probeo surfaces readability data as measurement. It does not prescribe target scores, flag pages as failing, or recommend simplification. Whether a score warrants action depends on the content type, the intended audience, and the editorial intent behind the text.
What readability formulas cannot assess
Readability formulas operate on surface-level text statistics. They cannot evaluate whether an explanation is logically structured, whether a sentence conveys its intended meaning, whether technical terminology is appropriate for the audience, or whether the content achieves its communicative purpose. A page of grammatically simple nonsense will score as highly readable. A well-crafted technical explanation will score as difficult. Formulas also cannot account for formatting, visual hierarchy, illustrations, or interactive elements that affect how users actually process content. These are real dimensions of content quality that exist outside the measurement surface of any text-statistics formula.
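A toy demonstration of this limitation, again using textstat as a stand-in scorer. Both sentences are invented for illustration.

```python
import textstat

# Grammatically simple nonsense: short words, short sentences.
nonsense = "The sky ate a blue chair. The chair sang to the wet moon."

# A coherent technical sentence: long words, dense clauses.
technical = ("Idempotent request handling guarantees that retried "
             "operations converge to a consistent persisted state.")

# The nonsense scores as far "easier" despite conveying nothing;
# the formulas see only word and sentence structure.
print("nonsense grade: ", textstat.flesch_kincaid_grade(nonsense))
print("technical grade:", textstat.flesch_kincaid_grade(technical))
```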
Content types have legitimately different readability levels
Medical documentation, legal terms, API references, and consumer-facing product descriptions operate at different reading levels by design. A legal disclosure written at a sixth-grade reading level would likely be imprecise. An API reference simplified to avoid polysyllabic words would lose technical accuracy. Readability scores are most informative when compared within a content type, not across content types. A help article that scores three grade levels higher than similar help articles on the same site is worth investigating. That same score compared against a terms-of-service page is meaningless.
How teams use readability data
Readability scores serve three operational purposes. First, drift detection: tracking scores over time on the same page or content section reveals when complexity has shifted, often gradually and without anyone noticing. Second, consistency monitoring: comparing scores across pages of the same type identifies outliers that may warrant editorial review. Third, comparative benchmarking: aggregate scores across a content section provide a baseline that new content can be measured against. In all three cases, the scores identify where to look. They do not determine what to do.
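As a sketch of the second purpose, consistency monitoring, the snippet below flags within-type outliers using a robust z-score built from the median and median absolute deviation. The page paths, scores, and threshold are hypothetical; Probeo's own outlier logic is not documented here.

```python
from statistics import median

# Hypothetical grade-level scores for pages of one content type.
help_articles = {
    "/help/billing": 8.1,
    "/help/sso-setup": 8.7,
    "/help/export": 7.9,
    "/help/webhooks": 12.4,   # candidate outlier
    "/help/teams": 8.3,
}

def robust_outliers(scores: dict[str, float], threshold: float = 3.5):
    """Flag pages whose robust z-score exceeds the threshold."""
    med = median(scores.values())
    mad = median(abs(s - med) for s in scores.values())
    if mad == 0:
        return []
    # 0.6745 rescales MAD to be comparable to a standard deviation.
    return [page for page, s in scores.items()
            if abs(0.6745 * (s - med) / mad) > threshold]

print(robust_outliers(help_articles))  # ['/help/webhooks']
```

The same structure supports drift detection: replace the per-page dictionary with a per-crawl time series for one page and flag crawls whose score departs from the page's own history.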
Scope
Content quality analysis applies at the page level. Scores are calculated from the visible text content extracted during crawl. Navigation elements, footer text, and boilerplate are included in the extraction unless the page structure isolates them from the main content area. Readability scores are recalculated on each crawl, so they reflect the current state of the page content.
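The extraction behavior described above can be approximated with a short sketch using BeautifulSoup. It mirrors the page-level rule (prefer a structurally isolated main-content area, otherwise fall back to the full body), but it is an illustrative approximation, not Probeo's crawler.

```python
# Approximate sketch with BeautifulSoup (pip install beautifulsoup4);
# not Probeo's actual extraction pipeline.
from bs4 import BeautifulSoup

def visible_text(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    # Never score script/style contents as page text.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    # Prefer a structurally isolated main-content area when present...
    main = soup.find("main") or soup.find(attrs={"role": "main"})
    if main is not None:
        return main.get_text(" ", strip=True)
    # ...otherwise fall back to the whole body, nav and footer included.
    return soup.body.get_text(" ", strip=True) if soup.body else ""

html = "<body><nav>Home</nav><main><p>Actual article text.</p></main></body>"
print(visible_text(html))  # "Actual article text."
```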
What becomes visible
- Structural readability measurements across all crawled pages using 7 established formulas
- Aggregate scores that synthesize individual formula outputs into composite metrics
- Readability drift over time as content is edited, expanded, or restructured
- Outlier pages within a content type whose scores diverge significantly from their peers
- The boundary between what automated text analysis can measure and what requires editorial judgment