Learn Numbers, Stats, and How They Mislead | The Toolkit
Critical Thinking in the Age of AI

Numbers, Stats, and How They Mislead


Here's a real pattern that appears constantly in AI-generated content, business reports, and news articles alike:

"Companies that use this approach see a 67% improvement in outcomes."

Sounds compelling. Now ask: 67% of what baseline? Measured how? Over what time period? On which population? In which industry?

Without answers to those questions, 67% is not a data point. It's a decoration.

The Three Manipulations That Show Up Most

Relative vs. absolute change

A drug reduces the risk of a condition from 2% to 1%. That's a 50% relative risk reduction — which sounds dramatic. It's also a 1 percentage point absolute risk reduction — which sounds modest. Both statements are mathematically correct. Which one you lead with determines how the reader feels about the evidence.

AI outputs frequently use relative numbers because they're larger and sound more impactful. Any time you see a percentage change, ask: what's the baseline? What's the absolute change?
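
The arithmetic behind the drug example can be checked in a few lines. This is a minimal sketch; the function name `risk_change` is made up for illustration, and the inputs are the 2% and 1% risks from the example above.

```python
def risk_change(baseline: float, treated: float) -> tuple[float, float]:
    """Return (absolute reduction in percentage points, relative reduction in percent)."""
    absolute_pp = (baseline - treated) * 100
    relative_pct = (baseline - treated) / baseline * 100
    return absolute_pp, relative_pct


# Risk falls from 2% to 1%, as in the drug example.
absolute_pp, relative_pct = risk_change(0.02, 0.01)

print(f"Absolute reduction: {absolute_pp:.1f} percentage points")
print(f"Relative reduction: {relative_pct:.0f}%")
```

The same underlying change produces both figures, so a claim that reports only the relative one leaves out the baseline needed to interpret it.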

Percentages without denominators

"80% of users reported satisfaction." 80% of how many users? 10? 10,000? A survey of 10 people with 8 satisfied is not the same claim as a study of 50,000 people. The percentage hides the sample size, and the sample size determines how much weight the finding deserves.

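One way to see how much the denominator matters is to put an uncertainty range around the percentage. The sketch below uses the Wilson score interval, a standard method that is our choice here, not something the lesson prescribes. It assumes exactly 80% satisfaction in both surveys (8 of 10 and 40,000 of 50,000); the second count is an illustrative assumption.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Same headline figure (80%), very different sample sizes.
small_low, small_high = wilson_interval(8, 10)
large_low, large_high = wilson_interval(40_000, 50_000)  # assumed: 80% of 50,000

print(f"n=10:     {small_low:.1%} to {small_high:.1%}")
print(f"n=50,000: {large_low:.1%} to {large_high:.1%}")
```

With 10 respondents the plausible range runs from roughly half to over 90%. With 50,000 it spans less than one percentage point. The headline "80%" is identical in both cases.
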
Correlation presented as causation

Two things that move together don't necessarily cause each other. Ice cream sales and drowning rates both peak in summer — neither causes the other. AI is particularly prone to presenting correlated findings as causal because causal language is common in its training data. "Companies that invest in X see higher revenue" — is that because X drives revenue, or because profitable companies can afford to invest in X?
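
A small simulation can illustrate the ice cream example. Everything here is synthetic: the temperature range, the coefficients, and the noise levels are invented for illustration and are not real sales or drowning data. Both series depend only on temperature and never on each other, yet they end up strongly correlated.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical daily data: temperature drives both series independently.
days = []
for _ in range(5000):
    temp = random.uniform(0, 35)                  # made-up temperature, deg C
    ice_cream = 10 * temp + random.gauss(0, 30)   # made-up sales figure
    drownings = 0.2 * temp + random.gauss(0, 1)   # made-up incident index
    days.append((temp, ice_cream, drownings))

overall_r = pearson([d[1] for d in days], [d[2] for d in days])

# Hold the confounder roughly constant: look only at days between 20 and 22 deg C.
band = [d for d in days if 20 <= d[0] < 22]
band_r = pearson([d[1] for d in band], [d[2] for d in band])

print(f"All days:            r = {overall_r:.2f}")
print(f"Days at 20-22 deg C: r = {band_r:.2f} (n = {len(band)})")
```

Once temperature is held roughly constant, the relationship largely disappears. That is the pattern to expect when a third factor, rather than a causal link, explains the correlation.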

The Questions That Catch Statistical Manipulation

Four questions that work on almost any statistic:

  • "X percent of what?" — find the denominator;
  • "Compared to what?" — find the baseline for any change claim;
  • "Who measured this, and how?" — assess the methodology;
  • "Does this correlation imply causation, or just association?" — separate the two explicitly.

These questions don't require a statistics degree. They require 30 seconds of deliberate attention before accepting a number as evidence.

AI and Statistical Confabulation

AI has a specific statistical failure mode that compounds all of the above: it fabricates plausible-sounding numbers.

When asked for a statistic it doesn't have, an LLM will sometimes generate one that fits the pattern of statistics in its training data — a realistic-looking percentage, a plausible study sample size, a credible-sounding source. The number looks exactly like a real finding because it's modeled on real findings.

The fix is the same as for any specific AI claim: if a number matters, trace it to a verifiable source. If it can't be traced, it doesn't count as evidence.

Section 2. Chapter 5