Numbers, Stats, and How They Mislead
Here's a real pattern that appears constantly in AI-generated content, business reports, and news articles alike:
"Companies that use this approach see a 67% improvement in outcomes."
Sounds compelling. Now ask: 67% of what baseline? Measured how? Over what time period? On which population? In which industry?
Without answers to those questions, 67% is not a data point. It's a decoration.
The Three Manipulations That Show Up Most
Relative vs. absolute change
A drug reduces the risk of a condition from 2% to 1%. That's a 50% relative risk reduction — which sounds dramatic. It's also a 1 percentage point absolute risk reduction — which sounds modest. Both statements are mathematically correct. Which one you lead with determines how the reader feels about the evidence.
AI outputs frequently use relative numbers because they're larger and sound more impactful. Any time you see a percentage change, ask: what's the baseline? What's the absolute change?
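The arithmetic behind the drug example can be checked in a few lines. This is a minimal sketch; the function name `risk_reduction` and its return keys are illustrative choices, not part of any library.

```python
def risk_reduction(baseline_risk: float, treated_risk: float) -> dict:
    """Compare two risks given as fractions (0.02 means 2%).

    Returns the absolute reduction in percentage points and the
    relative reduction as a percentage of the baseline risk.
    """
    if not 0 < baseline_risk <= 1 or not 0 <= treated_risk <= 1:
        raise ValueError("risks must be fractions, with a non-zero baseline")
    absolute = baseline_risk - treated_risk   # difference between the two risks
    relative = absolute / baseline_risk       # that difference as a share of the baseline
    return {
        "absolute_pct_points": absolute * 100,
        "relative_pct": relative * 100,
    }


# The example from the text: risk falls from 2% to 1%.
result = risk_reduction(0.02, 0.01)
print(f"Absolute reduction: {result['absolute_pct_points']:.1f} percentage points")
print(f"Relative reduction: {result['relative_pct']:.0f}%")
```

The same inputs produce both the 1 percentage point figure and the 50% figure. Printing them side by side makes it harder for either one to stand alone as the headline.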
Percentages without denominators
"80% of users reported satisfaction." 80% of how many users? 10? 10,000? A survey of 10 people with 8 satisfied is not the same claim as a study of 50,000 people. The percentage hides the sample size, and the sample size determines how much weight the finding deserves.
Correlation presented as causation
Two things that move together don't necessarily cause each other. Ice cream sales and drowning rates both peak in summer, but neither causes the other: hot weather drives both. AI is particularly prone to presenting correlated findings as causal because causal language is common in its training data. Take "Companies that invest in X see higher revenue": is that because X drives revenue, or because profitable companies can afford to invest in X?
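A small simulation can make the ice cream example concrete. The sketch below assumes hot weather is the shared driver and generates synthetic data in which temperature pushes both series up, while neither series affects the other. Every number in it (temperature range, slopes, noise levels, sample size) is made up for illustration, not real data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: temperature drives both series; they never touch each other.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.2 * t + rng.gauss(0, 1.5) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(residuals(ice_cream, temperature), residuals(drownings, temperature))

print(f"Correlation, ice cream vs drownings: {raw:.2f}")
print(f"Same correlation after removing temperature: {adjusted:.2f}")
```

The raw correlation comes out strong even though the data was built with no causal link between the two series. Once the temperature trend is subtracted from both, the relationship should shrink to roughly zero. Real data rarely hands you the confounder this neatly, which is why the causal question has to be asked explicitly.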
The Questions That Catch Statistical Manipulation
Four questions that work on almost any statistic:
- "X percent of what?" — find the denominator;
- "Compared to what?" — find the baseline for any change claim;
- "Who measured this, and how?" — assess the methodology;
- "Does this correlation imply causation, or just association?" — separate the two explicitly. These questions don't require a statistics degree. They require 30 seconds of deliberate attention before accepting a number as evidence.
AI and Statistical Confabulation
AI has a specific statistical failure mode that compounds all of the above: it fabricates plausible-sounding numbers.
When asked for a statistic it doesn't have, an LLM will sometimes generate one that fits the pattern of statistics in its training data — a realistic-looking percentage, a plausible study sample size, a credible-sounding source. The number looks exactly like a real finding because it's modeled on real findings.
The fix is the same as for any specific AI claim: if a number matters, trace it to a verifiable source. If it can't be traced, it doesn't count as evidence.