11-17-2025, 11:03 AM
Correlation vs Causation — Why Patterns Can Mislead You
Understanding the Most Common Error in Science, Data, and Everyday Thinking
Two things can happen together…
without one causing the other.
This is one of the most important lessons in statistics — and yet one of the most misunderstood.
This thread breaks down the difference between correlation and causation, and shows why confusing them leads to false conclusions.
1. What Is Correlation?
Correlation measures how strongly, and in which direction, two variables move together.
Examples:
• temperature ↑ and ice cream sales ↑
• age ↑ and income ↑
• hours studied ↑ and test score ↑
A correlation can be:
• positive (they rise together)
• negative (one rises, the other falls)
• zero (no linear relationship)
But correlation does NOT tell us why the relationship exists.
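To make "move together" concrete, here is a minimal sketch of the most common measure, Pearson's r, which ranges from -1 to +1. The function name pearson_r and both datasets are made up for illustration. The second dataset shows why a coefficient of zero only rules out a straight-line relationship: y is completely determined by x, yet r is 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data, not real measurements.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 58, 65, 71, 80]
r_positive = pearson_r(hours_studied, test_scores)

# y = x^2 on a symmetric range: fully dependent on x, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_nonlinear = pearson_r(xs, ys)

print(f"hours vs score: r = {r_positive:.3f}")
print(f"x vs x squared: r = {r_nonlinear:.3f}")
```

The first value comes out close to +1 and the second is 0. Neither number says anything about why the variables are related.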
2. What Is Causation?
Causation means that one event *actually produces* the other.
Examples:
• bacteria → infection
• pushing pedals → bike moves
• gravity → objects fall
Causation is deeper and requires evidence:
• controlled experiments
• mechanism
• alternative explanations ruled out
3. The Classic Warning: Correlation ≠ Causation
Some hilarious (but real) correlations:
• Cheese consumption ↔ number of people who die tangled in bedsheets
• Movies Nicolas Cage appears in ↔ swimming pool drownings
• Margarine consumption ↔ divorce rates in Maine
These correlations are real in the data, but they reflect coincidence or a hidden third factor, not causation.
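Coincidences like these are easy to manufacture. The sketch below is an illustration under stated assumptions, not a reconstruction of how the examples above were found: it generates 50 unrelated series of 10 "yearly" values each (random walks, a hypothetical stand-in for drifting real-world statistics) and then searches every pair for the strongest correlation.

```python
import itertools
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def random_walk(rng, steps):
    """A series that drifts by independent Gaussian steps."""
    value, walk = 0.0, []
    for _ in range(steps):
        value += rng.gauss(0, 1)
        walk.append(value)
    return walk

rng = random.Random(0)  # fixed seed so the run is repeatable
series = [random_walk(rng, 10) for _ in range(50)]

# Check all 1225 pairs and keep the strongest correlation, positive or negative.
best_r, best_pair = max(
    (abs(pearson_r(series[i], series[j])), (i, j))
    for i, j in itertools.combinations(range(len(series)), 2)
)

print(f"strongest |r| among unrelated series: {best_r:.3f} (pair {best_pair})")
```

No series influences any other, yet searching enough pairs turns up a striking match. The more comparisons you make, the less a single strong correlation means.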
4. Why Correlation Does Not Prove Causation
Because correlations can arise from:
• Coincidence
Compare enough variables and some will line up by chance alone.
• Confounding variables
A third factor influences both events.
Example:
Ice cream sales and drowning deaths are correlated,
but *hot weather* drives both: people buy more ice cream and swim more when it is hot.
• Reverse causation
You might get the direction wrong.
Example:
More firefighters ↔ bigger fires
Do firefighters cause fires?
Of course not.
Bigger fires require more firefighters.
Sometimes the arrow runs both ways.
Example:
Stress ↔ poor sleep
Which one causes the other?
The answer is: both.
• Hidden structure
Groups behave differently.
A trend in the pooled data can weaken or even reverse inside each subgroup (Simpson's paradox).
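The confounding case from the list above can be simulated. This is a sketch under made-up assumptions: the coefficients, noise levels, and variable names are hypothetical, and in the model neither outcome affects the other. Temperature feeds into both ice cream sales and drownings, so they correlate. Removing the straight-line effect of temperature from each (one simple form of statistical control) should leave almost nothing.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(1)  # fixed seed so the run is repeatable
temperature = [rng.uniform(10, 35) for _ in range(2000)]

# Hypothetical model: each outcome depends on temperature plus its own noise.
# Neither outcome appears in the other's formula.
ice_cream = [20 * t + rng.gauss(0, 100) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

raw_r = pearson_r(ice_cream, drownings)
adjusted_r = pearson_r(
    residuals(ice_cream, temperature),
    residuals(drownings, temperature),
)

print(f"raw correlation:            {raw_r:.3f}")
print(f"after removing temperature: {adjusted_r:.3f}")
```

The raw correlation is clearly positive, while the adjusted one sits near zero. This only works because the confounder was measured. An unmeasured confounder cannot be adjusted away like this.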
5. How Scientists Establish Causation
To establish causation, researchers use:
• Randomised controlled experiments
• Long-term studies
• Mechanistic explanations
• Statistical controls
• Elimination of confounders
• Replication of results
Causation requires strong evidence, not just patterns.
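Randomisation sits at the top of that list for a reason, which a small simulation can show. Everything here is a hypothetical setup: the treatment is assumed to have zero true effect, "health" is a made-up baseline score, and in the observational arm healthier people are more likely to opt in. Comparing group averages then makes the useless treatment look beneficial. A coin flip breaks the link between health and treatment.

```python
import random
from statistics import mean

rng = random.Random(2)  # fixed seed so the run is repeatable
TRUE_EFFECT = 0.0       # assumption: the treatment does nothing

def outcome(health, treated):
    """Outcome depends on baseline health, the (zero) treatment effect, and noise."""
    return health + TRUE_EFFECT * treated + rng.gauss(0, 1)

def difference_in_means(rows):
    """Average outcome of the treated minus average outcome of the untreated."""
    treated = [y for was_treated, y in rows if was_treated]
    control = [y for was_treated, y in rows if not was_treated]
    return mean(treated) - mean(control)

people = [rng.gauss(0, 1) for _ in range(5000)]  # baseline health scores

# Observational data: healthier people are more likely to choose treatment.
observational = []
for health in people:
    treated = health + rng.gauss(0, 1) > 0
    observational.append((treated, outcome(health, treated)))

# Randomised experiment: a coin flip decides, independent of health.
randomised = []
for health in people:
    treated = rng.random() < 0.5
    randomised.append((treated, outcome(health, treated)))

naive_gap = difference_in_means(observational)
randomised_gap = difference_in_means(randomised)

print(f"observational gap: {naive_gap:.2f}")
print(f"randomised gap:    {randomised_gap:.2f}")
```

The observational gap is large even though the treatment does nothing, because the treated group started out healthier. The randomised gap lands close to the true effect of zero.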
6. Why This Matters in Real Life
Misinterpreting correlation can lead to:
• bad science
• bad policy
• bad medical decisions
• conspiracy thinking
• fake news
• pseudoscience
• incorrect predictions
Understanding the difference between correlation and causation is one of the best habits for thinking clearly.
7. The Deep Insight
Correlation is useful.
Causation is powerful.
Correlation can suggest a hypothesis.
Causation confirms it.
Correlation is the starting point.
Causation is the destination.
Knowing the difference protects you from being fooled by data —
and helps you think like a scientist.
Written by Leejohnston & Liora
The Lumin Archive — Statistics & Probability Division
