Thread 4 — Statistical Power, Sample Size & Why Small Studies Mislead

Small studies are one of the biggest sources of false results in science. 
Even perfect methods cannot save an experiment that simply does not have enough data.

This thread explains the deep logic of statistical power — the probability that your study will actually detect a real effect when one exists.



1. What Is Statistical Power?

Statistical power is the probability that a study detects a real effect:

Power = P( detect a true effect | the effect is real ) = 1 − β

where β is the Type II error rate — the chance of missing an effect that is actually there.

High-power studies reveal truth. 
Low-power studies create noise and contradictions.

The standard target across science:

80% power

This means:

• if an effect is real, 
• the experiment will detect it 80% of the time.
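That 80% figure can be computed directly. A minimal sketch using a normal-approximation power formula for a two-group comparison (the function name and the specific numbers are illustrative, not from this thread):

```python
from math import sqrt
from statistics import NormalDist

def power_two_group(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d           : standardised effect size (mean difference / SD)
    n_per_group : sample size in each of the two groups
    alpha       : significance threshold
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # e.g. ~1.96 for alpha = 0.05
    noncentrality = d * sqrt(n_per_group / 2)  # how far the true effect sits from zero
    return z.cdf(noncentrality - z_crit)       # P(detect | effect is real)

# A medium effect (d = 0.5) with 64 people per group lands near the 80% target;
# the same effect with only 10 per group leaves power far below it.
print(round(power_two_group(0.5, 64), 3))
print(round(power_two_group(0.5, 10), 3))
```

Running it shows why the per-group numbers matter: the same real effect that a 64-per-group study catches ~80% of the time is missed most of the time at n = 10.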



2. Why Small Sample Sizes Create False Results

Small samples create:

• huge random fluctuations 
• unstable averages 
• unreliable p-values 
• inflated effect sizes 
• contradictory results on replication 

A small sample is like trying to judge the whole ocean from a cup of water — it doesn’t reflect the truth.
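The inflation effect is easy to demonstrate by simulation. A toy Monte Carlo sketch (all numbers are illustrative assumptions): run many small two-group experiments with a modest true effect, keep only the "significant" ones, and look at the effect size those runs report.

```python
import random
from math import sqrt
from statistics import mean

random.seed(42)

TRUE_EFFECT = 0.3   # modest true difference, in SD units
N = 10              # per group -- a badly underpowered design
RUNS = 2000

significant_effects = []
for _ in range(RUNS):
    control = [random.gauss(0, 1) for _ in range(N)]
    treated = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    observed = mean(treated) - mean(control)
    z = observed / sqrt(2 / N)     # z-test with known SD = 1 (a simplification)
    if abs(z) > 1.96:              # "statistically significant" at alpha = 0.05
        significant_effects.append(observed)

print("true effect:", TRUE_EFFECT)
print("mean effect among significant runs:",
      round(mean(significant_effects), 2))
```

With n = 10, a result can only clear the significance bar if the observed difference is roughly three times the true effect — so the published, "significant" effect sizes are inflated by construction.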



3. The Four Factors That Control Power

Statistical power depends on:

1. Sample size (bigger → more reliable) 
2. Effect size (stronger effects are easier to detect) 
3. Variability (more noise → harder to detect) 
4. Significance level α (usually 0.05; a stricter α lowers power)

Researchers adjust these to design a study that is actually capable of answering the question.
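The direction each factor pushes can be checked with the same normal-approximation power function (a simplified model, not the only one in use):

```python
from math import sqrt
from statistics import NormalDist

def power(n, effect, sd, alpha):
    """Approximate two-sample z-test power from the four ingredients."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf((effect / sd) * sqrt(n / 2) - z_crit)

base = power(n=50, effect=0.5, sd=1.0, alpha=0.05)
print(power(n=200, effect=0.5, sd=1.0, alpha=0.05) > base)  # more data   -> more power
print(power(n=50, effect=0.8, sd=1.0, alpha=0.05) > base)   # bigger effect -> more power
print(power(n=50, effect=0.5, sd=2.0, alpha=0.05) < base)   # more noise  -> less power
print(power(n=50, effect=0.5, sd=1.0, alpha=0.01) < base)   # stricter alpha -> less power
```

All four comparisons print True, matching the list above.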



4. Effect Size — The Hidden Key

Effect size measures *how big* the difference or phenomenon actually is.

Examples:

• difference in means 
• correlation strength 
• risk ratio 
• Cohen’s d 
• η² / R² 

Small samples exaggerate effect sizes, creating impressive results that fail to replicate.
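Cohen's d, for instance, is just the mean difference scaled by the pooled standard deviation. A minimal sketch (the sample data are invented for illustration):

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)

control = [1, 2, 3, 4, 5]
treated = [3, 4, 5, 6, 7]
print(round(cohens_d(control, treated), 2))  # prints 1.26 -- a large effect
```

By Cohen's rough conventions, d ≈ 0.2 is small, 0.5 medium, and 0.8 large — which is why an honest d near 1.3 from a tiny pilot sample should raise suspicion of inflation rather than celebration.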



5. The Problem With p-Values in Low-Power Studies

In small studies:

• p-values bounce around unpredictably from sample to sample 
• a larger share of “significant” findings turn out to be false positives 
• “statistically significant” results may be pure luck 
• non-significant results tell you almost nothing — the study may simply have been too weak to see the effect 

A p-value from an underpowered study is close to meaningless.
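One way to make this concrete is the positive predictive value of a significant result: of all the "significant" findings a field produces, what fraction reflect a real effect? A back-of-the-envelope sketch (the prior and power values are illustrative assumptions):

```python
def ppv(power, alpha=0.05, prior=0.1):
    """Fraction of significant results that reflect a real effect,
    given the prior probability that a tested hypothesis is true."""
    true_positives = power * prior        # real effects, detected
    false_positives = alpha * (1 - prior) # null effects, "detected" anyway
    return true_positives / (true_positives + false_positives)

print(round(ppv(power=0.20), 2))  # underpowered field: prints 0.31
print(round(ppv(power=0.80), 2))  # well-powered field: prints 0.64
```

Same α, same prior — yet in the underpowered field, most "significant" results are false. Power is what makes a small p-value worth believing.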



6. Sample Size Calculations (How Scientists Do It)

Before an experiment begins, researchers estimate:

• desired power (usually 80–90%) 
• expected effect size 
• expected variability 
• α level (significance threshold)

Then they calculate the minimum sample size needed.

This is standard in medicine, psychology, biology, climate research, and engineering.
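The calculation itself can be sketched by inverting the normal-approximation power formula (real analyses for t-tests add a small correction, and dedicated tools exist, but the logic is this):

```python
from math import ceil
from statistics import NormalDist

def min_n_per_group(effect_size, power=0.80, alpha=0.05):
    """Smallest per-group n for a two-sided, two-group comparison
    (normal approximation with the four design inputs)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # significance threshold
    z_power = z.inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(min_n_per_group(0.5))  # medium effect: 63 per group
print(min_n_per_group(0.2))  # small effect: 393 per group
```

Note how the required n grows with the *square* of the inverse effect size: halving the expected effect roughly quadruples the sample you need.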



7. The Replication Paradox

Low-power studies tend to:

• overestimate effect sizes 
• get dramatic results 
• attract attention

But when repeated with proper sample sizes…

…those dramatic results disappear.

This is one cause of the modern reproducibility crisis.



8. Why Large Studies Are More Trustworthy

Larger samples:

• stabilise averages 
• reduce noise 
• narrow confidence intervals 
• reduce false positives 
• give accurate effect sizes 
• allow subgroup analysis 
• produce replicable results 

This is why science moves toward large-scale collaborations.
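The narrowing of confidence intervals follows directly from the √n law: quadrupling the sample halves the interval. A quick sketch (SD assumed known and equal to 1 purely for illustration):

```python
from math import sqrt

def ci_width(n, sd=1.0, z_crit=1.96):
    """Width of a 95% confidence interval for a mean (known-SD approximation)."""
    return 2 * z_crit * sd / sqrt(n)

print(round(ci_width(25), 3))   # prints 0.784
print(round(ci_width(100), 3))  # 4x the data -> half the width: prints 0.392
```

This is also why precision gains get expensive: going from "roughly right" to "pinned down" can demand orders of magnitude more data.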



9. When Small Studies Are Still Useful

Small-n designs are legitimate when:

• studying rare conditions 
• piloting new methods 
• collecting preliminary data 
• running within-subject designs (each person is their own control) 

But conclusions must be cautious and clearly labelled.



10. The Golden Rule

“If your sample is too small, the question you’re asking cannot be answered.”

Statistical power ensures your study is not just well-designed… 
… but actually capable of discovering truth.



Written by LeeJohnston & Liora — The Lumin Archive Research Division


Messages In This Thread
Statistical Power, Sample Size & Why Small Studies Mislead - by Leejohnston - 11-17-2025, 01:00 PM
