Cohen's d

Methodology Published Apr 22, 2026

Cohen’s d tells you how far apart two group averages are relative to the natural spread of the data, not just whether a difference technically exists.

Also known as

Cohen d · Cohen's d effect size · standardized mean difference · SMD · effect size d

Why this matters

A study can find a “statistically significant” result that is too small to matter in practice. Cohen’s d helps you see whether a difference is tiny, noticeable, or dramatic when you compare supplements, training plans, or any two groups in a study.

Deep dive

How it works

Cohen’s d is usually computed as the difference between two means divided by a standard deviation, often the pooled within-group standard deviation. That standardization is what makes it unitless. In small samples, Cohen’s d is slightly upward-biased, which is why meta-analyses often prefer Hedges’ g, a closely related corrected version.
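As a minimal sketch of that calculation, the snippet below computes Cohen’s d with a pooled standard deviation and then applies the commonly used approximate small-sample correction to get Hedges’ g. The function names and the strength-gain numbers are hypothetical, made up purely for illustration, and the correction factor shown is the usual approximation rather than the exact form.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled within-group standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def hedges_g(group_a, group_b):
    """Approximate small-sample correction: g = d * (1 - 3 / (4 * df - 1))."""
    df = len(group_a) + len(group_b) - 2
    return cohens_d(group_a, group_b) * (1 - 3 / (4 * df - 1))


# Hypothetical strength gains in kg (illustrative values, not real data).
supplement = [6.0, 8.0, 7.0, 9.0, 5.0]
placebo = [5.0, 6.0, 4.0, 7.0, 3.0]

d = cohens_d(supplement, placebo)
g = hedges_g(supplement, placebo)
print(f"d = {d:.2f}, g = {g:.2f}")
```

With only five people per group, g comes out noticeably smaller than d; as samples grow, the correction factor approaches 1 and the two values converge.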

When you'll see this

The term in the wild

Scenario

You read a creatine monohydrate study where the supplement group gains more strength than placebo, and the paper reports Cohen’s d = 0.62.

What to notice

That does not mean a 62% improvement. It means the two groups are separated by a little over half of their usual person-to-person variation—a moderate effect by the common rule of thumb.

Why it matters

This keeps you from confusing standardized effect size with percent change, a very common reading mistake.
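To make the distinction concrete, here is a small sketch that converts the same d = 0.62 back into raw units under two hypothetical spreads. The pooled standard deviations and the placebo-group gain are invented for illustration; the point is only that one standardized value can correspond to very different percent changes.

```python
def raw_difference(d, pooled_sd):
    """Convert a standardized difference back into the outcome's own units."""
    return d * pooled_sd


d = 0.62
# Hypothetical scenarios: (label, pooled SD in kg, placebo-group mean gain in kg)
scenarios = [
    ("narrow spread", 4.0, 10.0),
    ("wide spread", 12.0, 10.0),
]

for label, pooled_sd, placebo_gain in scenarios:
    diff = raw_difference(d, pooled_sd)
    pct = 100 * diff / placebo_gain
    print(f"{label}: extra gain = {diff:.2f} kg ({pct:.0f}% more than placebo)")
```

Neither percentage is 62%, and they differ from each other even though d is identical in both scenarios.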

Scenario

A sleep supplement trial reports p < 0.05, but Cohen’s d = 0.12 for sleep duration.

What to notice

The study found a detectable difference, but the size of that difference is tiny relative to normal variation in sleep length.

Why it matters

You may decide the result is real but not worth changing your routine for.
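A rough sketch of why a tiny effect can still reach p < 0.05: for two equal-sized groups, the test statistic is approximately d times the square root of half the per-group sample size, so it grows with sample size even when d stays fixed. The function below is a hypothetical helper that uses a normal approximation to the t-test, so the p-values are indicative only, and the sample sizes are arbitrary.

```python
import math


def approx_two_sided_p(d, n_per_group):
    """Rough two-sided p-value for two equal-sized groups.

    Uses a normal approximation to the t statistic, so treat the
    result as a sketch rather than an exact test.
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal.
    return math.erfc(z / math.sqrt(2))


d = 0.12
for n in (50, 600, 5000):
    print(f"n per group = {n}: p ~ {approx_two_sided_p(d, n):.4f}")
```

The effect size is the same in every row; only the sample size changes, and with it whether the result crosses the conventional 0.05 line.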

Scenario

A meta-analysis lists standardized mean differences instead of raw units because the included studies measured the same idea in different ways.

What to notice

Researchers use a standardized effect so results from different scales can be compared or combined more fairly.

Why it matters

You can understand why one pooled result is reported even when individual papers used different questionnaires or performance tests.
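As an illustration, the sketch below standardizes two hypothetical studies that measure the same outcome on different scales. All means, standard deviations, and scale descriptions are invented; the helper name is an assumption, not a function from any particular meta-analysis package.

```python
def standardized_mean_difference(mean_treated, mean_control, pooled_sd):
    """Mean difference expressed in pooled-standard-deviation units."""
    return (mean_treated - mean_control) / pooled_sd


# Hypothetical study A: a 0-100 questionnaire score.
study_a = standardized_mean_difference(62.0, 57.0, 10.0)
# Hypothetical study B: a 1-7 rating scale.
study_b = standardized_mean_difference(4.9, 4.3, 1.2)

print(f"study A: d = {study_a:.2f}")
print(f"study B: d = {study_b:.2f}")
```

The raw differences (5 points versus 0.6 points) cannot be compared directly, but both correspond to about half a standard deviation once standardized.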

Key takeaways

  • Cohen’s d measures the size of a difference between two group averages relative to how spread out the data are.
  • A p-value answers “is there evidence of a difference?”; Cohen’s d answers “how big is the difference?”
  • The familiar 0.2 / 0.5 / 0.8 cutoffs are rough conventions, not universal rules.
  • A d above 1 means the group means are more than one standard deviation apart, which is usually read as a large difference.
  • For practical reading, effect size is often more useful than significance alone when comparing interventions.

The full picture

When the p-value says “yes” but your eyes should say “how much?”

A common research-paper trap happens right after the words statistically significant. Readers are trained to stop there, as if significance settles the story. It does not. A huge study can make a trivial difference look impressive, while a small study can miss a meaningful one. That gap is exactly why Cohen’s d effect size became so useful: it answers the question the p-value does not—how big is the difference?

Picture two choirs singing the same note. If their voices overlap almost completely, the groups are basically similar. If one choir’s sound sits clearly higher than the other, the difference is obvious even before you measure it. Cohen’s d turns that overlap into a number. Formally, it is the difference between two group means divided by their typical spread, usually a pooled standard deviation.
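One common way to write that definition, assuming the pooled-standard-deviation version, is shown below. The symbols are notation chosen here for illustration: the x-bars are the two group means, s_1 and s_2 the group standard deviations, n_1 and n_2 the group sizes, and s_p the pooled standard deviation.

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```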

The key point is that Cohen’s d does not measure raw units like pounds lifted, milliseconds, or blood levels. It rescales the difference into standard-deviation units so you can judge magnitude across very different outcomes. A d of 0.5 means the two group means are separated by about half of one standard deviation. In plain English, the average person in one group is moderately shifted away from the average person in the other.

This is why Cohen’s d is more portable than a raw mean difference. Ten milliseconds might be decisive in a sprint and irrelevant for sleep duration. A standardized gap lets you compare how separated the groups are relative to natural variation.

People often learn the rough guide: 0.2 small, 0.5 medium, 0.8 large. Those ranges are useful training wheels, not laws of nature. In a noisy field, 0.3 may matter. In another context, even 0.8 might not change a real decision. Still, the rule helps answer common questions: a Cohen’s d of 0.5 is usually described as a medium effect, and 1.2 is typically considered large—very large, in many practical settings. If Cohen’s d is greater than 1, the group means are more than one full typical spread apart, which usually signals a strong separation with less overlap.
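The link between d and overlap can be sketched numerically. Assuming both groups are roughly normal with equal spread (an idealization that real data may not meet), the shared area under the two curves is 2Φ(−|d|/2). The labeling function below is one possible way to bin the conventional benchmarks; both function names are hypothetical.

```python
from statistics import NormalDist


def overlap_coefficient(d):
    """Shared area of two equal-spread normal curves whose means are d SDs apart."""
    return 2 * NormalDist().cdf(-abs(d) / 2)


def conventional_label(d):
    """Cohen's rough benchmarks; a convention, not a field-specific judgment."""
    size = abs(d)
    if size < 0.2:
        return "below small"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


for d in (0.2, 0.5, 0.8, 1.2):
    label = conventional_label(d)
    overlap = overlap_coefficient(d)
    print(f"d = {d}: {label}, overlap ~ {overlap:.0%}")
```

Even at d = 1.2 the two distributions still share roughly half their area under these assumptions, which is a useful reminder that a "large" effect does not mean everyone in one group outperforms everyone in the other.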

You will see this idea appear in papers as Cohen’s d, standardized mean difference, or in meta-analyses as closely related versions such as Hedges’ g. If you are using a Cohen’s d calculator or running Cohen’s d in SPSS, the software is doing the same core move: mean difference divided by variability.

The one decision it helps you make

When two studies both say a supplement “worked,” do not first ask which p-value is smaller. Ask which study shows the larger, more believable effect size in a population like yours. That one move shifts you from Is there a difference? to Is the difference big enough to care about?

Cohen’s d will not tell you everything. It does not prove quality, remove bias, or replace judgment about outcomes. But it keeps you from mistaking a barely detectable nudge for a meaningful shift—and that is one of the most common reading errors in statistics.

Myths vs reality

What people get wrong

Myth

A statistically significant result automatically means a large effect.

Reality

Significance reflects how unlikely the observed difference would be if there were no true difference; Cohen’s d is about how far apart the groups actually are. A tiny effect can be significant in a large sample.

Why people believe this

Intro statistics teaching often centers hypothesis testing first, so readers are trained to treat the p-value like the whole verdict.


Myth

A Cohen’s d of 0.5 means the treatment worked by 50%.

Reality

Cohen’s d is not a percentage. It is a standardized distance measured in units of typical spread.

Why people believe this

The number looks like a proportion, and effect-size dashboards or calculator outputs often display it without a plain-language explanation.


Myth

0.2, 0.5, and 0.8 are fixed truth labels for small, medium, and large in every field.

Reality

Those are rough benchmarks, not natural laws. Whether an effect matters depends on the outcome, the stakes, and the typical variability in that area.

Why people believe this

Jacob Cohen introduced these as conventional benchmarks for use when better field-specific guidance was absent, but they are often repeated as if they were universal cutoffs.

How to use this knowledge

If you are comparing studies in athletes or experienced lifters, be careful when carrying effect sizes over from one population to another. A “medium” Cohen’s d in untrained adults may not translate into a meaningful edge for trained people, because the baseline variability and practical stakes are different.

Frequently asked

Common questions

What information does a Cohen’s d effect size give you?

It tells you how far apart two group averages are after scaling that gap by the groups’ usual spread. In practice, it helps you judge whether a difference is tiny, moderate, or large rather than merely statistically detectable.

How should a Cohen’s d of 0.5 be interpreted?

By the usual convention, 0.5 is a medium effect size. It means the group averages are separated by about half of one typical spread of the data, so the groups overlap less than they would with a trivial effect.

What does it mean when Cohen’s d exceeds 1?

That means the two group means are more than one standard deviation apart, which usually indicates a strong difference with less overlap between groups. It does not guarantee practical importance, but it is commonly read as a large effect.

Would 1.2 be considered a large effect size?

Yes—under the common interpretation, 1.2 is large, and often very large in practical terms. Still, the real meaning depends on the outcome and context, not the number alone.

When should I use Cohen’s d instead of raw mean difference?

Use Cohen’s d when you want a unitless measure of group separation, especially when comparing results across studies or outcomes that use different scales. Raw mean difference is often better when the original units themselves are easy to understand and decision-relevant.
