Heterogeneity (I²)

Methodology Published Apr 29, 2026

I² is the percent of study-to-study disagreement in a meta-analysis that likely reflects real differences, not just random noise.

Also known as

I-squared · I2 statistic · I² statistic · heterogeneity I² · inconsistency statistic

Why this matters

A pooled result can look precise while the underlying studies are pulling in different directions. If you miss a high I² value, you can treat one average number as universal when it may only be a blurry compromise across different populations, doses, or study designs.

Deep dive

How it works

I² is derived from Cochran's Q, which compares the observed spread of study estimates with the spread expected from within-study sampling error alone. Because Q depends on the number of studies and how precise they are, I² can be unstable when a meta-analysis includes only a few studies. That is one reason expert guidance recommends interpreting it alongside the forest plot, confidence intervals, and the clinical similarity of the studies, not in isolation.
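As a rough illustration of where Q comes from, the sketch below computes it with inverse-variance weights, which is the standard textbook form. The function name `cochran_q` and the three-study dataset are hypothetical and exist only for this example; real software may differ in detail.

```python
def cochran_q(effects, std_errors):
    """Return (Q, df) from study effect estimates and their standard errors."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    # Weighted (fixed-effect) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared distance of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical data: three equally precise studies with different estimates.
effects = [0.10, 0.30, 0.50]
std_errors = [0.10, 0.10, 0.10]
q, df = cochran_q(effects, std_errors)
print(f"Q = {q:.2f}, df = {df}")  # Q is about 8 with df = 2
```

With only three studies, removing or adding a single trial can change Q noticeably, which is the small-sample instability described above.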

When you'll see this

The term in the wild

Scenario

You open a meta-analysis on omega-3 supplements and C-reactive protein, and the forest plot reports I² = 78%.

What to notice

That does not mean the result is wrong. It means the trials differ enough that the pooled average may be blending unlike situations—different doses, different baseline inflammation, or different study lengths.

Why it matters

Instead of quoting one average effect as universal, you know to check subgroup analyses before deciding how relevant it is to your use case.

Scenario

A clinician journal club reviews a vitamin D meta-analysis with I² = 12%.

What to notice

An I² this low suggests the studies are giving fairly similar answers relative to random variation. The pooled estimate is more likely to describe a common pattern across trials.

Why it matters

The group can spend less time worrying about inconsistency and more time discussing magnitude, certainty, and applicability.

Scenario

You use RevMan or another meta-analysis program and see Q, df, and I² reported together.

What to notice

This is what an I² calculator does in practice. The software uses the Q statistic and degrees of freedom to estimate how much of the spread is likely real between-study inconsistency.

Why it matters

You do not need to compute it manually to interpret it correctly.

Key takeaways

  • I² measures inconsistency across studies, not effect size.
  • A low I² means the pooled estimate is describing fairly similar studies; a high I² means the average may hide important differences.
  • Common rough bands come from the Cochrane Handbook, but they are not hard pass-fail cutoffs.
  • The usual formula is I² = 100% × (Q − df) / Q, with negative results set to 0%.
  • High heterogeneity should trigger a search for why studies differ, not an automatic dismissal of the meta-analysis.

The full picture

When the diamond looks tidy but the studies do not

One of the strangest moments in a meta-analysis is seeing a clean pooled estimate at the bottom of a forest plot while the individual studies above it are scattered all over the place. That is the trap Heterogeneity (I²) was built to expose. The bottom-line diamond can look calm even when the studies are not really telling the same story.

It is not “how big the effect is”

Here is the surprise: I² does not measure whether a treatment works a lot or a little. It measures how much the studies disagree with each other beyond what you would expect from chance alone.

Picture a choir trying to hold one note. If the voices cluster tightly, the average note represents the group well. If the voices drift apart, the average note still exists, but it stops sounding like any real singer. That is what high heterogeneity means in a meta-analysis: the pooled average may be mathematically correct while being a poor summary of what happened across studies.

So what does I² represent? In plain language, it is the share of the total variation across study results that likely reflects real between-study differences rather than random sampling error.

What the numbers usually mean

I² is reported from 0% to 100%. A value near 0% means little observed inconsistency. Higher values mean more disagreement. The Cochrane Handbook gives rough, context-dependent guideposts: 0%-40% may not be important, 30%-60% may represent moderate heterogeneity, 50%-90% may represent substantial heterogeneity, and 75%-100% may represent considerable heterogeneity. These bands overlap on purpose because interpretation depends on the size and direction of effects, the populations studied, and how precise the studies are.
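Because the bands overlap, one I² value can fall into two of them at once. The sketch below makes that concrete; the name `matching_bands` and the shortened labels are illustrative assumptions, not part of the Handbook or any software.

```python
# Rough guideposts as listed above: (low %, high %, label).
COCHRANE_BANDS = [
    (0, 40, "may not be important"),
    (30, 60, "may represent moderate heterogeneity"),
    (50, 90, "may represent substantial heterogeneity"),
    (75, 100, "may represent considerable heterogeneity"),
]


def matching_bands(i2_percent):
    """Return every guidepost label whose range contains the given I² value."""
    if not 0 <= i2_percent <= 100:
        raise ValueError("I² must be between 0 and 100")
    return [label for low, high, label in COCHRANE_BANDS
            if low <= i2_percent <= high]


print(matching_bands(78))  # falls in both the substantial and considerable bands
print(matching_bands(12))  # falls only in the lowest band
```

Returning a list instead of a single label is deliberate: it keeps the ambiguity visible and avoids treating the bands as hard cutoffs.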

That is why there is no universal answer to “what is a good I² value?” Lower is easier to summarize, but “good” depends on context. In nutrition and supplement research, where trials often use different doses, durations, and participant groups, some heterogeneity is normal.

How is I² calculated?

The usual I² formula is based on Cochran’s Q: I² = 100% × (Q − df) / Q, where df means degrees of freedom, usually the number of studies minus one. If the formula produces a negative number, which happens when Q is smaller than df, it is set to 0% in practice. You do not usually calculate it by hand unless you are doing the meta-analysis yourself; most software reports it automatically.
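For readers who want to check a reported value, here is a minimal sketch of that formula, including the floor at 0%. The function name `i_squared` and the example inputs are hypothetical.

```python
def i_squared(q, df):
    """I² as a percentage, from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        # No observed spread at all, so nothing to attribute to heterogeneity.
        return 0.0
    # Negative raw values (Q smaller than df) are set to 0%.
    return max(0.0, 100.0 * (q - df) / q)


# Six studies (df = 5) with Q = 20: 100 * (20 - 5) / 20
print(i_squared(20.0, 5))  # 75.0

# Q below df: the raw formula would be negative, so the result is floored.
print(i_squared(3.0, 5))   # 0.0
```

The first call matches the kind of value where the pooled estimate deserves a closer look; the second shows why reported I² values never drop below 0%.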

What high heterogeneity should make you do next

If a high I² means “the studies are not lining up neatly,” then the next move is not to throw the meta-analysis away. It is to ask why. Different doses, different baseline health status, different outcome definitions, short vs long trials, or higher risk of bias can all widen the spread.

One useful decision today: if you are reading a supplement meta-analysis and see I² above about 75%, do not stop at the pooled estimate. Look immediately for subgroup analyses or sensitivity analyses. If the paper cannot explain the spread, treat the average effect as a rough sketch, not a universal rule.

Myths vs reality

What people get wrong

Myth

A high I² means the treatment does not work.

Reality

No. It means the studies disagree. A treatment can show benefit overall and still have high heterogeneity if the effect changes across doses, populations, or trial designs.

Why people believe this

Readers often collapse two separate questions into one: 'Is there an effect?' and 'Are the studies consistent?' I² answers the second, not the first.


Myth

There is a single good or bad I² cutoff.

Reality

I² is more like a weather report than a pass-fail stamp. The same number can matter differently depending on effect size, precision, and whether the studies are clinically similar.

Why people believe this

The Cochrane Handbook's rough bands are widely taught and useful, but they were never meant to be rigid thresholds for every topic.


Myth

I² tells you how much patients vary inside a study.

Reality

It is about how study results vary between studies in a meta-analysis. It does not describe how different the individual participants were within each trial.

Why people believe this

The word 'heterogeneity' sounds broad, so people import its everyday meaning instead of its meta-analysis meaning.

How to use this knowledge

A common failure mode is responding to a high I² by repeating the pooled number more confidently, as if averaging harder would resolve the disagreement. Do the opposite: when heterogeneity is substantial, anchor your decision to the subgroup closest to your situation rather than the grand average.

Frequently asked

Common questions

What does high heterogeneity in a meta-analysis tell you?

It means the studies are giving meaningfully different answers beyond what random chance would usually explain. The pooled estimate may still be useful, but you should look for reasons the studies differ before treating the average as universal.

How is the I² statistic calculated?

The standard formula is I² = 100% × (Q − df) / Q, where Q is Cochran's heterogeneity statistic and df is usually the number of studies minus one. In practice, meta-analysis software calculates it for you.

Can I² be negative?

The raw formula produces a negative value whenever Q is smaller than df, but the result is conventionally set to 0%. A negative I² has no meaningful interpretation.

Does a low I² prove the studies are identical?

No. It only suggests little observed statistical inconsistency. Studies can still differ in clinically important ways that I² does not fully capture.

Why is I² common in supplement research papers?

Supplement trials often vary in dose, formulation, study length, baseline nutrient status, and outcome definitions. I² gives readers a quick sense of how much that variation is showing up in the pooled results.
