Beyond the Average

Moving from "Does it work?" to "Who does it work for?" with Heterogeneous Treatment Effects.

The Limit of the Average

Most experiments report the Average Treatment Effect (ATE), a single number summarizing a program's impact. But this average can hide the truth: interventions affect different people in different ways. Heterogeneous Treatment Effects (HTE) analysis uncovers this crucial variation.

Average Treatment Effect (ATE)

One effect for everyone.

+5% Effect

Conditional ATE (CATE)

Different effects for different subgroups.

Group A +15%
Group B +2%
Group C -3%
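
To make the link between the two panels concrete: the ATE is the size-weighted average of the subgroup effects. The sketch below reuses the three illustrative effects above. The group sizes (40, 16, and 44 people out of 100) are hypothetical, chosen only so that the weighted average comes out to +5%.

```python
# Illustrative subgroup effects (in percent) from the panel above.
# Group sizes are hypothetical, chosen only so the numbers reconcile.
groups = {
    "A": {"n": 40, "effect": 15.0},
    "B": {"n": 16, "effect": 2.0},
    "C": {"n": 44, "effect": -3.0},
}

def average_treatment_effect(groups):
    """ATE as the size-weighted mean of subgroup effects (CATEs)."""
    total_n = sum(g["n"] for g in groups.values())
    weighted = sum(g["n"] * g["effect"] for g in groups.values())
    return weighted / total_n

ate = average_treatment_effect(groups)
print(f"ATE: {ate:+.1f}%")
for name, g in groups.items():
    print(f"Group {name}: {g['effect']:+.0f}% (n={g['n']})")
```

The same +5% headline is compatible with a large gain for one group and a loss for another, which is exactly what the average conceals.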

Case Study: The Moving to Opportunity Experiment

The Moving to Opportunity (MTO) experiment offered housing vouchers to families living in high-poverty neighborhoods. Initial analyses found no detectable average effect on earnings. A landmark HTE re-analysis told a very different story once the results were broken down by the age of the children when they moved.

Figure: Long-Term Income Gains Varied Dramatically by Age at Move

This discovery transformed policy lessons, showing that the program was highly effective, but only when targeted at families with young children.

The HTE Methodological Toolkit

The methods for finding HTE have evolved from testing pre-planned hypotheses to exploring the data for unexpected patterns of effects.

1. Confirmatory Analysis

Start with a theory. Test specific, pre-registered hypotheses using interaction terms in a regression model. This is the gold standard for testing a theory.

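Y = β₀ + β₁T + β₂S + β₃(T × S) + ε

Here T is the treatment indicator, S is the subgroup indicator, and β₃ measures how much the treatment effect differs for that subgroup.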
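
As a rough illustration of what the interaction coefficient picks up, the sketch below simulates a randomized experiment with a binary treatment T and a binary subgroup S. The "true" coefficients, the sample size, and the noise level are all hypothetical. Because both regressors are binary, the model is saturated, so the OLS coefficients equal differences of cell means; the sketch uses that shortcut instead of a regression library.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" coefficients, used only to simulate data.
B0, B1, B2, B3 = 1.0, 0.5, 0.3, 2.0

rows = []
for _ in range(4000):
    t = rng.randint(0, 1)   # randomized treatment indicator T
    s = rng.randint(0, 1)   # subgroup indicator S
    y = B0 + B1 * t + B2 * s + B3 * t * s + rng.gauss(0, 1)
    rows.append((t, s, y))

def cell_mean(t, s):
    """Mean outcome among units with the given (T, S) combination."""
    return mean(y for (ti, si, y) in rows if ti == t and si == s)

# With binary T and S the model is saturated, so the OLS coefficients
# coincide with these differences of cell means.
b0 = cell_mean(0, 0)
b1 = cell_mean(1, 0) - b0                      # treatment effect when S = 0
b2 = cell_mean(0, 1) - b0                      # subgroup gap among controls
b3 = (cell_mean(1, 1) - cell_mean(0, 1)) - b1  # extra effect when S = 1

print(f"effect for S=0:   {b1:.2f}")
print(f"effect for S=1:   {b1 + b3:.2f}")
print(f"interaction (b3): {b3:.2f}")
```

In a real analysis the estimate of β₃ would be reported with a standard error and tested against the pre-registered hypothesis; this sketch shows only the point estimates.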

2. Exploratory Discovery

Use machine learning to let the data reveal the most important subgroups. Methods like Causal Forests build models to find where the treatment effect differs most.

τ(x) = E[Y(1) − Y(0) | X = x]
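
The core idea of searching for where the effect differs most can be sketched with a single split. This is not a causal forest: actual implementations grow many trees and use separate samples for choosing splits and for estimating effects. The simulated data, the true cutoff at 0.6, and the scoring rule below are all assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)

# Simulated randomized experiment. The true effect is hypothetical:
# 2.0 when the covariate x exceeds 0.6, and 0.0 otherwise.
data = []
for _ in range(4000):
    x = rng.random()
    t = rng.randint(0, 1)
    tau = 2.0 if x > 0.6 else 0.0
    y = tau * t + rng.gauss(0, 1)
    data.append((x, t, y))

def diff_in_means(rows):
    """Estimated treatment effect: treated mean minus control mean."""
    treated = [y for (_, t, y) in rows if t == 1]
    control = [y for (_, t, y) in rows if t == 0]
    return mean(treated) - mean(control)

def split_score(rows, cutoff):
    """Score a split by how far apart the two leaf effects are,
    weighted by leaf sizes so that tiny leaves are not favored."""
    left = [r for r in rows if r[0] <= cutoff]
    right = [r for r in rows if r[0] > cutoff]
    gap = diff_in_means(right) - diff_in_means(left)
    return len(left) * len(right) / len(rows) ** 2 * gap ** 2

# Try a coarse grid of cutoffs and keep the one with the largest score.
candidates = [i / 10 for i in range(1, 10)]
best_cutoff = max(candidates, key=lambda c: split_score(data, c))
print(f"best split: x <= {best_cutoff} vs x > {best_cutoff}")
```

Because the split is chosen by looking at the outcomes, effects estimated on the same data are exploratory and tend to be overstated; they are not confirmatory evidence.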

Perils and Best Practices

With great power comes great responsibility. HTE analysis requires rigor to avoid common pitfalls that can lead to false discoveries.

⚠️ The Peril of "P-Hacking"

Testing many subgroups inflates the chance of finding a "significant" result purely by luck. This practice undermines the credibility of research.

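Testing 10 independent subgroups at a 5% significance level gives roughly a 40% chance of at least one false positive (1 − 0.95¹⁰ ≈ 0.40), even when the treatment has no effect in any subgroup.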
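
The arithmetic behind that figure is short enough to check directly. The sketch below assumes the tests are independent and that no subgroup has a true effect. It also shows the Bonferroni per-test threshold, one simple (and conservative) way to keep the overall false-positive rate at 5%.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """P(at least one false positive) across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise
    error rate at or below alpha."""
    return alpha / num_tests

for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(m), 3), bonferroni_threshold(m))
```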

The Solution: Pre-Analysis Plans (PAPs)

The best defense is to pre-register your analysis plan. A PAP creates a clear, auditable line between planned, confirmatory tests and exploratory findings.

  • Specify hypotheses before analysis.
  • Define primary outcomes and models.
  • Plan for multiple comparisons.
  • Conduct power calculations for interactions.
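
Power calculations for interactions matter because subgroup contrasts are much noisier than the overall effect. The sketch below uses a normal approximation and makes simplifying assumptions: equal-sized arms, two equal-sized subgroups, and a common outcome standard deviation. The planning values (n = 1000, σ = 1, effect = 0.2) are hypothetical.

```python
from statistics import NormalDist

def power_two_sided(effect, std_error, alpha=0.05):
    """Approximate power of a two-sided z-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / std_error
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def se_main_effect(n, sigma):
    """SE of a difference in means with n/2 units per arm."""
    return sigma * (4 / n) ** 0.5

def se_interaction(n, sigma):
    """SE of a difference-in-differences with n/4 units per cell."""
    return sigma * (16 / n) ** 0.5

n, sigma, effect = 1000, 1.0, 0.2   # hypothetical planning values
p_main = power_two_sided(effect, se_main_effect(n, sigma))
p_inter = power_two_sided(effect, se_interaction(n, sigma))
p_inter_4x = power_two_sided(effect, se_interaction(4 * n, sigma))

print(f"power, main effect:                {p_main:.2f}")
print(f"power, interaction:                {p_inter:.2f}")
print(f"power, interaction with 4x sample: {p_inter_4x:.2f}")
```

Under these assumptions the interaction's standard error is twice that of the main effect, so detecting an interaction of the same size with the same power takes about four times the sample.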