The Analyst's Guide to Cause & Effect

Moving beyond "what happened" to "why it happened" with powerful causal inference methods.

The Core Problem

The most common trap is **confounding**, where a hidden "third variable" causes two other variables to move together, creating a spurious correlation.

Example: Weather (Confounder) → Ice Cream Sales; Weather (Confounder) → Crime Rate
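To make the ice cream example concrete, here is a minimal simulation sketch. All numbers (temperature scale, slopes, noise levels) are illustrative assumptions, not real data. Ice cream sales and crime never influence each other in the code, yet they correlate until temperature is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: temperature drives both series; neither affects the other.
temperature = rng.normal(20, 8, n)
ice_cream = 2.0 * temperature + rng.normal(0, 10, n)
crime = 1.5 * temperature + rng.normal(0, 10, n)

# Raw correlation looks like a relationship between ice cream and crime.
raw_corr = np.corrcoef(ice_cream, crime)[0, 1]

def residualize(y, x):
    """Remove the linear effect of x from y using least squares."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Partial correlation: correlate what is left after removing temperature.
partial_corr = np.corrcoef(
    residualize(ice_cream, temperature),
    residualize(crime, temperature),
)[0, 1]

print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

The raw correlation is clearly positive, while the correlation after adjusting for temperature sits near zero. This only works here because the confounder is measured; a hidden one cannot be adjusted away.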
🏆

The Gold Standard: RCTs

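Randomization is the most powerful tool. Random assignment creates two groups that are, on average, identical in both observed and unobserved characteristics, which breaks the link between treatment and any potential confounder. Any difference in outcomes beyond chance variation can then be attributed to the treatment.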

Diagram: Population → random assignment (coin flip) → Treatment Group or Control Group
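The coin-flip logic above can be sketched in a few lines. This is a simulation under assumed values (a true effect of 3.0 and an unobserved baseline trait), meant only to show why a simple difference in means works once assignment is random.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical unobserved trait that strongly affects the outcome.
baseline = rng.normal(50, 10, n)

# Coin flip: assignment is independent of everything about the person.
treated = rng.random(n) < 0.5

true_effect = 3.0  # assumed for the simulation
outcome = baseline + true_effect * treated + rng.normal(0, 5, n)

# Because of randomization, the unobserved trait is balanced on average...
balance_gap = baseline[treated].mean() - baseline[~treated].mean()

# ...so the difference in mean outcomes estimates the treatment effect.
ate_hat = outcome[treated].mean() - outcome[~treated].mean()

print(f"baseline gap between groups: {balance_gap:.2f}")
print(f"estimated effect:            {ate_hat:.2f}")
```

The baseline gap is close to zero even though the analysis never looked at `baseline`, which is the point of randomizing.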

📈

Difference-in-Differences

Compares the change in outcomes over time between a treated group and an untreated group. Relies on the "parallel trends" assumption: without the treatment, both groups' outcomes would have moved in parallel.
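As a rough illustration, the estimate is the treated group's before/after change minus the untreated group's before/after change. The sketch below uses simulated data in which parallel trends holds by construction; the group gap, time trend, and effect size are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

group = rng.integers(0, 2, n)  # 1 = treated group, 0 = untreated group
post = rng.integers(0, 2, n)   # 1 = after the treatment date, 0 = before

true_effect = 2.0  # assumed for the simulation
# Groups differ in level (+4) and share a common time trend (+1.5).
y = 10 + 4 * group + 1.5 * post + true_effect * group * post + rng.normal(0, 1, n)

def cell_mean(g, p):
    """Mean outcome for one group in one period."""
    return y[(group == g) & (post == p)].mean()

# Change over time in each group, then the difference of those changes.
change_treated = cell_mean(1, 1) - cell_mean(1, 0)
change_untreated = cell_mean(0, 1) - cell_mean(0, 0)
did = change_treated - change_untreated

# Naive post-period comparison mixes the effect with the level gap.
naive = cell_mean(1, 1) - cell_mean(0, 1)

print(f"difference-in-differences: {did:.2f}")
print(f"naive post-period gap:     {naive:.2f}")
```

The naive comparison absorbs the pre-existing level difference between the groups, while the difference-in-differences estimate removes it along with the shared time trend.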

Regression Discontinuity

Used when a treatment is assigned by a sharp cutoff on a continuous score. It compares people just above and just below the cutoff, assuming they are otherwise comparable.
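One common way to carry out this comparison is to fit a separate straight line on each side of the cutoff within a narrow window and measure the gap between the two lines at the cutoff. The sketch below assumes a hypothetical score from 0 to 100, a cutoff of 50, and a bandwidth of 10; in practice the bandwidth choice matters and needs its own justification.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

score = rng.uniform(0, 100, n)  # hypothetical running variable
cutoff = 50.0
treated = score >= cutoff       # sharp rule: treated if at or above cutoff

true_jump = 5.0  # assumed for the simulation
y = 20 + 0.3 * score + true_jump * treated + rng.normal(0, 2, n)

bandwidth = 10.0  # assumed window around the cutoff

def fit_at_cutoff(mask):
    """Fit y ~ a + b * (score - cutoff) on the masked rows; return a,
    the fitted outcome exactly at the cutoff."""
    X = np.column_stack([np.ones(mask.sum()), score[mask] - cutoff])
    beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
    return beta[0]

below = (score >= cutoff - bandwidth) & (score < cutoff)
above = (score >= cutoff) & (score <= cutoff + bandwidth)

# The discontinuity in fitted outcomes at the cutoff is the estimate.
rd_estimate = fit_at_cutoff(above) - fit_at_cutoff(below)

print(f"estimated jump at cutoff: {rd_estimate:.2f}")
```

Fitting lines rather than comparing raw means keeps the underlying slope in the score from being mistaken for a treatment effect.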

🛠️

More Tools

Instrumental Variables (IV)

Uses a third variable (the instrument) that shifts treatment choice but affects the outcome only through the treatment and is unrelated to the confounders. This isolates a sliver of "as-if random" assignment.
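With a single instrument, the idea reduces to a ratio: how much the outcome moves with the instrument, divided by how much the treatment moves with the instrument. The sketch below simulates a hidden confounder `u` and a binary instrument `z`; every coefficient is an assumed value, and the instrument is valid here only because the simulation builds it that way.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

u = rng.normal(0, 1, n)                   # unobserved confounder
z = rng.integers(0, 2, n).astype(float)   # instrument, independent of u

# Treatment depends on the instrument AND the confounder.
d = 0.8 * z + u + rng.normal(0, 1, n)

true_effect = 2.0  # assumed for the simulation
# Outcome depends on treatment and confounder, but not on z directly.
y = true_effect * d + 3 * u + rng.normal(0, 1, n)

# Naive regression slope of y on d is biased by the confounder.
ols = np.cov(d, y)[0, 1] / np.var(d, ddof=1)

# IV (Wald) estimator: effect of z on y divided by effect of z on d.
iv = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

print(f"naive slope: {ols:.2f}")
print(f"IV estimate: {iv:.2f}")
```

The naive slope overstates the effect because `u` pushes treatment and outcome in the same direction, while the IV ratio uses only the variation in treatment that comes from `z`.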

Propensity Score Matching (PSM)

Creates a comparable control group by matching treated individuals to untreated individuals who had a similar likelihood (propensity) of being treated.
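A minimal sketch of the matching step follows, assuming a single observed covariate `x` that drives both treatment and outcome. The propensity model is a plain logistic regression fitted with Newton's method, and each treated unit is matched to its nearest control with replacement; both are simplifying choices for illustration, not the only way to do this.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4000

x = rng.normal(0, 1, n)  # observed covariate (hypothetical)
t = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)  # treatment
true_effect = 2.0  # assumed for the simulation
y = 1 + 3 * x + true_effect * t + rng.normal(0, 1, n)

# Step 1: estimate propensity scores with logistic regression (Newton's method).
X = np.column_stack([np.ones(n), x])
w = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ w))
    grad = X.T @ (t - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    w += np.linalg.solve(hess, grad)
propensity = 1 / (1 + np.exp(-X @ w))

# Step 2: for each treated unit, find the control with the closest score.
is_t = t == 1
ps_t = propensity[is_t]
order = np.argsort(propensity[~is_t])
ps_c_sorted = propensity[~is_t][order]
y_c_sorted = y[~is_t][order]

pos = np.searchsorted(ps_c_sorted, ps_t)
left = np.clip(pos - 1, 0, len(ps_c_sorted) - 1)
right = np.clip(pos, 0, len(ps_c_sorted) - 1)
use_left = np.abs(ps_c_sorted[left] - ps_t) <= np.abs(ps_c_sorted[right] - ps_t)
nearest = np.where(use_left, left, right)

# Step 3: average the treated-minus-matched-control outcome differences.
att = (y[is_t] - y_c_sorted[nearest]).mean()
naive = y[is_t].mean() - y[~is_t].mean()

print(f"naive difference: {naive:.2f}")
print(f"matched estimate: {att:.2f}")
```

The matched estimate targets the effect on the treated. It recovers the assumed effect here only because the one covariate that drives treatment is observed and included in the propensity model.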

📜

The Unspoken Rules

All causal claims from non-experimental data rely on strong assumptions that are largely untestable. These must be justified with domain knowledge.

🤝

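SUTVA (Stable Unit Treatment Value Assumption)

No interference between units (one unit's treatment does not change another unit's outcome) and no hidden versions of the treatment.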

🔍

Unconfoundedness

All variables that affect both treatment and outcome have been measured and controlled for.

Positivity

For every type of person (every combination of measured characteristics), there is a nonzero chance of being in either the treatment or control group.
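Unlike the other two rules, positivity leaves visible traces in the data, so a quick check is possible. The sketch below assumes a hypothetical covariate `age_band` with four strata, one of which is never treated by construction, and flags any stratum that contains only treated or only control units.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3000

age_band = rng.integers(0, 4, n)  # hypothetical strata 0..3

# Assumed treatment probabilities per stratum; band 3 is never treated.
p_treat = np.array([0.5, 0.3, 0.1, 0.0])[age_band]
treated = rng.random(n) < p_treat

# Share of treated units within each stratum.
share = np.array([treated[age_band == b].mean() for b in range(4)])

# A stratum with share 0 or 1 has no one to compare against.
violations = [b for b in range(4) if share[b] == 0.0 or share[b] == 1.0]

for b in range(4):
    print(f"band {b}: share treated = {share[b]:.2f}")
print("strata without overlap:", violations)
```

A flagged stratum means no causal comparison is possible for that type of person without extrapolating from other strata.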