A visual guide to the foundations of causal inference: moving from simply seeing a pattern to establishing a cause.
For any person, we can observe only one reality: the outcome under the treatment they actually received. The alternative, the **counterfactual**, is forever hidden.
Person A: the outcome under the condition actually received is observed; the outcome under the other condition is unobserved.
Since we can't see individual effects, we estimate the **Average Treatment Effect (ATE)** across a population.
ATE = E[Y(1)] - E[Y(0)]
Average Outcome (Treated) - Average Outcome (Control)
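The idea can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a real study: the function name `simulate_ate`, the baseline distribution, and the constant effect of 2.0 are all hypothetical. It builds both potential outcomes for every person (something only a simulation can do), reveals just one of them per person, and compares the true ATE with the difference in group means.

```python
import random
from statistics import mean

def simulate_ate(n=10_000, true_effect=2.0, seed=0):
    """Return (true ATE, difference-in-means estimate) for simulated data."""
    rng = random.Random(seed)
    # Hypothetical potential outcomes: Y(0) is a noisy baseline, Y(1) adds a constant effect.
    y0 = [rng.gauss(10.0, 1.0) for _ in range(n)]
    y1 = [y + true_effect for y in y0]
    # Coin-flip assignment; each person reveals only one potential outcome.
    treated = [rng.random() < 0.5 for _ in range(n)]
    observed = [a if t else b for a, b, t in zip(y1, y0, treated)]
    # The true ATE needs both columns, which is impossible outside a simulation.
    true_ate = mean(y1) - mean(y0)
    # The estimate uses observed data only: mean(treated) - mean(control).
    treated_mean = mean(o for o, t in zip(observed, treated) if t)
    control_mean = mean(o for o, t in zip(observed, treated) if not t)
    return true_ate, treated_mean - control_mean

true_ate, estimate = simulate_ate()
print(f"true ATE: {true_ate:.3f}, estimated ATE: {estimate:.3f}")
```

Because assignment here is random, the estimate should land close to the true value even though no individual effect is ever observed.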
Randomly select a representative sample from the population.
(External Validity)
Randomly split the sample into two groups that are identical on average.
(Internal Validity)
This ensures that, in expectation, the only systematic difference between the groups is the treatment itself.
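The two steps above can be sketched in a few lines. The population of ages, the sample size, and the helper name `sample_and_assign` are assumptions made for illustration. The sketch draws a random sample, shuffles it into two equal groups, and checks that a pre-treatment trait is balanced across them.

```python
import random
from statistics import mean

def sample_and_assign(population, sample_size, seed=0):
    """Randomly sample from the population, then randomly split into two groups."""
    rng = random.Random(seed)
    # Step 1: random sampling (supports external validity).
    sample = rng.sample(population, sample_size)
    # Step 2: random assignment (supports internal validity).
    rng.shuffle(sample)
    half = sample_size // 2
    return sample[:half], sample[half:]

# Hypothetical population: ages drawn from a normal distribution.
pop_rng = random.Random(1)
population = [pop_rng.gauss(40.0, 12.0) for _ in range(100_000)]

treatment, control = sample_and_assign(population, 2_000)
print(f"population mean age: {mean(population):.2f}")
print(f"treatment mean age:  {mean(treatment):.2f}")
print(f"control mean age:    {mean(control):.2f}")
```

The group means will not match exactly in any single draw; randomization guarantees balance on average, and the gap shrinks as the sample grows.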
Without randomization, groups often form based on pre-existing traits, leading to biased results.
🔵 = Motivated, ⚪️ = Not Motivated
Motivated people are more likely to self-select into the treatment.
The resulting groups are not comparable.
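A short simulation shows how self-selection distorts a naive comparison. All numbers here are assumptions chosen for illustration: motivated people start 3.0 points higher, opt into treatment with probability 0.8 (versus 0.2 for everyone else), and the true effect is 2.0. The function name `naive_estimate` is likewise hypothetical.

```python
import random
from statistics import mean

def naive_estimate(n=20_000, true_effect=2.0, seed=0):
    """Difference in observed group means when people choose their own group."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        motivated = rng.random() < 0.5
        # Motivation raises the outcome regardless of treatment (a confounder).
        baseline = rng.gauss(10.0, 1.0) + (3.0 if motivated else 0.0)
        # Self-selection: motivated people opt in far more often.
        p_treat = 0.8 if motivated else 0.2
        if rng.random() < p_treat:
            treated_outcomes.append(baseline + true_effect)
        else:
            control_outcomes.append(baseline)
    return mean(treated_outcomes) - mean(control_outcomes)

print(f"true effect: 2.0, naive estimate: {naive_estimate():.2f}")
```

With these assumed numbers the expected bias is 3.0 × (0.8 − 0.2) = 1.8, so the naive estimate should land near 3.8 rather than 2.0. The extra 1.8 is the motivation gap between the groups, not an effect of the treatment.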
SUTVA: no interference between units, and the treatment is the same for everyone who receives it.
Ignorability: treatment assignment is independent of potential outcomes (true by design in an RCT).
Positivity: every subgroup has a non-zero chance of both receiving and not receiving the treatment.
Exclusion restriction (for IVs): an instrument affects the outcome ONLY through the treatment.
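The last assumption, usually called the exclusion restriction, is what makes the simple Wald ratio work as an instrumental-variable estimator. The sketch below is a toy data-generating process, not a general IV implementation: the binary instrument `z`, the unobserved confounder `u`, the probabilities, and the function name `wald_vs_naive` are all assumptions. The instrument changes the chance of treatment but never enters the outcome directly.

```python
import random
from statistics import mean

def wald_vs_naive(n=40_000, true_effect=2.0, seed=0):
    """Return (naive difference in means, Wald IV estimate) for simulated data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5            # instrument, assigned at random
        u = rng.random() < 0.5            # unobserved confounder
        p_treat = 0.1 + 0.5 * z + 0.3 * u # both z and u push people into treatment
        d = rng.random() < p_treat        # treatment actually received
        # Exclusion restriction: z does not appear in the outcome equation.
        y = true_effect * d + 3.0 * u + rng.gauss(0.0, 1.0)
        rows.append((z, d, y))

    def avg(index, keep):
        return mean(float(r[index]) for r in rows if keep(r))

    # Naive comparison by treatment received (confounded by u).
    naive = avg(2, lambda r: r[1]) - avg(2, lambda r: not r[1])
    # Wald estimator: effect of z on the outcome divided by effect of z on treatment.
    reduced_form = avg(2, lambda r: r[0]) - avg(2, lambda r: not r[0])
    first_stage = avg(1, lambda r: r[0]) - avg(1, lambda r: not r[0])
    return naive, reduced_form / first_stage

naive, wald = wald_vs_naive()
print(f"naive: {naive:.2f}, Wald IV: {wald:.2f}, true effect: 2.0")
```

In this setup the naive comparison is pushed upward by the confounder, while the Wald ratio should sit near the true effect. If `z` were added to the outcome equation, violating the exclusion restriction, the ratio would no longer recover it.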