The Goal
We want to understand a whole **population**, but we can only study a small **sample**. Statistical inference is the science of using that sample data to make educated guesses about the population.
Population
(Everyone)
Sample
(A Small Group)
The Bridge: Central Limit Theorem
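If we take many random samples and plot their means, those means form a predictable, approximately bell-shaped curve called a **sampling distribution**. This holds even when the population itself is not normally distributed, provided each sample is reasonably large. This allows us to use the properties of the normal distribution to make inferences.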
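A small simulation can make this concrete. The sketch below is illustrative only: the exponential population, the sample size of 50, and the function name `simulate_sample_means` are all assumptions chosen for the example, not part of the original material. It draws repeated samples from a clearly skewed population and checks that the sample means cluster around the population mean with a spread close to σ/√n.

```python
import random
import statistics

def simulate_sample_means(n_samples, sample_size, seed=0):
    """Draw repeated samples from a skewed population and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_samples):
        # Assumed population: exponential with mean 1.0 and standard deviation 1.0.
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(n_samples=2000, sample_size=50)
center = statistics.fmean(means)  # expected to sit near the population mean, 1.0
spread = statistics.stdev(means)  # expected to sit near 1.0 / sqrt(50), about 0.141
print(f"mean of sample means: {center:.3f}")
print(f"std. dev. of sample means: {spread:.3f}")
```

Plotting a histogram of `means` would show the bell shape, even though the individual exponential draws are strongly right-skewed.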
The Framework: Hypothesis Testing
This is the formal process for testing a claim.
1️⃣
State Hypotheses
Define the Null (H₀, no effect) and Alternative (Hₐ, an effect) hypotheses.
2️⃣
Set the Standard
Choose a significance level (α), usually 5% (0.05).
3️⃣
Analyze Data
Calculate a test statistic from your sample data.
4️⃣
Make a Decision
Compare your result (p-value) to your standard (α).
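The four steps can be sketched as a two-sided one-sample z-test. Everything numeric here is a hypothetical illustration: a null mean of 100, a sample mean of 105, a population standard deviation of 15 that is assumed to be known, and a sample size of 24. In practice the standard deviation is usually estimated from the sample, and a t-test would be the more common choice.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: population mean = 100; Ha: population mean != 100.
null_mean = 100.0

# Step 2: significance level.
alpha = 0.05

# Step 3: test statistic from the (hypothetical) sample summary.
sample_mean = 105.0
population_sd = 15.0  # assumed known, which is what makes this a z-test
n = 24
standard_error = population_sd / sqrt(n)
z = (sample_mean - null_mean) / standard_error

# Step 4: two-sided p-value, then compare it to alpha.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```

With these assumed numbers the p-value comes out at roughly 0.10, which is above α = 0.05, so the sketch fails to reject the null hypothesis.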
The Verdict
The **p-value** is the probability of seeing data at least as extreme as yours if the null hypothesis is true. We compare it to alpha (α) to make a decision.
IF p-value ≤ α
💥
Reject the Null Hypothesis
(The result is statistically significant)
IF p-value > α
🤷
Fail to Reject the Null
(The result is not statistically significant)
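One way to build intuition for the p-value is to estimate it by brute force: simulate many samples in a world where the null hypothesis is true, and count how often the result is at least as far from the null value as the one observed. The sketch below does this under assumed conditions (normally distributed data, null mean 100, standard deviation 15, sample size 24, observed mean 105). The function name `simulated_p_value` is hypothetical.

```python
import random
import statistics

def simulated_p_value(observed_mean, null_mean, sd, n, n_sims=10000, seed=1):
    """Estimate a two-sided p-value by simulating sample means under H0."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed_gap = abs(observed_mean - null_mean)
    extreme = 0
    for _ in range(n_sims):
        # Draw one sample from the world in which H0 is true.
        sample = [rng.gauss(null_mean, sd) for _ in range(n)]
        # Count it if its mean is at least as far from the null as what we observed.
        if abs(statistics.fmean(sample) - null_mean) >= observed_gap:
            extreme += 1
    return extreme / n_sims

alpha = 0.05
p_value = simulated_p_value(observed_mean=105.0, null_mean=100.0, sd=15.0, n=24)
verdict = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"simulated p-value = {p_value:.3f}, verdict: {verdict}")
```

The simulated value should land near 0.10 for these inputs, so the verdict is "fail to reject H0": a gap of 5 points is not unusual enough under the null to count as statistically significant at α = 0.05.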
The Risks: Errors
Since we're dealing with probability, we can make two kinds of mistakes.
| Our Decision | Actual Reality: H₀ is True | Actual Reality: H₀ is False |
|---|---|---|
| Reject H₀ | Type I Error (False Positive) | Correct! (True Positive) |
| Fail to Reject H₀ | Correct! (True Negative) | Type II Error (False Negative) |
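Both error rates can be estimated by simulation. The sketch below is a rough illustration under assumed settings (z-test with known standard deviation 15, sample size 24, null mean 100, and a hypothetical true mean of 105 for the "H₀ is false" case). The function name `rejection_rate` is made up for this example.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mean, null_mean=100.0, sd=15.0, n=24,
                   alpha=0.05, n_experiments=4000, seed=2):
    """Fraction of simulated experiments in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd / sqrt(n)
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        z = (statistics.fmean(sample) - null_mean) / standard_error
        if abs(z) > critical_z:
            rejections += 1
    return rejections / n_experiments

# H0 is true: every rejection is a false positive (Type I error).
type_1_rate = rejection_rate(true_mean=100.0)
# H0 is false: every failure to reject is a false negative (Type II error).
type_2_rate = 1 - rejection_rate(true_mean=105.0)
print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

The Type I rate should come out close to α, since α is by construction the false-positive rate when H₀ is true. The Type II rate depends on how far the truth is from the null and on the sample size; with these assumed numbers it is well above one half, which shows that a small sample can easily miss a real effect.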
The Uncertainty
A **confidence interval** gives a range of plausible values for the true population parameter, quantifying the uncertainty around our sample estimate.
Sample Mean: 105
95% Confidence Interval: [99, 111]
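We are 95% confident the true population mean lies between 99 and 111. In other words, if we repeated the sampling many times, about 95% of the intervals built this way would contain the true mean.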
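As a rough sketch of where such an interval can come from, the code below computes a z-based interval as sample mean ± z* · σ/√n. The source gives only the sample mean and the interval, so the population standard deviation of 15 and the sample size of 24 are assumptions picked to reproduce those numbers. The function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Two-sided z interval for a mean, assuming the population sd is known."""
    # Critical value: about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * population_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# population_sd=15 and n=24 are assumed values, not taken from the source.
low, high = z_confidence_interval(sample_mean=105.0, population_sd=15.0, n=24)
print(f"95% CI: [{low:.1f}, {high:.1f}]")
```

With these assumed inputs the interval rounds to [99, 111]. A larger sample or a smaller standard deviation would narrow it, and a higher confidence level would widen it.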