The Normal (Gaussian) distribution appears everywhere: the heights of people, the errors in scientific measurements, even the scores on standardized tests are all approximately normal. It is arguably the most important continuous distribution in statistics.
Parameters and Properties
- $\mu$ (Mean): The center of the distribution (determines location).
- $\sigma$ (Standard Deviation): The spread of the distribution (determines width).
- Variance ($\mathrm{Var}(X)$): $\sigma^2$
- Support: $x \in (-\infty, \infty)$
The reason the Normal distribution is so ubiquitous is the Central Limit Theorem. It states that if you add up a large number of independent random variables with finite variance (even if they aren't normally distributed themselves!), their sum, once centered and scaled, is approximately normally distributed. Since many natural phenomena (like human height) are the result of adding up countless small, roughly independent factors (thousands of genetic and environmental variables), the final result closely follows a bell curve.
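The theorem can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the helper names, the choice of Uniform(0, 1) terms, the 30 terms per sum, and the fixed seed are all hypothetical, not part of the lesson. It adds up clearly non-normal values and measures what share of the sums falls within one standard deviation of their mean.

```python
import random
import statistics


def sum_of_uniforms(n_terms, rng):
    """Add up n_terms independent Uniform(0, 1) draws (each one is far from normal)."""
    return sum(rng.random() for _ in range(n_terms))


def fraction_within_one_sd(n_terms=30, n_samples=20_000, seed=0):
    """Simulate many sums and return the share lying within one standard
    deviation of their mean."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    sums = [sum_of_uniforms(n_terms, rng) for _ in range(n_samples)]
    mean = statistics.fmean(sums)
    sd = statistics.pstdev(sums)
    inside = sum(1 for s in sums if abs(s - mean) <= sd)
    return inside / n_samples


if __name__ == "__main__":
    print(f"1 term per sum:   {fraction_within_one_sd(n_terms=1):.3f}")
    print(f"30 terms per sum: {fraction_within_one_sd(n_terms=30):.3f}")
```

With a single uniform term the share should come out near 0.58, while with 30 terms it should move close to the 0.68 expected of a normal distribution.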
The Empirical Rule (68-95-99.7)
For any normal distribution, you can predict approximately where the data lies:
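- 68% of outcomes are within $\mu \pm 1\sigma$.
- 95% are within $\mu \pm 2\sigma$.
- 99.7% are within $\mu \pm 3\sigma$. Anything beyond 3 standard deviations is rare (roughly 0.3% of outcomes). The "Six Sigma" methodology in industrial quality control takes its name from an even stricter target: keeping specification limits six standard deviations from the mean.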
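The three percentages can be reproduced from the normal cumulative distribution function. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the function name `share_within` is a hypothetical helper chosen for illustration.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Probability that a Normal(mu, sigma) value lies within k standard
    deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The results (about 0.6827, 0.9545 and 0.9973) do not depend on the values passed for `mu` and `sigma`, which is why the rule holds for any normal distribution.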
The Standard Normal Distribution
Every Normal distribution is just a shifted and scaled version of the Standard Normal Distribution, denoted as $Z \sim N(0, 1)$. This fundamental curve has a mean of 0 ($\mu = 0$) and a standard deviation of 1 ($\sigma = 1$).
We can convert any normal variable $X$ into a standard $Z$-score using the following transformation:
$$Z = \frac{X - \mu}{\sigma}$$
This Z-score tells you exactly how many standard deviations away from the mean your value is. A Z-score of +1.5 means the value is 1.5 standard deviations above average.
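As a quick illustration, the transformation is a one-line function. The name `z_score` and the sample numbers (a value of 130 from a distribution with mean 100 and standard deviation 20) are assumptions made up for this sketch.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


if __name__ == "__main__":
    # Hypothetical numbers: 130 is 30 units above a mean of 100, and 30 / 20 = 1.5.
    print(z_score(130, mu=100, sigma=20))
```

A negative result means the value lies below the mean by that many standard deviations.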
The Standard Normal Curve (Z)
The 'Z-distribution' is the standard benchmark, with a mean of 0 and a variance of 1. Any normal variable $X$ can be converted to $Z$ using $Z = (X - \mu)/\sigma$.
Advanced Practice
Example 1: IQ Scores
IQ scores are normally distributed with a mean of 100 ($\mu = 100$) and a standard deviation of 15 ($\sigma = 15$). What percentage of the population has an IQ between 85 and 130?
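One way to approach this: 85 is one standard deviation below the mean (Z = -1) and 130 is two above it (Z = +2). By the Empirical Rule, that covers half of 68% plus half of 95%, or about 34% + 47.5% = 81.5%. The sketch below checks that estimate against the exact normal CDF; `share_between` is a hypothetical helper name.

```python
from statistics import NormalDist


def share_between(low, high, mu, sigma):
    """Probability that a Normal(mu, sigma) value falls between low and high."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(high) - dist.cdf(low)


if __name__ == "__main__":
    share = share_between(85, 130, mu=100, sigma=15)
    print(f"share of IQ scores between 85 and 130: {share:.2%}")
```

The exact value is about 81.9%, slightly higher than the rule-of-thumb figure because 68% and 95% are themselves rounded.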
Example 2: Quality Control Rejects
A machine fills bottles with a nominal 500 ml of liquid, but the fill volume has a normally distributed error with a standard deviation of 4 ml ($\sigma = 4$). Bottles with less than 490 ml are rejected. What must the machine's target mean setting ($\mu$) be so that 1% of bottles are rejected? (Assume a Z-score of -2.33 corresponds to the bottom 1%.)
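One way to set this up: the rejection limit of 490 ml must sit at Z = -2.33, so (490 - μ)/4 = -2.33, which gives μ = 490 + 2.33 × 4 = 499.32 ml. The sketch below repeats the calculation with the inverse CDF from Python's `statistics` module instead of the rounded table value; `target_mean` is a hypothetical helper name.

```python
from statistics import NormalDist


def target_mean(lower_limit, sigma, reject_rate):
    """Mean setting at which a fraction reject_rate of Normal(mean, sigma)
    fills falls below lower_limit."""
    z = NormalDist().inv_cdf(reject_rate)  # negative for rates below 0.5
    # Solve (lower_limit - mean) / sigma = z for the mean.
    return lower_limit - z * sigma


if __name__ == "__main__":
    mu = target_mean(lower_limit=490, sigma=4, reject_rate=0.01)
    print(f"target mean: {mu:.2f} ml")
    # Sanity check: the share of bottles below 490 ml at that setting.
    print(f"reject share: {NormalDist(mu, 4).cdf(490):.4f}")
```

Because the unrounded Z-score is closer to -2.326 than -2.33, the computed setting is about 499.31 ml, within rounding of the hand calculation.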