5.6. Normal distribution
The mean and the variance (the standard deviation squared) are important for characterizing the normal distribution. This is one of the most important distributions in statistics, in part because it is empirically common: a lot of things in nature are approximately normally distributed. But it is also important for the related reason that we have learned a lot about it. That knowledge of the normal facilitates a great deal of statistical reasoning, in terms of the assumptions we make and the inferences we draw.
First, notice that the normal goes by many other names: it is sometimes referred to as a Gaussian distribution (named after Gauss). It is sometimes referred to as the “bell curve”, because it looks like a bell—though this is imprecise and may lead to confusion (a lot of distributions look like a bell).
Second, as noted above, the normal distribution is characterized by its mean and variance (equivalently, its standard deviation). It is symmetric and unimodal (has one mode), so its mode, its median and its mean are all equal.
The mean tells us the “location” of the normal in terms of its base on the $x$-axis. The standard deviation tells us how spread out the distribution is around that location.
Though you don’t need to know this for the course, here is the formula for the normal distribution. See if you can see the mean and standard deviation in it:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
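As a rough sketch, the normal density can be coded by hand and compared with the version that ships with scipy. The function name normal_pdf and the test point x = 1 are illustrative choices, not part of the course material.

```python
import math
from scipy import stats

def normal_pdf(x, mu, sigma):
    """Density at x of a normal with mean mu and standard deviation sigma."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Compare the hand-written formula with scipy at an arbitrary point.
by_hand = normal_pdf(1.0, mu=0.0, sigma=1.0)
from_scipy = stats.norm.pdf(1.0, loc=0.0, scale=1.0)
print(by_hand, from_scipy)
```

The two numbers should agree up to floating-point error. Note that the mean only appears inside the squared term (shifting the curve), while the standard deviation appears both in the exponent and in the height of the curve.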
As an example of a normal distribution, consider one with mean $\mu = 0$ and standard deviation $\sigma = 1$:

Fig. 5.18 Normal distribution with mean $\mu = 0$ and standard deviation $\sigma = 1$
Now, let’s add (in blue) a normal distribution with the same mean, but a larger standard deviation ($\sigma = 2$):

Fig. 5.19 A second normal distribution in blue with mean $\mu = 0$ and standard deviation $\sigma = 2$
Finally, we will add (in green) another normal, with a standard deviation of 2, but with a mean of 2. Note how it shifts up the $x$-axis:

Fig. 5.20 A third normal distribution in green with mean $\mu = 2$ and standard deviation $\sigma = 2$
For convenience, a particular normal is sometimes written as $N(\mu, \sigma^2)$: that is, in terms of its mean and its variance.
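One practical point when moving from this shorthand to code: scipy describes a normal by its mean (loc) and its standard deviation (scale), not its variance. A minimal sketch, using a normal with mean 2 and standard deviation 2 as the example (the variable name green is just a label):

```python
from scipy import stats

# scale is the standard deviation, so a variance of 4 means scale=2.
green = stats.norm(loc=2, scale=2)

print(green.mean(), green.std(), green.var())
```

Passing the variance as scale is an easy mistake to make, so it is worth checking .std() and .var() on the object you have built.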
As we hinted above, many things in nature are approximately normally distributed, like human heights or birth weights.
Features of the normal: the “empirical rule”
The symmetry of the normal distribution, along with some other important features, leads to the Empirical Rule. This rule states that if a variable is normally distributed…
approximately 68% of the observations will be between one standard deviation above and below the mean…
approximately 95% of the observations will be between two standard deviations above and below the mean…
approximately 99% (more precisely, 99.7%) of the observations will be between three standard deviations above and below the mean.
This rule is shown graphically below:

Fig. 5.21 Empirical rule
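The three percentages in the rule can be checked directly. This sketch uses the standard normal from scipy and measures the area within k standard deviations of the mean; the dictionary name shares is an illustrative choice.

```python
from scipy import stats

# Area of a standard normal within k standard deviations of the mean.
shares = {}
for k in (1, 2, 3):
    shares[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within {k} standard deviation(s): {shares[k]:.4f}")
```

The values come out near 0.68, 0.95 and 0.997, which is where the rounded figures in the rule come from.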
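To understand the Empirical Rule, notice that the mean ($\mu$) sits at the center of the distribution, and that distances from it are measured in standard deviations ($\sigma$).
To give an example, suppose we have women’s heights, which we believe to be normally distributed. The mean of those heights is $\mu = 65$ inches and the standard deviation is $\sigma = 3.5$ inches. What does the Empirical Rule tell us?
It says:
approximately 68% of the observations will be between one standard deviation above and below the mean…
Well, one standard deviation above the mean is $65 + 3.5 = 68.5$ inches, and one standard deviation below the mean is $65 - 3.5 = 61.5$ inches. So about 68% of women are between 61.5 and 68.5 inches tall.
approximately 95% of the observations will be between two standard deviations above and below the mean…
Well, two standard deviations above the mean is $65 + 2(3.5) = 72$ inches, and two standard deviations below the mean is $65 - 2(3.5) = 58$ inches. So about 95% of women are between 58 and 72 inches tall.
approximately 99% (more precisely, 99.7%) of the observations will be between three standard deviations above and below the mean.
Well, three standard deviations above the mean is $65 + 3(3.5) = 75.5$ inches, and three standard deviations below the mean is $65 - 3(3.5) = 54.5$ inches. So almost all women are between 54.5 and 75.5 inches tall.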
In a sense, it is not surprising that almost all women’s heights are between 4 foot 6 and a half inches and almost six foot, four inches. But this is nonetheless quite informative about how a variable is distributed, how likely we are to see a case above or below a certain point, and so on.
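The interval arithmetic above can be sketched in a few lines. The mean of 65 inches and standard deviation of 3.5 inches are assumptions here: they are the values consistent with the 4 foot 6.5 inch and 6 foot 3.5 inch limits and with the $z$-scores used later in this section.

```python
# Assumed values (inches), inferred from the limits quoted in the text.
mu, sigma = 65.0, 3.5

intervals = {}
for k, share in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = mu - k * sigma, mu + k * sigma
    intervals[k] = (low, high)
    print(f"about {share} of heights lie between {low} and {high} inches")
```

With these inputs the three-standard-deviation interval runs from 54.5 to 75.5 inches.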
$z$-Scores
We can make the idea of the Empirical Rule more general. Rather than focusing on “whole numbers” of standard deviations (so, 1 or 2 or 3), we can give every observation a score in terms of where it lies relative to the mean of the normal distribution from which it was drawn.
We will define the $z$-score of an observation as:
the value of the observation minus the mean of the distribution, divided by the standard deviation of the distribution.
That is:
$$z = \frac{x - \mu}{\sigma}$$
where:
$x$ is the value of the observation (the specific height, or weight, or income or whatever)
$\mu$ is the value of the mean of the normal distribution from which it is drawn
$\sigma$ is the value of the standard deviation of the normal distribution from which it is drawn
The intuition is that, by converting the observation to a $z$-score, we express it in terms of how many standard deviations it lies above or below the mean.
Men’s weights as $z$-Scores
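Suppose that the mean weight of a man in the US is 191 pounds ($\mu = 191$) and that the standard deviation is 43 pounds ($\sigma = 43$). Consider three men, with the following weights:
176 lbs. Then: $z = \frac{176 - 191}{43} = -0.35$
302 lbs. Then: $z = \frac{302 - 191}{43} = 2.58$
80 lbs. Then: $z = \frac{80 - 191}{43} = -2.58$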
Notice that the second (302 lbs) and third (80 lbs) men were equally far from the mean, but above and below it respectively—in both cases, they are 2.58 standard deviations away.
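The same calculation as a short sketch. The helper z_score is a hypothetical name, and the standard deviation of 43 pounds is an assumption: it is the value implied by the 2.58 figure quoted for the 302 lb and 80 lb men.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

mu = 191.0     # mean weight in pounds, from the text
sigma = 43.0   # assumed: the value implied by the 2.58 quoted in the text

z_scores = {w: round(z_score(w, mu, sigma), 2) for w in (176, 302, 80)}
print(z_scores)
```

The sign carries the direction (above or below the mean) and the absolute value carries the distance, which is why 302 lbs and 80 lbs give the same number with opposite signs.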
Standard Normal
When we subtract the mean and divide by the standard deviation, we are standardizing our distribution. Whatever normal distribution we start with, whatever its mean and variance, we will end up with a normal distribution that has:
mean equal to zero ($\mu = 0$)
standard deviation (and variance) equal to one ($\sigma = 1$)
This distribution is called the standard normal, and is written as $N(0, 1)$.
The standard normal looks as we would expect (it is centered at zero) and, in fact, we met it already above:

Fig. 5.22 Standard normal
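To reiterate: the distribution of $z$-scores from any normal distribution is the standard normal, with mean zero and standard deviation one.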
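A small simulation can illustrate standardizing. This is only a sketch: the mean of 65 and standard deviation of 3.5 are assumed example values, and the seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
mu, sigma = 65.0, 3.5            # assumed example values (inches)

# Draw simulated observations, then subtract the mean and divide by the sd.
heights = rng.normal(loc=mu, scale=sigma, size=100_000)
z = (heights - mu) / sigma

print(round(z.mean(), 2), round(z.std(), 2))
```

The mean of z should be close to zero and its standard deviation close to one, whatever values of mu and sigma we start from. They will not be exactly zero and one, because this is a random sample.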
The $z$-distribution and probabilities
Converting observations to $z$-scores lets us answer questions about how likely we are to observe particular values.
To make this more formal, consider the following question:
what proportion of women are taller than 5 feet 9 inches?
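This is the same as asking what proportion of women have heights in excess of 69 inches, which itself is the same as asking what the probability is of a randomly chosen woman being over 69 inches tall.
In $z$-score terms, a height of 69 inches is $z = \frac{69 - 65}{3.5} = 1.14$.
So we want to know the probability of observing $z > 1.14$ in a standard normal distribution.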
To understand this problem graphically, consider the following figure:

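Fig. 5.23 Observation with $z = 1.14$
We have an observation with $z = 1.14$, and we want to know how much of the distribution lies to the right of it.
To calculate this, we can ask Python “for a standard normal distribution, what is the size of the area to the right of $z = 1.14$?” We do this using norm.cdf from stats in scipy: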
from scipy import stats
1-stats.norm.cdf(1.14)
0.12714315056279824
The “1” here captures the fact that all distributions sum to one (the probability of observing all the possible heights must be one in total). norm.cdf(1.14) is the share of the distribution below $z = 1.14$; subtracting it from one tells us how much of the distribution remains once we remove those folks below that height.
This comes out to a probability of around 0.127; equivalently, that 12.7% of women are taller than 5’9”.
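As an aside, scipy also offers the survival function norm.sf, which returns the upper-tail area directly. The sketch below compares it with the one-minus-cdf approach, and also shows how to skip the manual $z$-score step by passing a mean and standard deviation (65 and 3.5 inches are the values assumed for women’s heights in this example).

```python
from scipy import stats

# Upper-tail area beyond z = 1.14, computed two equivalent ways.
upper_tail_cdf = 1 - stats.norm.cdf(1.14)
upper_tail_sf = stats.norm.sf(1.14)

# Same question asked in inches, letting scipy standardize for us.
# loc and scale are the assumed mean and standard deviation of heights.
upper_tail_raw = stats.norm.sf(69, loc=65, scale=3.5)

print(upper_tail_cdf, upper_tail_sf, upper_tail_raw)
```

The third number differs slightly from the first two because it uses the unrounded $z$-score (about 1.1429) rather than 1.14.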
Let’s consider a different problem. Suppose we want to know what proportion of women are under 5’3”. This is equivalent to asking what proportion of women have a $z$-score below $\frac{63 - 65}{3.5} = -0.57$.

Fig. 5.24 Observation with $z = -0.57$
Now, we can simply ask for:
stats.norm.cdf(-.57)
0.28433884904632417
This works because norm.cdf is set up to tell us what proportion of the distribution is below the argument we gave it (a $z$-score of $-0.57$). So around 28.4% of women are under 5’3”.
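Because the normal is symmetric, the area below a negative $z$-score equals the area above its positive mirror image. A quick sketch of that check (the variable names are illustrative):

```python
from scipy import stats

below = stats.norm.cdf(-0.57)       # area to the left of z = -0.57
above_mirror = stats.norm.sf(0.57)  # area to the right of z = +0.57

print(below, above_mirror)
```

Both calls should return the same value, about 0.284.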
Finally, suppose we want to know the proportion of observations between two values. Let’s say, the proportion of women between 5’4” and 5’8”. We need the proportion of the normal between a $z$-score of $-0.28$ (for 64 inches) and a $z$-score of $0.86$ (for 68 inches).

Fig. 5.25 Observations with $z = -0.28$ and $z = 0.86$
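There are several different ways to proceed here. Perhaps the simplest is to request the area above the higher number (so, above 0.86) and the area below the lower number (so, below $-0.28$), and then subtract both from one: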
1- (1- stats.norm.cdf(0.86) ) - stats.norm.cdf(-0.28)
0.4153667263039888
Or around 41.5%. This implies that a random draw will select a woman between 5’4” and 5’8” with probability 0.415.
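One of the other ways to proceed is to subtract the area below the lower $z$-score from the area below the higher one. The sketch below shows that this gives the same answer as the expression used above; the variable names are illustrative.

```python
from scipy import stats

lower_z, upper_z = -0.28, 0.86

# Area below the upper value minus area below the lower value.
between = stats.norm.cdf(upper_z) - stats.norm.cdf(lower_z)

# The "one minus both tails" version used in the text, for comparison.
text_version = 1 - (1 - stats.norm.cdf(upper_z)) - stats.norm.cdf(lower_z)

print(between, text_version)
```

The difference-of-cdfs form is shorter and generalizes to any pair of cut-offs, as long as the lower value is passed second.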