Spring 2001

Some facts about the normal curve

**Purpose:** A bit of further explanation
about the normal curve and how to work with it.

**As explained in the text,** the normal curve is given by the following
equation:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$$

We don't have to work directly with this function very often, so we'll just need to know about a few of its basic properties. By one of the class exercises, 11.4.2:

- the values of the function are all > 0;
- the graph of this function is symmetric about the y-axis; thus *f(x) = f(-x)* for all *x*;
- the value of *f(x)* becomes very small when *x* grows very large.
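These three properties can be checked numerically. The following is a minimal sketch, assuming the standard form of the normal curve, f(x) = e^(-x²/2) / √(2π); the name `f` follows the notation above, and the sample points are arbitrary choices made for illustration.

```python
import math

def f(x: float) -> float:
    """Standard normal curve (assumed form): exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

samples = [0.0, 0.5, 1.0, 2.0, 3.0]  # arbitrary illustrative points

# Property 1: every value of the function is positive.
all_positive = all(f(x) > 0 and f(-x) > 0 for x in samples)

# Property 2: symmetry about the y-axis, f(x) = f(-x).
symmetric = all(f(x) == f(-x) for x in samples)

# Property 3: the values shrink as x moves away from 0.
decreasing = all(f(a) > f(b) for a, b in zip(samples, samples[1:]))

print(all_positive, symmetric, decreasing)
print(f(5.0))  # already a very small number
```

A check at a handful of points is of course not a proof; the class exercise is where the properties are actually established.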

**The graph of the function** is shown in Figure 1 below.

Figure 1: the normal curve

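**By definition,** a random variable is normally distributed if its probability distribution is given by the normal curve, adjusted as follows: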

- the curve is shifted so that its midpoint corresponds to the mean of our random variable,
- it is stretched so that 1 unit of distance from the standard normal curve corresponds to the standard deviation of our random variable, *and*
- the vertical scale is adjusted so that the total area remains equal to 1.
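The shift, stretch, and vertical adjustment can be sketched in a few lines. This is only an illustration: the function names `standard_curve`, `shifted_curve`, and `total_area` are made up for this sketch, the standard form of the curve is assumed, and the area is estimated with a simple midpoint rule rather than computed exactly.

```python
import math

def standard_curve(z):
    """Standard normal curve (assumed form)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def shifted_curve(x, mean, sd):
    """Shift by `mean`, stretch by `sd`, and divide the height by `sd`
    so that the total area stays equal to 1."""
    return standard_curve((x - mean) / sd) / sd

def total_area(mean, sd, steps=20000):
    """Midpoint-rule estimate of the area from mean - 8*sd to mean + 8*sd."""
    lo, hi = mean - 8 * sd, mean + 8 * sd
    width = (hi - lo) / steps
    heights = (shifted_curve(lo + (i + 0.5) * width, mean, sd) for i in range(steps))
    return sum(heights) * width

print(total_area(100, 20))  # very close to 1
print(total_area(73, 6))    # very close to 1
```

Dividing the height by the standard deviation is what compensates for the horizontal stretch; without it the area would grow in proportion to the stretch.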

We don't have to work directly with the equation here!
Instead, we just have to use Table 11.6 in chapter 11 of the text
(on page 254 in the July 17, 2001 version) to find the area under the curve
between the two corresponding z-values. So, in Figure 1 above, the
area of the yellow shaded region has to be
$A(z_2) - A(z_1)$.
This area (being a number between
0 and 1) tells us **the percentage of our
probability distribution** that lies between those two values.
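For readers without the table at hand, the lookup can be imitated in code. This sketch assumes that Table 11.6 lists the area between z = 0 and the given z-value, which is consistent with values quoted in this handout such as A(1.1) = 0.3643; it uses the error function from Python's `math` module, and `area_between` is a name invented here.

```python
import math

def A(z: float) -> float:
    """Area under the standard normal curve between 0 and z.

    Assumed to match what Table 11.6 tabulates."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def area_between(z1: float, z2: float) -> float:
    """Area between two z-values, computed as A(z2) - A(z1)."""
    return A(z2) - A(z1)

print(round(A(1.0), 4))
print(round(area_between(0.5, 1.5), 4))
```

The values 0.5 and 1.5 are arbitrary; any pair of z-values with z1 below z2 works the same way.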

**A couple of instances.** In §11.5, we'll see that the
outcome of an independent trials process -- repeated a large number
of times -- is approximately normally distributed.
IQ scores provide another example, as shown in Figure 2 below.

Figure 2: the distribution of IQ scores

This example illustrates the fact that the re-scaling is "almost invisible".
Namely, the standard deviation is 20, so the difference between 120 and the mean
( = 100) is exactly 1 standard deviation unit. Therefore the area
-- shaded in magenta in Figure 2 -- between
the corresponding z-values (*z* = 0 and
*z* = 1) is A(1) = 0.34.
Hence, 34% of the population has IQ scores in this range.

*Working with normal distributions*

*Part I: Determining the percentage of the
population when a range of scores is given*

*A. Finding the area to the right of a given* [positive] *z-value.*

For instance, this is the situation if we're given some test score above the mean and we want to find what percentage of the people taking the test had scores above that level.

For a specific numerical instance, suppose that in a test with a mean of 73 and a standard deviation of 6, we want to know what percentage of the students had scores of 80 or above. (This is like the discussion at the bottom of page 253 of the text.) So there are 3 steps:

- Find the z-value. We work with 79.5 instead of 80; subtract
the mean from this value, and then divide by the standard deviation:

$$z = \frac{79.5 - 73}{6} = \frac{6.5}{6} \approx 1.083.$$

- Look up the value of A(z) in the table. We round off
to *z* = 1.1 and then look up:

A(1.1) = 0.3643.

This gives the area that's shaded with red stripes in the diagram.

- Subtract from 0.5 to find the area of the right-hand tail
of the distribution. Thus:

0.5 - A(1.1) = 0.5 - 0.3643 = 0.1357.

This area is shaded with green in the diagram.

Thus, about 13.6% of the students have scores of 80 or above.
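The three steps can be retraced in code. This is a sketch under stated assumptions: `A` is computed from the error function rather than read from Table 11.6, and `percent_at_or_above` is a hypothetical helper. Because the code does not round *z* to 1.1 first, its answer differs slightly from the table-based one.

```python
import math

def A(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def percent_at_or_above(score, mean, sd):
    """Share of whole-number scores >= `score`, using score - 0.5 as the cutoff."""
    z = (score - 0.5 - mean) / sd   # step 1: find the z-value
    return 0.5 - A(z)               # steps 2 and 3: look up A(z), subtract from 0.5

mean, sd = 73, 6
z = (79.5 - mean) / sd
table_answer = 0.5 - 0.3643         # the handout's value, using the rounded z = 1.1
direct_answer = percent_at_or_above(80, mean, sd)

print(round(z, 3), round(table_answer, 4), round(direct_answer, 4))
```

The two answers agree to within about half a percentage point, which gives a sense of how much accuracy is lost by rounding *z* to one decimal place.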

Figure 3: *A(z)* and the area of the "tail"

*B. Finding the area within a given distance of the mean.*

Given our information about IQ scores, suppose we want to know what percentage of the people have IQ scores between 75 and 125, inclusive. Assuming that our scores just take integer values, we work with 125.5 and 74.5 when finding the corresponding z-values.

- For a score of 125.5, we obtain:

$$z = \frac{125.5 - 100}{20} = \frac{25.5}{20} = 1.275.$$

For a score of 74.5, a similar calculation gives *z* = -1.275.

- Working with the positive value, we round off to
*z* = 1.3 and look in the table to find:

A(1.3) = 0.4032.

This gives the area that's shaded with green stripes in Figure 4.

- By the symmetry of the graph, the area on the other side (with orange
stripes) also has area = 0.4032. [Note that the z-value on the
other side is exactly the negative of the one that we just looked at.]
So, the total shaded area is:

0.4032 + 0.4032 = 0.8064.

Thus, about 80.6% of the people have IQ scores between 75 and 125, inclusive.
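The same calculation can be sketched without rounding *z*. As before, `A` is computed from the error function instead of being read from Table 11.6, and `percent_between` is a name invented for this illustration.

```python
import math

def A(z):
    """Area under the standard normal curve between 0 and z.

    For negative z this returns a negative number, so A(z_high) - A(z_low)
    adds the two sides together."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def percent_between(low, high, mean, sd):
    """Share of whole-number scores from `low` to `high`, inclusive."""
    z_low = (low - 0.5 - mean) / sd     # 74.5 for a lower score of 75
    z_high = (high + 0.5 - mean) / sd   # 125.5 for an upper score of 125
    return A(z_high) - A(z_low)

share = percent_between(75, 125, mean=100, sd=20)
print(round(share, 4))
```

The unrounded result comes out a little under 80%, slightly below the 80.6% obtained above; the gap comes from rounding *z* = 1.275 up to 1.3 before using the table.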

Figure 4: The region within a given distance of the mean

*C. Finding the area to the left of a given* [positive] *z-value.*

Once again, we'll illustrate this with a specific example. So, let's suppose that 1000 students have taken an exam, where 100 points is the maximum score, the mean is 73, and the standard deviation is 5. We'll ask how many students had scores of 80 or lower.

- To find the z-value, we work with 80.5, since this "splits the
difference" between 80 and 81. Here is our calculation:

$$z = \frac{80.5 - 73}{5} = \frac{7.5}{5} = 1.5.$$

- As usual, we look in Table 11.6 to find A(z):

A(1.5) = 0.4332.

This gives the area that's shaded with green stripes in Figure 5, corresponding to scores which are *above the mean but below the indicated z-value.*

- To find the total shaded area, we have to add the area to the left
of the midpoint -- shaded in turquoise in Figure 5. Since it's *exactly half*
of the entire area under the normal curve, this area is 0.5. Hence, the total shaded area is:

0.5 + 0.4332 = 0.9332.

We conclude that 933 of the 1000 students had test scores of 80 or lower.
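Here is the same example as a short sketch. It assumes `A` can be computed from the error function in place of the table lookup; `students_at_or_below` is a hypothetical helper that only makes sense for a score above the mean, as in the case discussed here.

```python
import math

def A(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def students_at_or_below(score, mean, sd, n_students):
    """Estimated number of whole-number scores <= `score` (score above the mean)."""
    z = (score + 0.5 - mean) / sd    # 80.5 "splits the difference"
    share = 0.5 + A(z)               # left half plus the area from 0 to z
    return z, share, round(share * n_students)

z, share, count = students_at_or_below(80, mean=73, sd=5, n_students=1000)
print(z, round(share, 4), count)
```

Since *z* = 1.5 needs no rounding, the computed share matches the table-based value to four decimal places.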

Figure 5: The area to the left of a given positive z-value


*Part II: Determining the range of scores when a
percentage of the population is given*

Each of these problems is "inverse" to the corresponding problem in
Part I. Thus, in Part I we were given a range of scores and wanted to
find the percentage of the population whose scores are in that range.
In terms of Table 11.6, this meant that we calculated *z*
and then looked up A(z). Here, the situation is
turned around. We're given a percentage of the population, and we
want to find what range of scores corresponds to it. So, we start
by figuring out a value for A(z) and then looking in the
table to find the value of *z* which corresponds most
closely to it.

*A. Finding the* [positive] *z-value such that a given percentage of the area under the normal curve lies to the right of that value.* (This question makes sense only if the given percentage is ≤ 50%.)

This is inverse to the problem discussed in *Part IA,* so that you can refer to the same figure. In *Part IA* we were given the *z*-value, so we looked up the value of A(z) and then subtracted it from 0.5 to get the area of the tail -- shown with the light green [solid] shading in the figure. Here, we are given the area of the tail, so we do the steps in the opposite order, *namely:*

- Subtract from 0.5 to get the area under the normal curve between the midpoint (*z* = 0) and the vertical line corresponding to the [still unknown] *z*-value. This number is the value of A(z) that we have to work with.
- We now *"read the table backwards"* to find the sought-for *z*-value. Thus, we look in the *right-hand column* of Table 11.6 to find the number that's closest to the value of A(z) that we just calculated. Our *z*-value is on the same line in the left-hand column of the table. *For better accuracy,* if our value of A(z) is about halfway between two entries in the right-hand column of Table 11.6, then we can split the difference between the two corresponding *z*-values.

*An instance:* Let's determine the *z*-value such that 10% of the population has scores above that value. This means that we have to take A(z) = 0.5 - 0.1 = 0.4. Looking in the table, the closest value is A(z) = 0.4032, and this corresponds to *z* = 1.3.

*In an applied problem,* we still have to re-scale and shift. For instance, in the case of IQ scores, with mean = 100 and s.d. = 20, a *z*-value of 1.3 corresponds to an IQ score which is 1.3·20 = 26 points above the mean, and thus to a score of 100 + 26 = 126.
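"Reading the table backwards" amounts to a nearest-value search. The sketch below builds a stand-in for Table 11.6 rather than copying it: the step of 0.1 in *z* and the four-decimal rounding are assumptions suggested by the values quoted in this handout, and `TABLE` and `z_from_area` are names invented here.

```python
import math

def A(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

# Stand-in for Table 11.6 (assumed layout): rows (z, A(z)) for z = 0.0, 0.1, ..., 3.0.
TABLE = [(k / 10, round(A(k / 10), 4)) for k in range(31)]

def z_from_area(target):
    """Read the table backwards: the z whose tabulated A(z) is closest to `target`."""
    return min(TABLE, key=lambda row: abs(row[1] - target))[0]

tail = 0.10                      # 10% of the population above the cutoff
z = z_from_area(0.5 - tail)      # look for A(z) closest to 0.4
iq = 100 + z * 20                # re-scale by the s.d., then shift by the mean
print(z, iq)
```

A finer table would give a *z*-value closer to 1.28, so the score of 126 should be read as an estimate at the table's resolution.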

*B. Finding the* [positive] *z-value such that a given percentage of the area under the normal curve lies within z standard deviation units of the mean.*

This is inverse to the problem discussed in *Part IB,* so that you can refer to the same figure. In *Part IB* we were given the *z*-value, so we looked up the value of A(z) and then multiplied it by 2 to get the area under the middle part of the bell curve -- shown as the total of the two shaded areas in the figure. Here, we are given the area of the symmetric middle part, so that we have to reverse the order of the steps and appropriately substitute division for multiplication. *Thus:*

- Divide the given *symmetric* area by 2, in order to find the value of A(z).
- Look for this value in the right-hand column of Table 11.6. Our answer is then the *z*-value on the same line of the table.

*For instance,* suppose that we want to determine the IQ scores that characterize the middle 2/3 of the population. Then we take A(z) = 1/3 ( = 2/3 · 1/2). Looking in the table, the two closest values are 0.3159 and 0.3413. These correspond to *z* = 0.9 and *z* = 1.0 respectively. Accordingly, our answer is about halfway between and thus corresponds to *z* = 0.95. The actual range is between 19 points ( = 0.95·20) below the mean and 19 points above the mean, *i.e.,* IQ scores between 81 and 119.
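As a cross-check on the table-based estimate, the inverse lookup can be done with `NormalDist.inv_cdf` from Python's standard `statistics` module (available in Python 3.8 and later). This swaps in a library routine for the table; `middle_range` is a hypothetical helper name, and the IQ mean and standard deviation are the ones used in this handout.

```python
from statistics import NormalDist

def middle_range(share, mean, sd):
    """Scores bounding the middle `share` of a normally distributed population."""
    area_each_side = share / 2                       # this is A(z)
    # inv_cdf works with the area to the LEFT of z, hence the added 0.5.
    z = NormalDist().inv_cdf(0.5 + area_each_side)
    return z, mean - z * sd, mean + z * sd

z, low, high = middle_range(2 / 3, mean=100, sd=20)
print(round(z, 2), round(low), round(high))
```

The library value of *z* is a little above the 0.95 obtained by splitting the difference, but the rounded IQ range comes out the same.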

*C. Finding the* [positive] *z-value such that a given percentage of the area under the normal curve lies to the left of that value.* (This question makes sense only if the given percentage is ≥ 50%.)

Look at the figure in *Part IC* to help with understanding this case. We have to subtract 0.5 from our given area in order to find the value of A(z) to work with. For instance, if we want to find the score corresponding to the 85th percentile, then we take A(z) = 0.85 - 0.5 = 0.35. (Etc., by analogy with the other cases...)
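One way the remaining steps might go is sketched here. It uses `NormalDist.inv_cdf` from Python's standard `statistics` module in place of the table, and it applies the IQ mean and standard deviation from the earlier examples purely as an illustration; `score_at_percentile` is a hypothetical helper name.

```python
from statistics import NormalDist

def score_at_percentile(percentile, mean, sd):
    """Score with `percentile` percent of the population at or below it.

    Intended for percentile >= 50, matching the case discussed here."""
    area = percentile / 100 - 0.5                # the value of A(z) to work with
    z = NormalDist().inv_cdf(0.5 + area)         # z with that area between 0 and z
    return area, z, mean + z * sd                # re-scale and shift

area, z, score = score_at_percentile(85, mean=100, sd=20)
print(round(area, 2), round(z, 2), round(score, 1))
```

With the table, the nearest entry to 0.35 would be used instead, so the table-based score would differ slightly from the one printed here.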
