Chapter 6. The Normal Distribution

6.1. The Normal Distribution*

Student Learning Objectives

By the end of this chapter, the student should be able to:

  • Recognize the normal probability distribution and apply it appropriately.

  • Recognize the standard normal probability distribution and apply it appropriately.

  • Compare normal probabilities by converting to the standard normal distribution.

Introduction

The normal, a continuous distribution, is the most important of all the distributions. It is widely used and even more widely abused. Its graph is bell-shaped. You see the bell curve in almost all disciplines. Some of these include psychology, business, economics, the sciences, nursing, and, of course, mathematics. Some of your instructors may use the normal distribution to help determine your grade. Most IQ scores are normally distributed. Often real estate prices fit a normal distribution. The normal distribution is extremely important but it cannot be applied to everything in the real world.

In this chapter, you will study the normal distribution, the standard normal, and applications associated with them.

Optional Collaborative Classroom Activity

Your instructor will record the heights of both men and women in your class, separately. Draw histograms of your data. Then draw a smooth curve through each histogram. Is each curve somewhat bell-shaped? Do you think that if you had recorded 200 data values for men and 200 for women that the curves would look bell-shaped? Calculate the mean for each data set. Write the means on the x-axis of the appropriate graph below the peak. Shade the approximate area that represents the probability that one randomly chosen male is taller than 72 inches. Shade the approximate area that represents the probability that one randomly chosen female is shorter than 60 inches. If the total area under each curve is one, does either probability appear to be more than 0.5?

The normal distribution has two parameters (two numerical descriptive measures), the mean ( μ ) and the standard deviation ( σ ). If X is a quantity to be measured that has a normal distribution with mean ( μ ) and the standard deviation ( σ ), we designate this by writing

NORMAL: X ~N(μ, σ)

Empty normal distribution curve.

The probability density function is a rather complicated function. Do not memorize it. It is not necessary.

The cumulative distribution function is P ( X < x ) . It is calculated either by a calculator or a computer, or it is looked up in a table.

The curve is symmetrical about a vertical line drawn through the mean, μ. In theory, the mean is the same as the median since the graph is symmetric about μ. As the notation indicates, the normal distribution depends only on the mean and the standard deviation. Since the area under the curve must equal one, a change in the standard deviation, σ, causes a change in the shape of the curve; the curve becomes fatter or skinnier depending on σ. A change in μ causes the graph to shift to the left or right. This means there are an infinite number of normal probability distributions. One of special interest is called the standard normal distribution.
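The effect of μ and σ on the curve can be checked numerically. The sketch below is only an illustration, assuming Python 3.8 or later and its standard-library statistics.NormalDist class. The helper name normal_pdf and the parameter values are made up for this example. It evaluates the usual normal density, f(x) = 1/(σ√(2π)) · e^(-(x - μ)²/(2σ²)), and compares curves that differ in σ or in μ.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x (hypothetical helper, for illustration)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Illustrative parameter choices, not taken from the text.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # larger sigma, same mean
shifted = NormalDist(mu=3, sigma=1)  # larger mean, same sigma

# The hand-written formula should agree with the library's pdf.
formula_matches = math.isclose(normal_pdf(1.0, 0, 1), narrow.pdf(1.0))

peak_narrow = narrow.pdf(0)    # height at the mean when sigma = 1
peak_wide = wide.pdf(0)        # doubling sigma halves the peak height
peak_shifted = shifted.pdf(3)  # changing mu moves the peak but keeps its height

# Symmetry about the mean: half of the area lies to the left of mu.
area_left_of_mean = shifted.cdf(3)

print(formula_matches, peak_narrow, peak_wide, peak_shifted, area_left_of_mean)
```

Because the total area stays equal to one, the curve with the larger σ has to be lower and wider, which is the "fatter or skinnier" behavior described above.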

Glossary

Normal Distribution

A continuous random variable (RV) with pdf f ( x ) = 1 / ( σ √(2π) ) · e^( -( x – μ )² / ( 2σ² ) ) , where μ is the mean of the distribution and σ is the standard deviation. Notation: X ~ N (μ, σ). If μ = 0 and σ = 1, the RV is called the standard normal distribution.

6.2. The Standard Normal Distribution*

The standard normal distribution is a normal distribution of standardized values called z-scores . A z-score is measured in units of the standard deviation. For example, if the mean of a normal distribution is 5 and the standard deviation is 2, the value 11 is 3 standard deviations above (or to the right of) the mean. The calculation is:

(6.1) x  =  μ  +  ( z ) σ  =  5  +  ( 3 ) ( 2 )  =  11

The z-score is 3.

The mean for the standard normal distribution is 0 and the standard deviation is 1. The transformation

z = ( x – μ ) / σ

produces the distribution Z ~ N ( 0 , 1 ) . The value x comes from a normal distribution with mean μ and standard deviation σ .

Glossary

Standard Normal Distribution

A continuous random variable (RV) X ~ N(0, 1). When X follows the standard normal distribution, it is often noted as Z ~ N(0, 1).

z-score

The linear transformation of the form z = ( x – μ ) / σ . If this transformation is applied to any normal distribution X ~ N(μ, σ), the result is the standard normal distribution Z ~ N(0, 1). If this transformation is applied to any specific value x of the RV with mean μ and standard deviation σ , the result is called the z-score of x . Z-scores allow us to compare data that are normally distributed but scaled differently.

6.3. Z-scores*

If X is a normally distributed random variable and X ~N(μ, σ), then the z-score is:

(6.2) z  =  ( x – μ ) / σ

The z-score tells you how many standard deviations the value x is above (to the right of) or below (to the left of) the mean, μ . Values of x that are larger than the mean have positive z-scores and values of x that are smaller than the mean have negative z-scores. If x equals the mean, then x has a z-score of 0.

Example 6.1. 

Suppose X ~ N(5, 6). This says that X is a normally distributed random variable with mean μ = 5 and standard deviation σ = 6. Suppose x = 17. Then:

(6.3) z  =  ( x – μ ) / σ  =  ( 17 – 5 ) / 6  =  2

This means that x = 17 is 2 standard deviations (2σ) above or to the right of the mean μ = 5. The standard deviation is σ = 6.

Notice that:

(6.4) 5  +  ( 2 ) ( 6 )  =  17 (This has the pattern μ + ( z ) σ = x )

Now suppose x=1. Then:

(6.5) z  =  ( x – μ ) / σ  =  ( 1 – 5 ) / 6  =  -0.67 (rounded to two decimal places)

This means that x = 1 is 0.67 standard deviations (- 0.67σ) below or to the left of the mean μ = 5. Notice that:

5 + ( -0.67 ) ( 6 ) is approximately equal to 1 (This has the pattern μ + ( -0.67 ) σ = 1  )

Summarizing, when z is positive, x is above or to the right of μ and when z is negative, x is to the left of or below μ .
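The arithmetic in Example 6.1 can be reproduced in a few lines of code. This is a minimal sketch in plain Python. The function names z_score and x_from_z are hypothetical helpers introduced here, not part of any library.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma


def x_from_z(z, mu, sigma):
    """Undo the z-score: x = mu + z * sigma."""
    return mu + z * sigma


mu, sigma = 5, 6  # X ~ N(5, 6), as in Example 6.1

z_17 = z_score(17, mu, sigma)  # x = 17 lies above the mean, so z is positive
z_1 = z_score(1, mu, sigma)    # x = 1 lies below the mean, so z is negative

print(z_17, round(z_1, 2))
print(x_from_z(z_17, mu, sigma))  # recovers x = 17 from z = 2
```

The sign of the result carries the direction: a positive value places x to the right of μ and a negative value places it to the left.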


Example 6.2. 

Some doctors believe that a person can lose 5 pounds, on the average, in a month by reducing his/her fat intake and by exercising consistently. Suppose weight loss has a normal distribution. Let X = the amount of weight lost (in pounds) by a person in a month. Use a standard deviation of 2 pounds. X ~N(5, 2). Fill in the blanks.

Problem 1. (Go to Solution)

Suppose a person lost 10 pounds in a month. The z-score when x = 10 pounds is z = 2.5 (verify). This z-score tells you that x = 10 is ________ standard deviations to the ________ (right or left) of the mean _____ (What is the mean?).


Problem 2. (Go to Solution)

Suppose a person gained 3 pounds (a negative weight loss). Then z = __________. This z-score tells you that x = -3 is ________ standard deviations to the __________ (right or left) of the mean.


Suppose the random variables X and Y have the following normal distributions: X ~ N(5, 6) and Y ~ N(2, 1). If x = 17, then z = 2. (This was previously shown.) If y = 4, what is z ?

(6.6) z  =  ( y – μ ) / σ  =  ( 4 – 2 ) / 1  =  2 , where μ = 2 and σ = 1

The z-score for y = 4 is z = 2. This means that 4 is z = 2 standard deviations to the right of the mean. Therefore, x = 17 and y = 4 are both 2 (of their) standard deviations to the right of their respective means.

The z-score allows us to compare data that are scaled differently. To understand the concept, suppose X ~ N(5, 6) represents weight gains for one group of people who are trying to gain weight in a 6 week period and Y ~ N(2, 1) measures the same weight gain for a second group of people. A negative weight gain would be a weight loss. Since x = 17 and y = 4 are each 2 standard deviations to the right of their means, they represent the same weight gain in relationship to their means.
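The comparison of x = 17 and y = 4 can be sketched as follows. The helper z_score is a hypothetical name used only for this illustration.

```python
def z_score(value, mu, sigma):
    """Standardize a value from a distribution with mean mu and std. dev. sigma."""
    return (value - mu) / sigma


z_x = z_score(17, mu=5, sigma=6)  # X ~ N(5, 6)
z_y = z_score(4, mu=2, sigma=1)   # Y ~ N(2, 1)

# Equal z-scores mean equal positions relative to each distribution's own mean.
same_relative_position = (z_x == z_y)
print(z_x, z_y, same_relative_position)
```

The raw values 17 and 4 are far apart, but after standardizing they land on the same point of the standard normal scale.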


Solutions to Exercises

Solution to Exercise 1. (Return to Problem)

This z-score tells you that x = 10 is 2.5 standard deviations to the right of the mean 5.


Solution to Exercise 2. (Return to Problem)

z = -4. This z-score tells you that x = -3 is 4 standard deviations to the left of the mean.


6.4. Areas to the Left and Right of x*

The arrow in the graph below points to the area to the left of x . This area is represented by the probability P ( X < x ) . Normal tables, computers, and calculators provide or calculate the probability P ( X < x ) .

Normal distribution curve with a x value on the x-axis. The x-axis is equal to X. A vertical upward line extends from point x to the curve and the probability area occurs from the beginning of the curve to point x.

The area to the right is then P ( X > x ) = 1 – P ( X < x ) .

Remember, P ( X < x ) = Area to the left of the vertical line through x .

P ( X > x ) = 1 – P ( X < x ) = Area to the right of the vertical line through x .

P ( X < x ) is the same as P ( X ≤ x ) and P ( X > x ) is the same as P ( X ≥ x ) for continuous distributions.
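As a rough illustration of areas to the left and right, the sketch below uses Python's standard-library statistics.NormalDist (Python 3.8 or later is assumed). The distribution and the cutoff value are chosen only for illustration and do not come from the text.

```python
from statistics import NormalDist

# Illustrative choice: the standard normal and a cutoff two
# standard deviations below the mean.
dist = NormalDist(mu=0, sigma=1)
x = -2

area_left = dist.cdf(x)      # P(X < x), the cumulative distribution function
area_right = 1 - area_left   # P(X > x), the complement

print(round(area_left, 4), round(area_right, 4))
```

The two areas always add to one, so only the area to the left ever needs to be computed or looked up.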

6.5. Calculations of Probabilities*

Probabilities are calculated by using technology. There are instructions in the chapter for the TI-83+ and TI-84 calculators.

Note

In the Table of Contents for Collaborative Statistics, entry 15. Tables has a link to a table of normal probabilities. Use the probability tables if so desired, instead of a calculator.

Example 6.3. 

If the area to the left is 0.0228, then the area to the right is 1 – 0.0228 = 0.9772 .


Example 6.4. 

The final exam scores in a statistics class were normally distributed with a mean of 63 and a standard deviation of 5.

Problem 1.

Find the probability that a randomly selected student scored more than 65 on the exam.

Solution

Let X = a score on the final exam. X ~ N ( 63 , 5 ) , where μ = 63 and σ = 5

Draw a graph.

Then, find P ( X > 65 ) .

P ( X > 65 ) = 0.3446 (calculator or computer)

Normal distribution curve with values of 63 and 65. A vertical upward line extends from point 65 to the curve. The probability area from point 65 to the end of the curve is equal to 0.3446.

The probability that one student scores more than 65 is 0.3446.

Using the TI-83+ or the TI-84 calculators, the calculation is as follows. Go into 2nd DISTR.

After pressing 2nd DISTR, press 2:normalcdf.

The syntax for the instruction is shown below.

normalcdf(lower value, upper value, mean, standard deviation)

For this problem: normalcdf(65,1E99,63,5) = 0.3446. You get 1E99 ( = 10^99 ) by pressing 1, the EE key (a 2nd key) and then 99. Or, you can enter 10^99 instead. The number 10^99 is way out in the right tail of the normal curve. We are calculating the area between 65 and 10^99 . In some instances, the lower number of the area might be -1E99 ( = -10^99 ). The number -10^99 is way out in the left tail of the normal curve.
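For readers working on a computer instead of a calculator, the same calculation can be sketched in Python. The function below is a hypothetical stand-in that only mimics the argument order of the calculator command. It is built on the standard-library statistics.NormalDist class (Python 3.8 or later is assumed) and is not the calculator's own algorithm.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, sd):
    """Area under the N(mean, sd) curve between lower and upper.

    Hypothetical Python analogue of the calculator command of the same name.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


# Same inputs as the calculator example; 1e99 is how Python writes 1E99.
p_more_than_65 = normalcdf(65, 1e99, 63, 5)
print(round(p_more_than_65, 4))
```

The huge upper limit plays the same role here as on the calculator: it sits so far out in the right tail that the area beyond it is negligible.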

Historical Note

The TI probability program calculates a z-score and then the probability from the z-score. Before technology, the z-score was looked up in a standard normal probability table (because the math involved is too cumbersome) to find the probability. In this example, a standard normal table with area to the left of the z-score was used. You calculate the z-score and look up the area to the left. The probability is the area to the right.

z = ( 65 – 63 ) / 5 = 0.4 . Area to the left is 0.6554. P ( X > 65 ) = P ( Z > 0.4 ) = 1 – 0.6554 = 0.3446
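The table-based route described in this note (standardize first, then look up the area to the left) can be sketched as follows. The code assumes Python 3.8 or later, and the standard normal cdf from statistics.NormalDist takes the place of the printed table.

```python
from statistics import NormalDist

mu, sigma = 63, 5          # exam scores: X ~ N(63, 5)
x = 65

z = (x - mu) / sigma       # step 1: convert the score to a z-score
standard_normal = NormalDist(mu=0, sigma=1)

area_left = standard_normal.cdf(z)   # step 2: area to the left of z
area_right = 1 - area_left           # step 3: P(X > 65) = P(Z > z)

print(z, round(area_left, 4), round(area_right, 4))
```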



Problem 2.

Find the probability that a randomly selected student scored less than 85.

Solution

Draw a graph.

Then find P ( X < 85 ) . Shade the graph. (calculator or computer)

The probability that one student scores less than 85 is approximately 1 (or 100%).

The TI-instructions and answer are as follows:

normalcdf(0,85,63,5) = 1 (rounds to 1)
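A lower limit of 0 works here because 0 lies 12.6 standard deviations below the mean of 63, so practically no area is lost. The sketch below checks this, assuming Python 3.8 or later and the standard-library statistics.NormalDist class. The variable names are chosen for illustration.

```python
from statistics import NormalDist

scores = NormalDist(mu=63, sigma=5)

# Area between 0 and 85, as in normalcdf(0,85,63,5).
p_from_zero = scores.cdf(85) - scores.cdf(0)

# Area of the entire left tail up to 85, as with a lower limit of -1E99.
p_whole_left_tail = scores.cdf(85)

print(p_from_zero, p_whole_left_tail, round(p_from_zero, 4))
```

Both versions agree to many decimal places, and each rounds to 1 at four decimal places.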



Problem 3.

Find the 90th percentile (that is, find the score k that has 90 % of the scores below k and 10% of the scores above k).

Solution

Find the 90th percentile. For each problem or part of a problem, draw a new graph. Draw the x-axis. Shade the area that corresponds to the 90th percentile.

Let k = the 90th percentile. k is located on the x-axis. P ( X < k ) is the area to the left of k . The 90th percentile k separates the exam scores into those that are the same or lower than k and those that are the same or higher. Ninety percent of the test scores are the same or lower than k and 10% are the same or higher. k is often called a critical value.

k = 69.4 (calculator or computer)

Normal distribution curve with values of 63 and x on the x-axis. The x-axis is equal to X. A vertical upward line extends from point x to the curve. The probability area, occurring from the beginning of the curve to point x, is equal to 0.90.

The 90th percentile is 69.4. This means that 90% of the test scores fall at or below 69.4 and 10% fall at or above. For the TI-83+ or TI-84 calculators, use invNorm in 2nd DISTR. invNorm(area to the left, mean, standard deviation) For this problem, invNorm(.90,63,5) = 69.4
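A percentile lookup of this kind can also be sketched in Python. The function inv_norm below is a hypothetical wrapper that copies the argument order of the calculator command. It relies on the inv_cdf method of the standard-library statistics.NormalDist class (Python 3.8 or later is assumed).

```python
from statistics import NormalDist


def inv_norm(area_to_left, mean, sd):
    """Value k with the given area to its left under the N(mean, sd) curve."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(area_to_left)


k = inv_norm(0.90, 63, 5)            # 90th percentile of the exam scores
check = NormalDist(63, 5).cdf(k)     # area to the left of k, should be 0.90

print(round(k, 1), round(check, 2))
```

Feeding k back into the cumulative distribution function returns the original area, which is a quick way to confirm that the percentile was read in the right direction.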



Problem 4.

Find the 70th percentile (that is, find the score k such that 70% of scores are below k and 30% of the scores are above k).

Solution

Find the 70th percentile.

Draw a new graph and label it appropriately. k = 65.6

The 70th percentile is 65.6. This means that 70% of the test scores fall at or below 65.6 and 30% fall at or above.

invNorm(.70,63,5) = 65.6




Example 6.5. 

More and more households in the United States have at least one computer. The computer is used for office work at home, research, communication, personal finances, education, entertainment, social networking and a myriad of other things. Suppose that the average number of hours a household personal computer is used for entertainment is 2 hours per day. Assume the times for entertainment are normally distributed and the standard deviation for the times is half an hour.

Problem 1.

Find the probability that a household personal computer is used between 1.8 and 2.75 hours per day.

Solution

Let X = the amount of time (in hours) a household personal computer is used for entertainment. X ~ N ( 2 , 0.5 ) where μ = 2 and σ = 0.5.

Find P ( 1.8 < X < 2.75 ) .

The probability for which you are looking is the area between x = 1.8 and x = 2.75.

Normal distribution curve with values 1.8, 2, and 2.75 on the x-axis. The x-axis is equal to X. Vertical upward lines extend upward from 1.8 and 2.75 to the curve.

normalcdf(1.8,2.75,2,.5) = 0.5886

The probability that a household personal computer is used between 1.8 and 2.75 hours per day for entertainment is 0.5886.
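The area between two values is the difference of two areas to the left. A minimal sketch, assuming Python 3.8 or later and the standard-library statistics.NormalDist class (variable names are illustrative):

```python
from statistics import NormalDist

usage = NormalDist(mu=2, sigma=0.5)   # hours of entertainment use per day

area_left_of_upper = usage.cdf(2.75)  # P(X < 2.75)
area_left_of_lower = usage.cdf(1.8)   # P(X < 1.8)

# P(1.8 < X < 2.75) is what remains after removing the lower-left area.
p_between = area_left_of_upper - area_left_of_lower
print(round(p_between, 4))
```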



Problem 2.

Find the maximum number of hours per day that the bottom quartile of households use a personal computer for entertainment.

Solution

To find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment, find the 25th percentile, k , where P ( X < k ) = 0.25 .

Normal distribution curve with value k on the x-axis. The probability area from k to the end of the curve is equal to 0.75 and the rest of the area is equal to 0.25.

invNorm(.25,2,.5) = 1.66

The maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment is 1.66 hours.
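The quartile can be checked on a computer with the inverse cumulative distribution function. This sketch assumes Python 3.8 or later and uses the inv_cdf method of statistics.NormalDist. The unrounded result is roughly 1.663 hours.

```python
from statistics import NormalDist

usage = NormalDist(mu=2, sigma=0.5)   # hours of entertainment use per day

k = usage.inv_cdf(0.25)               # 25th percentile: P(X < k) = 0.25
area_below_k = usage.cdf(k)           # should reproduce 0.25
area_above_k = 1 - area_below_k       # the remaining 0.75

print(k, round(area_below_k, 2), round(area_above_k, 2))
```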




6.6. Summary of Formulas*

Formula 6.1. Normal Probability Distribution

X ~ N ( μ , σ )

μ = the mean σ = the standard deviation


Formula 6.2. Standard Normal Probability Distribution

Z ~ N ( 0 , 1 )

Z = a standardized value (z-score)

mean = 0 standard deviation = 1


Formula 6.3. Finding the kth Percentile

To find the kth percentile when the z-score is known: k = μ + ( z ) σ
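Formula 6.3 can be tried out with the exam-score distribution N ( 63 , 5 ) from Example 6.4. In the sketch below, the z-score for the 90th percentile comes from the standard normal distribution and is then rescaled. Python 3.8 or later and the standard-library statistics.NormalDist class are assumed.

```python
from statistics import NormalDist

mu, sigma = 63, 5                 # exam scores from Example 6.4

# z-score with 90% of the standard normal area to its left.
z = NormalDist(mu=0, sigma=1).inv_cdf(0.90)

# Formula 6.3: rescale the z-score to the original units.
k = mu + z * sigma

print(round(z, 2), round(k, 1))
```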


Formula 6.4. z-score

z = ( x – μ ) / σ

Formula 6.5. Finding the area to the left

The area to the left: P ( X < x )


Formula 6.6. Finding the area to the right

The area to the right: P ( X > x ) = 1 – P ( X < x )


6.7. Practice: The Normal Distribution*

Student Learning Outcomes

  • The student will explore the properties of data with a normal distribution.

Given

The life of Sunshine CD players is normally distributed with a mean of 4.1 years and a standard deviation of 1.3 years. A CD player is guaranteed for 3 years. We are interested in the length of time a CD player lasts.

Normal Distribution

Exercise 6.7.1.

Define the Random Variable X in words. X =


Exercise 6.7.2.

X ~


Exercise 6.7.3. (Go to Solution)

Find the probability that a CD player will break down during the guarantee period.

a. Sketch the situation. Label and scale the axes. Shade the region corresponding to the probability.

Figure 6.1. 

Empty normal distribution curve.

b. P ( 0 < X < _________ ) = _________

Exercise 6.7.4. (Go to Solution)

Find the probability that a CD player will last between 2.8 and 6 years.

a. Sketch the situation. Label and scale the axes. Shade the region corresponding to the probability.

Figure 6.2. 

Empty normal distribution curve.

b. P ( _______ < X < _______ ) = _________

Exercise 6.7.5. (Go to Solution)

Find the 70th percentile of the distribution for the time a CD player lasts.

a. Sketch the situation. Label and scale the axes. Shade the region corresponding to the lower 70%.

Figure 6.3. 

Empty normal distribution curve.

b. P(X < k) = _________. Therefore, k = __________.

Solutions to Exercises

Solution to Exercise 6.7.3. (Return to Exercise)

b. 3, 0.1979

Solution to Exercise 6.7.4. (Return to Exercise)

b. 2.8, 6, 0.7694

Solution to Exercise 6.7.5. (Return to Exercise)

b. 0.70, 4.78 years

6.8. Homework*

Exercise 6.8.1. (Go to Solution)

According to a study done by De Anza students, the height for Asian adult males is normally distributed with an average of 66 inches and a standard deviation of 2.5 inches. Suppose one Asian adult male is randomly chosen. Let X = height of the individual.

a. X ~_______ (_______,_______)
b. Find the probability that the person is between 65 and 69 inches. Include a sketch of the graph and write a probability statement.
c. Would you expect to meet many Asian adult males over 72 inches? Explain why or why not, and justify your answer numerically.
d. The middle 40% of heights fall between what two values? Sketch the graph and write the probability statement.

Exercise 6.8.2.

IQ is normally distributed with a mean of 100 and a standard deviation of 15. Suppose one individual is randomly chosen. Let X = IQ of an individual.

a. X ~_______ (_______,_______)
b. Find the probability that the person has an IQ greater than 120. Include a sketch of the graph and write a probability statement.
c. Mensa is an organization whose members have the top 2% of all IQs. Find the minimum IQ needed to qualify for the Mensa organization. Sketch the graph and write the probability statement.
d. The middle 50% of IQs fall between what two values? Sketch the graph and write the probability statement.

Exercise 6.8.3. (Go to Solution)

The percent of fat calories that a person in America consumes each day is normally distributed with a mean of about 36 and a standard deviation of 10. Suppose that one individual is randomly chosen. Let X = percent of fat calories.

a. X ~_______ (_______,_______)
b. Find the probability that the percent of fat calories a person consumes is more than 40. Graph the situation. Shade in the area to be determined.
c. Find the maximum number for the lower quarter of percent of fat calories. Sketch the graph and write the probability statement.

Exercise 6.8.4.

Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet.

a. If X = distance in feet for a fly ball, then X ~_______ (_______,_______)
b. If one fly ball is randomly chosen from this distribution, what is the probability that this ball traveled fewer than 220 feet? Sketch the graph. Scale the horizontal axis X. Shade the region corresponding to the probability. Find the probability.
c. Find the 80th percentile of the distribution of fly balls. Sketch the graph and write the probability statement.

Exercise 6.8.5. (Go to Solution)

In China, 4-year-olds average 3 hours a day unsupervised. Most of the unsupervised children live in rural areas, considered safe. Suppose that the standard deviation is 1.5 hours and the amount of time spent alone is normally distributed. We randomly survey one Chinese 4-year-old living in a rural area. We are interested in the amount of time the child spends alone per day. (Source: San Jose Mercury News)

a. In words, define the random variable X . X =
b. X ~
c. Find the probability that the child spends less than 1 hour per day unsupervised. Sketch the graph and write the probability statement.
d. What percent of the children spend over 10 hours per day unsupervised?
e. 70% of the children spend at least how long per day unsupervised?

Exercise 6.8.6.

In the 1992 presidential election, Alaska’s 40 election districts averaged 1956.8 votes per district for President Clinton. The standard deviation was 572.3. (There are only 40 election districts in Alaska.) The distribution of the votes per district for President Clinton was bell-shaped. Let X = number of votes for President Clinton for an election district. (Source: The World Almanac and Book of Facts)

a. State the approximate distribution of X . X ~
b. Is 1956.8 a population mean or a sample mean? How do you know?
c. Find the probability that a randomly selected district had fewer than 1600 votes for President Clinton. Sketch the graph and write the probability statement.
d. Find the probability that a randomly selected district had between 1800 and 2000 votes for President Clinton.
e. Find the third quartile for votes for President Clinton.

Exercise 6.8.7. (Go to Solution)

Suppose that the duration of a particular type of criminal trial is known to be normally distributed with a mean of 21 days and a standard deviation of 7 days.

a. In words, define the random variable X . X =
b. X ~
c. If one of the trials is randomly chosen, find the probability that it lasted at least 24 days. Sketch the graph and write the probability statement.
d. 60% of all of these types of trials are completed within how many days?

Exercise 6.8.8.

Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per 2.5 mile lap (in a 7 lap race) with a standard deviation of 2.28 seconds. The distribution of her race times is normal. We are interested in one of her randomly selected laps. (Source: log book of Terri Vogel)

a. In words, define the random variable X . X =
b. X ~
c. Find the percent of her laps that are completed in less than 130 seconds.
d. The fastest 3% of her laps are under _______ .
e. The middle 80% of her laps are from _______ seconds to _______ seconds.

Exercise 6.8.9. (Go to Solution)

Thuy Dau, Ngoc Bui, Sam Su, and Lan Voung conducted a survey as to how long customers at Lucky claimed to wait in the checkout line until their turn. Let X = time in line. Below are the ordered real data (in minutes):

Table 6.1.
0.50    4.25    5       6       7.25
1.75    4.25    5.25    6       7.25
2       4.25    5.25    6.25    7.25
2.25    4.25    5.5     6.25    7.75
2.25    4.5     5.5     6.5     8
2.5     4.75    5.5     6.5     8.25
2.75    4.75    5.75    6.5     9.5
3.25    4.75    5.75    6.75    9.5
3.75    5       6       6.75    9.75
3.75    5       6       6.75    10.75
a. Calculate the sample mean and the sample standard deviation.
b. Construct a histogram. Start the x-axis at -0.375 and make bar widths of 2 minutes.
c. Draw a smooth curve through the midpoints of the tops of the bars.
d. In words, describe the shape of your histogram and smooth curve.
e. Let the sample mean approximate μ and the sample standard deviation approximate σ . The distribution of X can then be approximated by X ~
f. Use the distribution in (e) to calculate the probability that a person will wait fewer than 6.1 minutes.
g. Determine the cumulative relative frequency for waiting less than 6.1 minutes.
h. Why aren’t the answers to (f) and (g) exactly the same?
i. Why are the answers to (f) and (g) as close as they are?
j. If only 10 customers were surveyed instead of 50, do you think the answers to (f) and (g) would have been closer together or farther apart? Explain your conclusion.

Exercise 6.8.10.

Suppose that Ricardo and Anita attend different colleges. Ricardo’s GPA is the same as the average GPA at his school. Anita’s GPA is 0.70 standard deviations above her school average. In complete sentences, explain why each of the following statements may be false.

a. Ricardo’s actual GPA is lower than Anita’s actual GPA.
b. Ricardo is not passing since his z-score is zero.
c. Anita is in the 70th percentile of students at her college.

Exercise 6.8.11. (Go to Solution)

Below is a sample of the maximum capacity (maximum number of spectators) of sports stadiums. The table does not include horse racing or motor racing stadiums. (Source: http://en.wikipedia.org/wiki/List_of_stadiums_by_capacity)

Table 6.2.
40,000    40,000    45,050    45,500    46,249    48,134
49,133    50,071    50,096    50,466    50,832    51,100
51,500    51,900    52,000    52,132    52,200    52,530
52,692    53,864    54,000    55,000    55,000    55,000
55,000    55,000    55,000    55,082    57,000    58,008
59,680    60,000    60,000    60,492    60,580    62,380
62,872    64,035    65,000    65,050    65,647    66,000
66,161    67,428    68,349    68,976    69,372    70,107
70,585    71,594    72,000    72,922    73,379    74,500
75,025    76,212    78,000    80,000    80,000    82,300
a. Calculate the sample mean and the sample standard deviation for the maximum capacity of sports stadiums (the data).
b. Construct a histogram of the data.
c. Draw a smooth curve through the midpoints of the tops of the bars of the histogram.
d. In words, describe the shape of your histogram and smooth curve.
e. Let the sample mean approximate μ and the sample standard deviation approximate σ . The distribution of X can then be approximated by X ~
f. Use the distribution in (e) to calculate the probability that the maximum capacity of sports stadiums is less than 67,000 spectators.
g. Determine the cumulative relative frequency that the maximum capacity of sports stadiums is less than 67,000 spectators. Hint: Order the data and count the sports stadiums that have a maximum capacity less than 67,000. Divide by the total number of sports stadiums in the sample.
h. Why aren’t the answers to (f) and (g) exactly the same?

Try These Multiple Choice Questions

The questions below refer to the following: The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days.

Exercise 6.8.12. (Go to Solution)

What is the median recovery time?

A. 2.7
B. 5.3
C. 7.4
D. 2.1

Exercise 6.8.13. (Go to Solution)

What is the z-score for a patient who takes 10 days to recover?

A. 1.5
B. 0.2
C. 2.2
D. 7.3

Exercise 6.8.14. (Go to Solution)

What is the probability of spending more than 2 days in recovery?

A. 0.0580
B. 0.8447
C. 0.0553
D. 0.9420

Exercise 6.8.15. (Go to Solution)

The 90th percentile for recovery times is?

A. 8.89
B. 7.07
C. 7.99
D. 4.32

The questions below refer to the following: The length of time to find a parking space at 9 A.M. follows a normal distribution with a mean of 5 minutes and a standard deviation of 2 minutes.

Exercise 6.8.16. (Go to Solution)

Based upon the above information and numerically justified, would you be surprised if it took less than 1 minute to find a parking space?

A. Yes
B. No
C. Unable to determine

Exercise 6.8.17. (Go to Solution)

Find the probability that it takes at least 8 minutes to find a parking space.

A. 0.0001
B. 0.9270
C. 0.1862
D. 0.0668

Exercise 6.8.18. (Go to Solution)

Seventy percent of the time, it takes more than how many minutes to find a parking space?

A. 1.24
B. 2.41
C. 3.95
D. 6.05

Exercise 6.8.19. (Go to Solution)

If the mean is significantly greater than the standard deviation, which of the following statements is true?

I . The data cannot follow the uniform distribution.
II . The data cannot follow the exponential distribution.
III . The data cannot follow the normal distribution.

A. I only
B. II only
C. III only
D. I, II, and III

Solutions to Exercises

Solution to Exercise 6.8.1. (Return to Exercise)

a. N ( 66 , 2.5 )
b. 0.5404
c. No
d. Between 64.7 and 67.3 inches

Solution to Exercise 6.8.3. (Return to Exercise)

a. N ( 36 , 10 )
b. 0.3446
c. 29.3

Solution to Exercise 6.8.5. (Return to Exercise)

a. the time (in hours) a 4-year-old in China spends unsupervised per day
b. N ( 3 , 1.5 )
c. 0.0912
d. 0
e. 2.21 hours

Solution to Exercise 6.8.7. (Return to Exercise)

a. The duration of a criminal trial
b. N ( 21 , 7 )
c. 0.3341
d. 22.77

Solution to Exercise 6.8.9. (Return to Exercise)

a. The sample mean is 5.51 and the sample standard deviation is 2.15
e. N ( 5.51 , 2.15 )
f. 0.6081
g. 0.64
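The values in this solution can be reproduced from Table 6.1. The sketch below is one possible way to do so, assuming Python 3.8 or later and the standard-library statistics module. The variable names are chosen for illustration, and the normal model uses the rounded mean and standard deviation, as in part (e).

```python
from statistics import NormalDist, mean, stdev

# The 50 ordered waiting times (in minutes) from Table 6.1.
wait_times = [
    0.50, 1.75, 2, 2.25, 2.25, 2.5, 2.75, 3.25, 3.75, 3.75,
    4.25, 4.25, 4.25, 4.25, 4.5, 4.75, 4.75, 4.75, 5, 5,
    5, 5.25, 5.25, 5.5, 5.5, 5.5, 5.75, 5.75, 6, 6,
    6, 6, 6.25, 6.25, 6.5, 6.5, 6.5, 6.75, 6.75, 6.75,
    7.25, 7.25, 7.25, 7.75, 8, 8.25, 9.5, 9.5, 9.75, 10.75,
]

x_bar = round(mean(wait_times), 2)   # part (a): sample mean
s = round(stdev(wait_times), 2)      # part (a): sample standard deviation

model = NormalDist(mu=x_bar, sigma=s)   # part (e): approximate distribution
p_model = model.cdf(6.1)                # part (f): theoretical P(X < 6.1)

# Part (g): fraction of observed times below 6.1 minutes.
rel_freq = sum(t < 6.1 for t in wait_times) / len(wait_times)

print(x_bar, s, round(p_model, 4), rel_freq)
```

The theoretical probability comes from a smooth curve fitted to the data, while the relative frequency counts the 50 actual observations, so the two are close but not identical.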

Solution to Exercise 6.8.11. (Return to Exercise)

a. The sample mean is 60,136.4 and the sample standard deviation is 10,468.1.
e. N ( 60136.4 , 10468.1 )
f. 0.7440
g. 0.7167

Solution to Exercise 6.8.12. (Return to Exercise)

 B


Solution to Exercise 6.8.13. (Return to Exercise)

 C


Solution to Exercise 6.8.14. (Return to Exercise)

 D


Solution to Exercise 6.8.15. (Return to Exercise)

 C


Solution to Exercise 6.8.16. (Return to Exercise)

A


Solution to Exercise 6.8.17. (Return to Exercise)

 D


Solution to Exercise 6.8.18. (Return to Exercise)

 C


Solution to Exercise 6.8.19. (Return to Exercise)

B


6.9. Review*

The next two questions refer to: X ~ U ( 3 , 13 )

Exercise 6.9.1. (Go to Solution)

Explain which of the following are false and which are true.

a: f ( x ) = 1/10 , 3 ≤ x ≤ 13
b: There is no mode.
c: The median is less than the mean.
d: P ( X > 10 ) = P ( X ≤ 6 )

Exercise 6.9.2. (Go to Solution)

 Calculate:

a: Mean
b: Median
c: 65th percentile.
Horizontal boxplot with first whisker at 0 to 2, box from 2 to 5, line at 4, and second whisker from 5 to 7.

Exercise 6.9.3. (Go to Solution)

Which of the following is true for the above box plot?

a: 25% of the data are at most 5.
b: There is about the same amount of data from 4 – 5 as there is from 5 – 7.
c: There are no data values of 3.
d: 50% of the data are 4.

Exercise 6.9.4. (Go to Solution)

If P(G|H) = P(G), then which of the following is correct?

A: G and H are mutually exclusive events.
B: P ( G ) = P ( H )
C: Knowing that H has occurred will affect the chance that G will happen.
D: G and H are independent events.

Exercise 6.9.5. (Go to Solution)

If P(J) = 0.3, P(K) = 0.6, and J and K are independent events, then explain which are correct and which are incorrect.

A: P( J and K) = 0
B: P( J or K) = 0.9
C: P( J or K) = 0.72
D: P ( J ) ≠ P ( J|K )

Exercise 6.9.6. (Go to Solution)

On average, 5 students from each high school class get full scholarships to 4-year colleges. Assume that most high school classes have about 500 students.

X = the number of students from a high school class that get full scholarships to 4-year school. Which of the following is the distribution of X ?

A. P(5)
B. B(500,5)
C. Exp(1/5)
D. N(5, (0.01)(0.99)/500)

Solutions to Exercises

Solution to Exercise 6.9.1. (Return to Exercise)

a: True
b: True
c: False – the median and the mean are the same for this symmetric distribution
d: True

Solution to Exercise 6.9.2. (Return to Exercise)

a: 8
b: 8
c: ( k – 3 ) ( 1/10 ) = 0.65 , so k = 9.5

Solution to Exercise 6.9.3. (Return to Exercise)

a: False – 3/4 of the data are at most 5
b: True – each quartile has 25% of the data
c: False – that is unknown
d: False – 50% of the data are 4 or less

Solution to Exercise 6.9.4. (Return to Exercise)

D


Solution to Exercise 6.9.5. (Return to Exercise)

A: False - J and K are independent so they are not mutually exclusive which would imply dependency (meaning P(J and K) is not 0).
B: False - see answer C.
C: True - P(J or K) = P(J) + P(K) - P(J and K) = P(J) + P(K) - P(J)P(K) = 0.3 + 0.6 - (0.3)(0.6) = 0.72. Note that P(J and K) = P(J)P(K) because J and K are independent.
D: False - J and K are independent so P(J) = P(J|K).

Solution to Exercise 6.9.6. (Return to Exercise)

A


6.10. Lab 1: Normal Distribution (Lap Times)*

Class Time:

Names:

Student Learning Outcome:

  • The student will compare and contrast empirical data and a theoretical distribution to determine if Terri Vogel’s lap times fit a continuous distribution.

Directions:

Round the relative frequencies and probabilities to 4 decimal places. Carry all other decimal answers to 2 places.

Collect the Data

  1. Use the data from Terri Vogel’s Log Book. Use a Stratified Sampling Method by Lap (Races 1 – 20) and a random number generator to pick 6 lap times from each stratum. Record the lap times below for Laps 2 – 7.

    Table 6.3.
    __________________________________________
    __________________________________________
    __________________________________________
    __________________________________________
    __________________________________________
    __________________________________________

  2. Construct a histogram. Make 5 - 6 intervals. Sketch the graph using a ruler and pencil. Scale the axes.

    Figure 6.4. 

    Blank graph with relative frequency on the vertical axis and lap time on the horizontal axis.


  3. Calculate the following.

    a. x̄ =
    b. s =

  4. Draw a smooth curve through the tops of the bars of the histogram. Use 1 – 2 complete sentences to describe the general shape of the curve. (Keep it simple. Does the graph go straight across, does it have a V-shape, does it have a hump in the middle or at either end, etc.?)
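
If you would rather check your hand calculations for steps 1 to 3 with software, the following Python sketch shows one possible approach. The lap times in it are made up (generated at random), because the log book is not reproduced here; the names `log_book`, `sample`, and `rel_freq` are illustrative. Replace the made-up values with the real log book entries before drawing any conclusions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL log book: 20 made-up lap times (seconds) for each of Laps 2-7.
# Replace these with the real entries from the log book.
log_book = {
    lap: [round(random.gauss(129, 2.5)) for _race in range(20)]
    for lap in range(2, 8)
}

# Stratified sample: each lap is a stratum; pick 6 races at random from each.
sample = []
for lap, times in log_book.items():
    sample.extend(random.sample(times, 6))

x_bar = statistics.mean(sample)  # sample mean
s = statistics.stdev(sample)     # sample standard deviation (n - 1 divisor)

# Relative-frequency histogram with 6 equal-width intervals.
num_bins = 6
low, high = min(sample), max(sample)
width = (high - low) / num_bins
counts = [0] * num_bins
for t in sample:
    index = min(int((t - low) / width), num_bins - 1)  # largest value goes in the last bin
    counts[index] += 1
rel_freq = [round(c / len(sample), 4) for c in counts]

print(len(sample), round(x_bar, 2), round(s, 2))
print(rel_freq)
```

The 6 strata with 6 picks each give 36 lap times in total. The relative frequencies are rounded to 4 decimal places and the other values to 2, following the lab directions.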

Analyze the Distribution

Using your sample mean, sample standard deviation, and histogram to help, what was the approximate theoretical distribution of the data?

  • X  ~

  • How does the histogram help you arrive at the approximate distribution?

Describe the Data

Use the Data from the section titled “Collect the Data” to complete the following statements.

  • The IQR goes from __________ to __________.

  • IQR = __________. (IQR=Q3-Q1)

  • The 15th percentile is:

  • The 85th percentile is:

  • The median is:

  • The empirical probability that a randomly chosen lap time is more than 130 seconds =

  • Explain the meaning of the 85th percentile of this data.
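
The empirical quantities above can also be computed with Python's `statistics` module, as in the sketch below. The lap times are hypothetical placeholders. Software packages and textbooks use slightly different percentile rules, so the results may differ a little from values found by hand.

```python
import statistics

# HYPOTHETICAL lap times in seconds; substitute your own sampled values.
lap_times = [124, 125, 126, 126, 127, 127, 128, 128, 128, 129,
             129, 129, 129, 130, 130, 130, 131, 131, 132, 133]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, median, q3 = statistics.quantiles(lap_times, n=4)
iqr = q3 - q1

# quantiles(..., n=100) returns 99 cut points; index 14 is the 15th percentile
# and index 84 is the 85th percentile.
percentiles = statistics.quantiles(lap_times, n=100)
p15 = percentiles[14]
p85 = percentiles[84]

# Empirical probability: fraction of observed lap times above 130 seconds.
p_over_130 = sum(1 for t in lap_times if t > 130) / len(lap_times)

print(q1, q3, iqr)
print(p15, p85, median)
print(p_over_130)  # 0.2 for the placeholder data (4 of the 20 values exceed 130)
```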

Theoretical Distribution

Using the theoretical distribution from the section titled “Analyze the Distribution”, complete the following statements:

  • The IQR goes from __________ to __________.

  • IQR =

  • The 15th percentile is:

  • The 85th percentile is:

  • The median is:

  • The probability that a randomly chosen lap time is more than 130 seconds =

  • Explain the meaning of the 85th percentile of this distribution.
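
For the theoretical values, a normal model X ~ N(x̄, s) can be evaluated with `statistics.NormalDist` from the Python standard library. The mean and standard deviation below are hypothetical placeholders, not results from the log book; use the values you calculated from your own sample.

```python
from statistics import NormalDist

# HYPOTHETICAL values; use the x-bar and s you computed from your own sample.
x_bar = 129.0
s = 2.5

model = NormalDist(mu=x_bar, sigma=s)

# inv_cdf(p) returns the value with area p to its left (the p-th quantile).
q1 = model.inv_cdf(0.25)
q3 = model.inv_cdf(0.75)
iqr = q3 - q1
p15 = model.inv_cdf(0.15)
p85 = model.inv_cdf(0.85)
median = model.inv_cdf(0.50)     # equals the mean for a normal distribution

# cdf(x) is the area to the left of x, so the area to the right is 1 - cdf(x).
p_over_130 = 1 - model.cdf(130)

print(round(q1, 2), round(q3, 2), round(iqr, 2))
print(round(p15, 2), round(p85, 2), round(median, 2))
print(round(p_over_130, 4))
```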

Discussion Questions

  • Do the data from the section titled “Collect the Data” give a close approximation to the theoretical distribution in the section titled “Analyze the Distribution”? In complete sentences and comparing the results in the sections titled “Describe the Data” and “Theoretical Distribution”, explain why or why not.

6.11. Lab 2: Normal Distribution (Pinkie Length)*

Class Time:

Names:

Student Learning Outcomes:

  • The student will compare empirical data and a theoretical distribution to determine if an everyday experiment fits a continuous distribution.

Collect the Data

Measure the length of your pinkie finger (in cm).

  1. Randomly survey 30 adults. Round to the nearest 0.5 cm.

    Table 6.4.
    ___________________________________
    ___________________________________
    ___________________________________
    ___________________________________
    ___________________________________
    ___________________________________

  2. Construct a histogram. Make 5-6 intervals. Sketch the graph using a ruler and pencil. Scale the axes.

    Blank graph with frequency on the vertical axis and length of finger on the horizontal axis.

  3. Calculate the following.

    a. x̄ =
    b. s =

  4. Draw a smooth curve through the top of the bars of the histogram. Use 1-2 complete sentences to describe the general shape of the curve. (Keep it simple. Does the graph go straight across, does it have a V-shape, does it have a hump in the middle or at either end, etc.?)
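
The rounding rule in step 1 (nearest 0.5 cm) and the tally needed for the histogram can be sketched in Python as follows. The raw measurements are hypothetical, and `round_to_half` is a helper name introduced only for this illustration. Ties such as 6.25 are rounded up here; that is a choice made for this sketch, not a rule stated in the lab.

```python
from collections import Counter
from math import floor

def round_to_half(length_cm):
    """Round a length to the nearest 0.5 cm, with ties rounded up."""
    return floor(length_cm * 2 + 0.5) / 2

# HYPOTHETICAL raw measurements in cm; replace with your survey results.
raw = [5.7, 6.3, 6.25, 6.9, 5.2]

rounded = [round_to_half(x) for x in raw]
print(rounded)  # [5.5, 6.5, 6.5, 7.0, 5.0]

# Frequency of each rounded value, in increasing order of length.
frequency = Counter(rounded)
for value in sorted(frequency):
    print(value, frequency[value])
```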

Analyze the Distribution

Using your sample mean, sample standard deviation, and histogram to help, what was the approximate theoretical distribution of the data from the section titled “Collect the Data”?

  • X  ~

  • How does the histogram help you arrive at the approximate distribution?

Describe the Data

Using the data in the section titled “Collect the Data” complete the following statements. (Hint: order the data)

Remember

(IQR = Q3 – Q1)
  • IQR =

  • 15th percentile is:

  • 85th percentile is:

  • Median is:

  • What is the empirical probability that a randomly chosen pinkie length is more than 6.5 cm?

  • Explain the meaning of the 85th percentile of this data.

Theoretical Distribution

Using the theoretical distribution in the section titled “Analyze the Distribution”, complete the following statements:

  • IQR =

  • 15th percentile is:

  • 85th percentile is:

  • Median is:

  • What is the theoretical probability that a randomly chosen pinkie length is more than 6.5 cm?

  • Explain the meaning of the 85th percentile of this distribution.

Discussion Questions

  • Do the data from the section titled “Collect the Data” give a close approximation to the theoretical distribution in the section titled “Analyze the Distribution”? In complete sentences and comparing the results in the sections titled “Describe the Data” and “Theoretical Distribution”, explain why or why not.
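
As an aid to this comparison, the sketch below puts one empirical result next to its theoretical counterpart. The pinkie lengths are hypothetical placeholders, already rounded to the nearest 0.5 cm, and the fitted normal model uses the sample mean and sample standard deviation as its parameters.

```python
from statistics import NormalDist, mean, stdev

# HYPOTHETICAL pinkie lengths (cm), already rounded to the nearest 0.5 cm.
lengths = [5.0, 5.5, 5.5, 6.0, 6.0, 6.0, 6.0, 6.5, 6.5, 6.5,
           6.5, 6.5, 7.0, 7.0, 7.0, 7.0, 7.5, 7.5, 8.0, 8.0]

# Fit the normal model X ~ N(x-bar, s) to the sample.
model = NormalDist(mu=mean(lengths), sigma=stdev(lengths))

# Empirical: share of the sample strictly greater than 6.5 cm.
empirical = sum(1 for x in lengths if x > 6.5) / len(lengths)

# Theoretical: area under the fitted normal curve to the right of 6.5 cm.
theoretical = 1 - model.cdf(6.5)

print(round(empirical, 4), round(theoretical, 4))
```

With rounded data, values recorded as exactly 6.5 cm are not counted as "more than 6.5 cm" in the sample, while the continuous model assigns zero probability to any single value. This alone can make the two results differ, so it is worth mentioning when you explain how close the approximation is.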