Chapter 5. Continuous Random Variables

5.1. Continuous Random Variables*

Student Learning Objectives

By the end of this chapter, the student should be able to:

  • Recognize and understand continuous probability density functions in general.

  • Recognize the uniform probability distribution and apply it appropriately.

  • Recognize the exponential probability distribution and apply it appropriately.

Introduction

Continuous random variables have many applications. Baseball batting averages, IQ scores, the length of time a long distance telephone call lasts, the amount of money a person carries, the length of time a computer chip lasts, and SAT scores are just a few. The field of reliability depends on a variety of continuous random variables.

This chapter gives an introduction to continuous random variables and the many continuous distributions. We will be studying these continuous distributions for several chapters.

NOTE

The values of discrete and continuous random variables can be ambiguous. For example, if X is equal to the number of miles (to the nearest mile) you drive to work, then X is a discrete random variable. You count the miles. If X is the distance you drive to work, then you measure values of X and X is a continuous random variable. How the random variable is defined is very important.

Properties of Continuous Probability Distributions

The graph of a continuous probability distribution is a curve. Probability is represented by area under the curve.

The curve is called the probability density function (abbreviated: pdf). We use the symbol f(x) to represent the curve. f(x) is the function that corresponds to the graph; we use the density function f(x) to draw the graph of the probability distribution.

Area under the curve is given by a different function called the cumulative distribution function (abbreviated: cdf). The cumulative distribution function is used to evaluate probability as area.

  • The outcomes are measured, not counted.

  • The entire area under the curve and above the x-axis is equal to 1.

  • Probability is found for intervals of X values rather than for individual X values.

  • P(c < X < d) is the probability that the random variable X is in the interval between the values c and d. P(c < X < d) is the area under the curve, above the x-axis, to the right of c and the left of d.

  • P(X = c) = 0. The probability that X takes on any single individual value is 0. The area below the curve, above the x-axis, and between X = c and X = c has no width, and therefore no area (area = 0). Since the probability is equal to the area, the probability is also 0.

We will find the area that represents probability by using geometry, formulas, technology, or probability tables. In general, calculus is needed to find the area under the curve for many probability density functions. When we use formulas to find the area in this textbook, the formulas were found by using the techniques of integral calculus. However, because most students taking this course have not studied calculus, we will not be using calculus in this textbook.

There are many continuous probability distributions. When using a continuous probability distribution to model probability, the distribution used is selected to best model and fit the particular situation.

In this chapter and the next chapter, we will study the uniform distribution, the exponential distribution, and the normal distribution. The following graphs illustrate these distributions.

Figure 5.1. 

Figure (graphics1.png)
The graph shows a Uniform Distribution with the area between X=3 and X=6 shaded to represent the probability that the value of the random variable X is in the interval between 3 and 6.


Figure 5.2. 

Figure (graphics2.png)
The graph shows an Exponential Distribution with the area between X=2 and X=4 shaded to represent the probability that the value of the random variable X is in the interval between 2 and 4.


Figure 5.3. 

Figure (graphics3.png)
The graph shows the Standard Normal Distribution with the area between X=1 and X=2 shaded to represent the probability that the value of the random variable X is in the interval between 1 and 2.


**With contributions from Roberta Bloom

Glossary

Uniform Distribution

A continuous random variable (RV) that has equally likely outcomes over the domain, a < x < b. Often referred to as the Rectangular distribution because the graph of the pdf has the form of a rectangle. Notation: X ~ U(a, b). The mean is μ = (a + b)/2 and the standard deviation is σ = √((b − a)²/12). The probability density function is f(X) = 1/(b − a) for a < X < b or a ≤ X ≤ b. The cumulative distribution is P(X ≤ x) = (x − a)/(b − a).

Exponential Distribution

A continuous random variable (RV) that appears when we are interested in the intervals of time between some random events, for example, the length of time between emergency arrivals at a hospital. Notation: X ~ Exp(m). The mean is μ = 1/m and the standard deviation is σ = 1/m. The probability density function is f(x) = m⋅e^(−m⋅x), x ≥ 0, and the cumulative distribution function is P(X ≤ x) = 1 − e^(−m⋅x).

5.2. Continuous Probability Functions*

We begin by defining a continuous probability density function. We use the function notation f(X). Intermediate algebra may have been your first formal introduction to functions. In the study of probability, the functions we study are special. We define the function f(X) so that the area between it and the x-axis is equal to a probability. Since the maximum probability is one, the maximum area is also one.

For continuous probability distributions, PROBABILITY = AREA.

Example 5.1. 

Consider the function f(X) = 1/20 for 0 ≤ X ≤ 20. X = a real number. The graph of f(X) = 1/20 is a horizontal line. However, since 0 ≤ X ≤ 20, f(X) is restricted to the portion between X = 0 and X = 20, inclusive.

f(X)=1/20 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/20 on the y-axis, a vertical upward line from point 20 on the x-axis, and the x and y-axes.

f(X) = 1/20 for 0 ≤ X ≤ 20.

The graph of f(X) = 1/20 is a horizontal line segment when 0 ≤ X ≤ 20.

The area between f(X) = 1/20, where 0 ≤ X ≤ 20, and the x-axis is the area of a rectangle with base = 20 and height = 1/20.

AREA = 20 ⋅ 1/20 = 1

This particular function, where we have restricted X so that the area between the function and the x-axis is 1, is an example of a continuous probability density function. It is used as a tool to calculate probabilities.
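As a rough check that does not rely on calculus, the total area under a density can be approximated by adding up many thin rectangles. The sketch below does this for the density of this example; the function names `area_under` and `f` and the number of steps are illustrative choices, not part of the text.

```python
def area_under(pdf, lo, hi, steps=10_000):
    """Approximate the area under pdf between lo and hi with a midpoint sum."""
    width = (hi - lo) / steps
    # Each thin rectangle has base `width` and the curve's height at its midpoint.
    return sum(pdf(lo + (i + 0.5) * width) for i in range(steps)) * width


def f(x):
    """Density from this example: height 1/20 on [0, 20], zero elsewhere."""
    return 1 / 20 if 0 <= x <= 20 else 0.0


total_area = area_under(f, 0, 20)
print(round(total_area, 6))  # 1.0
```

For a rectangle the geometric formula (base)(height) gives the same answer directly; the summation approach only becomes useful for curved densities.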

Suppose we want to find the area between f(X) = 1/20 and the x-axis where 0 < X < 2.

f(X)=1/20 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/20 on the y-axis, a vertical upward line from point 20 on the x-axis, and the x and y-axes. A shaded region ranging from points 0-2 on the x-axis occurs within this area.

(2 – 0) = 2 = base of a rectangle

1/20 = the height.

AREA = 2 ⋅ 1/20 = 0.1

The area corresponds to a probability. The probability that X is between 0 and 2 is 0.1, which can be written mathematically as P(0<X<2) = P(X<2) = 0.1.

Suppose we want to find the area between f(X) = 1/20 and the x-axis where 4 < X < 15.

f(X)=1/20 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/20 on the y-axis, a vertical upward line from point 20 on the x-axis, and the x and y-axes. A shaded region ranging from points 4-15 on the x-axis occurs within this area.

(15 – 4) = 11 = the base of a rectangle

1/20 = the height.

AREA = 11 ⋅ 1/20 = 0.55

The area corresponds to the probability P(4 < X < 15) = 0.55.
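The two rectangle calculations above can be sketched in a few lines of Python. The helper name `uniform_area` and its default endpoints (0 and 20, taken from this example) are assumptions made for illustration.

```python
def uniform_area(c, d, a=0.0, b=20.0):
    """P(c < X < d) for a density of constant height 1/(b - a) on [a, b]."""
    # Clamp the interval so that nothing outside [a, b] contributes area.
    c, d = max(c, a), min(d, b)
    base = max(0.0, d - c)
    height = 1 / (b - a)
    return base * height


p_0_2 = uniform_area(0, 2)
p_4_15 = uniform_area(4, 15)
p_exactly_15 = uniform_area(15, 15)  # zero base, so zero area
print(round(p_0_2, 4), round(p_4_15, 4), p_exactly_15)  # 0.1 0.55 0.0
```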

Suppose we want to find P(X = 15). On an x-y graph, X = 15 is a vertical line. A vertical line has no width (or 0 width). Therefore, P(X = 15) = (base)(height) = (0)(1/20) = 0.

f(X)=1/20 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/20 on the y-axis, a vertical upward line from point 20 on the x-axis, and the x and y-axes. A vertical upward line is drawn from point 15 on the x-axis to the horizontal line occurring from point 1/20 on the y-axis.

P(X ≤ x) (can be written as P(X < x) for continuous distributions) is called the cumulative distribution function or CDF. Notice the “less than or equal to” symbol. We can use the CDF to calculate P(X > x). The CDF gives “area to the left” and P(X > x) gives “area to the right.” We calculate P(X > x) for continuous distributions as follows: P(X > x) = 1 – P(X < x).

f(X) graph displaying a boxed region consisting of a horizontal line extending to the right from midway on the y-axis, a vertical upward line from an arbitrary point on the x-axis, and the x and y-axes. A shaded region from points 0-x occurs within this area.

Label the graph with f(X) and X. Scale the x and y axes with the maximum x and y values. f(X) = 1/20, 0 ≤ X ≤ 20.

f(X) graph displaying a boxed region consisting of a horizontal line extending to the right from midway on the y-axis, a vertical upward line from an arbitrary point on the x-axis, and the x and y-axes. A shaded region from points 2.3-12.7 occurs within this area.


5.3. The Uniform Distribution*

Example 5.2. 

The previous problem is an example of the uniform probability distribution.

Illustrate the uniform distribution. The data that follows are 55 smiling times, in seconds, of an eight-week old baby.

Table 5.1.
10.4  19.6  18.8  13.9  17.8  16.8  21.6  17.9  12.5  11.1   4.9
12.8  14.8  22.8  20.0  15.9  16.3  13.4  17.1  14.5  19.0  22.8
 1.3   0.7   8.9  11.9  10.9   7.3   5.9   3.7  17.9  19.2   9.8
 5.8   6.9   2.6   5.8  21.7  11.8   3.4   2.1   4.5   6.3  10.7
 8.9   9.4   9.4   7.6  10.0   3.3   6.7   7.8  11.6  13.8  18.6

sample mean = 11.49 and sample standard deviation = 6.23

We will assume that the smiling times, in seconds, follow a uniform distribution between 0 and 23 seconds, inclusive. This means that any smiling time from 0 to and including 23 seconds is equally likely. The histogram that could be constructed from the sample is an empirical distribution that closely matches the theoretical uniform distribution.

Let X = length, in seconds, of an eight-week old baby’s smile.

The notation for the uniform distribution is

X ~ U(a, b) where a = the lowest value of X and b = the highest value of X .

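The probability density function is f(X) = 1/(b − a) for a ≤ X ≤ b.

For this example, X ~ U(0, 23) and f(X) = 1/(23 − 0) = 1/23 for 0 ≤ X ≤ 23.

Formulas for the theoretical mean and standard deviation are

μ = (a + b)/2 and σ = √((b − a)²/12)

For this problem, the theoretical mean and standard deviation are

μ = (0 + 23)/2 = 11.50 seconds and σ = √((23 − 0)²/12) = 6.64 seconds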

Notice that the theoretical mean and standard deviation are close to the sample mean and standard deviation.
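The theoretical values for X ~ U(0, 23) can be reproduced with a short sketch. It uses the standard formulas for a uniform distribution, μ = (a + b)/2 and σ = √((b − a)²/12); the function names are hypothetical.

```python
import math


def uniform_mean(a, b):
    """Theoretical mean of X ~ U(a, b)."""
    return (a + b) / 2


def uniform_sd(a, b):
    """Theoretical standard deviation of X ~ U(a, b)."""
    return math.sqrt((b - a) ** 2 / 12)


mu = uniform_mean(0, 23)
sigma = uniform_sd(0, 23)
print(mu, round(sigma, 2))  # 11.5 6.64
```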

Example 5.3. 

Problem 1.

What is the probability that a randomly chosen eight-week old baby smiles between 2 and 18 seconds?

Solution

Find P(2 < X < 18).

P(2 < X < 18) = (base)(height) = (18 − 2) ⋅ 1/23 = 16/23.

f(X) graph displaying a boxed region consisting of a horizontal line extending to the right from midway on the y-axis, a vertical upward line from point 23 on the x-axis, and the x and y-axes. A shaded region from points 2-18 occurs within this area.



Problem 2.

Find the 90th percentile for an eight week old baby’s smiling time.

Solution

Ninety percent of the smiling times fall below the 90th percentile, k , so P(X < k) = 0.90

P(X < k) = 0.90

(base)(height) = 0.90

(k − 0) ⋅ 1/23 = 0.90

k = 23⋅0.90 = 20.7

f(X)=1/23 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/23 on the y-axis, a vertical upward line from point 23 on the x-axis, and the x and y-axes. A shaded region from points 0-k occurs within this area. The shaded region probability area is equal to 0.90.
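Solving (base)(height) = p for the right-hand edge of the rectangle gives k = a + p(b − a). A minimal sketch of that step, assuming a hypothetical helper named `uniform_percentile`:

```python
def uniform_percentile(p, a, b):
    """Return k such that P(X < k) = p for X ~ U(a, b)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    # (k - a) * 1/(b - a) = p, solved for k
    return a + p * (b - a)


k90 = uniform_percentile(0.90, 0, 23)
print(round(k90, 4))  # 20.7
```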



Problem 3.

Find the probability that a random eight week old baby smiles more than 12 seconds KNOWING that the baby smiles MORE THAN 8 SECONDS.

Solution

Find P(X > 12|X > 8). There are two ways to do the problem. For the first way, use the fact that this is a conditional and changes the sample space. The graph illustrates the new sample space. You already know the baby smiled more than 8 seconds.

Write a new f(X):

f(X) = 1/(23 − 8) = 1/15 for 8 < X < 23

P(X > 12|X > 8) = (23 − 12) ⋅ 1/15 = 11/15

f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from points 8 and 23 on the x-axis, and the x-axis. A shaded region from points 12-23 occurs within this area.

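For the second way, use the conditional formula from Probability Topics with the original distribution X ~ U(0,23):

P(A|B) = P(A AND B) / P(B)

For this problem, A is (X > 12) and B is (X > 8).

So, P(X > 12|X > 8) = P(X > 12 AND X > 8) / P(X > 8) = P(X > 12) / P(X > 8) = (11/23) / (15/23) = 0.733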

f(X)=1/23 graph displaying a conditional boxed region consisting of a horizontal red line extending to the right from point 1/23 on the y-axis, a vertical red upward line from point 23 on the x-axis, and the x and y-axes. Two vertical upward lines from points 8 and 12 on the x-axis occur within this area.
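Both approaches to P(X > 12|X > 8) can be compared with exact fractions. This is only a sketch for X ~ U(0, 23); the variable and function names are illustrative.

```python
from fractions import Fraction

a, b = 0, 23  # endpoints of the original distribution X ~ U(0, 23)

# First way: shrink the sample space to 8 < X < 23 and use the new height.
new_height = Fraction(1, b - 8)
first_way = (b - 12) * new_height


# Second way: P(A AND B) / P(B) with the original distribution.
def p_greater(x):
    """P(X > x) for X ~ U(a, b), valid for a <= x <= b."""
    return Fraction(b - x, b - a)


# Every X above 12 is also above 8, so P(X > 12 AND X > 8) = P(X > 12).
second_way = p_greater(12) / p_greater(8)

print(first_way, second_way, round(float(first_way), 3))  # 11/15 11/15 0.733
```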





Example 5.4. 

Uniform: The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between 0 and 15 minutes, inclusive.

Problem 1.

What is the probability that a person waits fewer than 12.5 minutes?

Solution

Let X = the number of minutes a person must wait for a bus. a = 0 and b = 15. X ~ U(0,15). Write the probability density function. f(X) = 1/(15 − 0) = 1/15 for 0 ≤ X ≤ 15.

Find P(X < 12.5). Draw a graph.

P(X < 12.5) = (base)(height) = (12.5 − 0) ⋅ 1/15 = 0.8333

The probability a person waits less than 12.5 minutes is 0.8333.

f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from point 15 on the x-axis, and the x and y-axes. A shaded region from points 0-12.5 occurs within this area.



Problem 2.

On the average, how long must a person wait?

Find the mean, μ , and the standard deviation, σ .

Solution

μ = (a + b)/2 = (15 + 0)/2 = 7.5. On the average, a person must wait 7.5 minutes.

σ = √((b − a)²/12) = √((15 − 0)²/12) = 4.3. The standard deviation is 4.3 minutes.



Problem 3.

Ninety percent of the time, the time a person must wait falls below what value?

Note

This asks for the 90th percentile.

Solution

Find the 90th percentile. Draw a graph. Let k = the 90th percentile.

k = (0.90)(15) = 13.5

k is sometimes called a critical value.

The 90th percentile is 13.5 minutes. Ninety percent of the time, a person must wait at most 13.5 minutes.

f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from an arbitrary point on the x-axis, and the x and y-axes. A shaded region from points 0-k occurs within this area. The area of this probability region is equal to 0.90.




Example 5.5. 

Uniform: The average number of donuts a nine-year old child eats per month is uniformly distributed from 0.5 to 4 donuts, inclusive. Let X = the average number of donuts a nine-year old child eats per month. Then X ~ U(0.5, 4).

Problem 1. (Go to Solution)

The probability that a randomly selected nine-year old child eats an average of more than two donuts is _______.


Problem 2. (Go to Solution)

Find the probability that a different nine-year old child eats an average of more than two donuts given that his or her amount is more than 1.5 donuts.

The second probability question has a conditional (refer to “Probability Topics“). You are asked to find the probability that a nine-year old eats an average of more than two donuts given that his/her amount is more than 1.5 donuts. Solve the problem two different ways (see the first example). You must reduce the sample space. First way: Since you already know the child eats more than 1.5 donuts, you are no longer starting at a = 0.5 donut. Your starting point is 1.5 donuts.

Write a new f(X):

f(X) = 1/(4 − 1.5) = 2/5 for 1.5 ≤ X ≤ 4.

Find P(X > 2|X > 1.5) . Draw a graph.

f(X)=2/5 graph displaying a boxed region consisting of a horizontal line extending to the right from point 2/5 on the y-axis, a vertical upward line from points 1.5 and 4 on the x-axis, and the x-axis. A shaded region from points 2-4 occurs within this area.

P(X > 2|X > 1.5) = (base)(new height) = (4 – 2) (2 / 5) = ?


The probability that a nine-year old child eats an average of more than 2 donuts when he/she has already eaten more than 1.5 donuts is 4/5.

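Second way: Draw the original graph for X ~ U(0.5, 4). Use the conditional formula

P(X > 2|X > 1.5) = P(X > 2 AND X > 1.5) / P(X > 1.5) = P(X > 2) / P(X > 1.5) = (2/3.5) / (2.5/3.5) = 0.8 = 4/5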

Example 5.6. 

Uniform: Ace Heating and Air Conditioning Service finds that the amount of time a repairman needs to fix a furnace is uniformly distributed between 1.5 and 4 hours. Let X = the time needed to fix a furnace. Then X ~ U(1.5, 4).

  1. Find the probability that a randomly selected furnace repair requires more than 2 hours.

  2. Find the probability that a randomly selected furnace repair requires less than 3 hours.

  3. Find the 30th percentile of furnace repair times.

  4. The longest 25% of furnace repairs take at least how long? (In other words: Find the minimum time for the longest 25% of repair times.) What percentile does this represent?

  5. Find the mean and standard deviation.

Problem 1.

Find the probability that a randomly selected furnace repair requires longer than 2 hours.

Solution

To find f(X): f(X) = 1/(4 − 1.5) = 1/2.5 so f(X) = 0.4

P(X>2) = (base)(height) = (4 − 2)(0.4) = 0.8

Figure 5.4. Example 4 Figure 1

Example 4 Figure 1 (UniformEx4Graph1.JPG)
Uniform Distribution between 1.5 and 4 with shaded area between 2 and 4 representing the probability that the repair time X is greater than 2




Problem 2.

Find the probability that a randomly selected furnace repair requires less than 3 hours. Describe how the graph differs from the graph in the first part of this example.

Solution

P(X < 3) = (base)(height) = (3 − 1.5)(0.4) = 0.6

The graph of the rectangle showing the entire distribution would remain the same. However the graph should be shaded between X=1.5 and X=3. Note that the shaded area starts at X=1.5 rather than at X=0; since X~U(1.5,4), X can not be less than 1.5.

Figure 5.5. Example 4 Figure 2

Example 4 Figure 2 (UniformEx4Graph2.JPG)
Uniform Distribution between 1.5 and 4 with shaded area between 1.5 and 3 representing the probability that the repair time X is less than 3




Problem 3.

Find the 30th percentile of furnace repair times.

Solution

Figure 5.6. Example 4 Figure 3

Example 4 Figure 3 (UniformEx4Graph3.JPG)
Uniform Distribution between 1.5 and 4 with an area of 0.30 shaded to the left, representing the shortest 30% of repair times.


P(X < k) = 0.30

P(X < k) = (base)(height) = (k – 1.5) ⋅(0.4)

0.3 = (k − 1.5) (0.4) ; Solve to find k:
0.75 = k − 1.5 , obtained by dividing both sides by 0.4
k = 2.25 , obtained by adding 1.5 to both sides

The 30th percentile of repair times is 2.25 hours. 30% of repair times are 2.25 hours or less.



Problem 4.

The longest 25% of furnace repair times take at least how long? (Find the minimum time for the longest 25% of repairs.)

Solution

Figure 5.7. Example 4 Figure 4

Example 4 Figure 4 (UniformEx4Graph4.JPG)
Uniform Distribution between 1.5 and 4 with an area of 0.25 shaded to the right representing the longest 25% of repair times.


P(X > k) = 0.25

P(X > k) = (base)(height) = (4 – k) ⋅(0.4)

0.25 = (4 − k)(0.4) ; Solve for k:
0.625 = 4 − k , obtained by dividing both sides by 0.4
−3.375 = −k , obtained by subtracting 4 from both sides
k=3.375

The longest 25% of furnace repairs take at least 3.375 hours (3.375 hours or longer).

Note: Since 25% of repair times are 3.375 hours or longer, that means that 75% of repair times are 3.375 hours or less. 3.375 hours is the 75th percentile of furnace repair times.



Problem 5.

Find the mean and standard deviation.

Solution

μ = (a + b)/2 and σ = √((b − a)²/12)

μ = (1.5 + 4)/2 = 2.75 hours and σ = √((4 − 1.5)²/12) = 0.7217 hours
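All five parts of the furnace repair example use the same rectangle, so they can be gathered into one small helper. The `Uniform` class below is a hypothetical sketch written for this illustration, not something defined by the textbook.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Uniform:
    """Hypothetical helper for X ~ U(a, b)."""
    a: float
    b: float

    @property
    def height(self) -> float:
        # f(X) = 1 / (b - a) on the interval from a to b
        return 1 / (self.b - self.a)

    def cdf(self, x: float) -> float:
        # Area to the left of x; x is clamped so nothing outside [a, b] counts.
        x = min(max(x, self.a), self.b)
        return (x - self.a) * self.height

    def percentile(self, p: float) -> float:
        # Solve (k - a) * height = p for k.
        return self.a + p * (self.b - self.a)

    def mean(self) -> float:
        return (self.a + self.b) / 2

    def sd(self) -> float:
        return math.sqrt((self.b - self.a) ** 2 / 12)


repair = Uniform(1.5, 4)
p_more_than_2 = 1 - repair.cdf(2)   # area to the right of 2
p_less_than_3 = repair.cdf(3)       # area to the left of 3
k30 = repair.percentile(0.30)       # 30th percentile
k75 = repair.percentile(0.75)       # cutoff for the longest 25% of repairs

print(round(p_more_than_2, 4), round(p_less_than_3, 4))  # 0.8 0.6
print(round(k30, 4), round(k75, 4))                       # 2.25 3.375
print(repair.mean(), round(repair.sd(), 4))               # 2.75 0.7217
```

The longest 25% of repairs start at the 75th percentile, which is why `percentile(0.75)` is used for that part.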





**Example 5 contributed by Roberta Bloom

Solutions to Exercises

Solution to Exercise 1. (Return to Problem)

0.5714


Solution to Exercise 2. (Return to Problem)

4/5

Glossary

Conditional Probability

The likelihood that an event will occur given that another event has already occurred.

Uniform Distribution

A continuous random variable (RV) that has equally likely outcomes over the domain, a < x < b. Often referred to as the Rectangular distribution because the graph of the pdf has the form of a rectangle. Notation: X ~ U(a, b). The mean is μ = (a + b)/2 and the standard deviation is σ = √((b − a)²/12). The probability density function is f(X) = 1/(b − a) for a < X < b or a ≤ X ≤ b. The cumulative distribution is P(X ≤ x) = (x − a)/(b − a).

5.4. The Exponential Distribution*

The exponential distribution is often concerned with the amount of time until some specific event occurs. For example, the amount of time (beginning now) until an earthquake occurs has an exponential distribution. Other examples include the length, in minutes, of long distance business telephone calls, and the amount of time, in months, a car battery lasts. It can be shown, too, that the amount of change that you have in your pocket or purse follows an exponential distribution.

Values for an exponential random variable occur in the following way. There are fewer large values and more small values. For example, the amount of money customers spend in one trip to the supermarket follows an exponential distribution. There are more people that spend less money and fewer people that spend large amounts of money.
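The "more small values, fewer large values" pattern can be seen in a quick simulation. The sketch below draws values with Python's `random.expovariate`; the mean of 4, the sample size, and the seed are arbitrary choices made for illustration, not data from the text.

```python
import random

random.seed(1)  # fixed seed so repeated runs give the same sample

mean_amount = 4.0  # illustrative mean, chosen only for this sketch
# expovariate takes the rate, which is 1 divided by the mean.
sample = [random.expovariate(1 / mean_amount) for _ in range(100_000)]

sample_mean = sum(sample) / len(sample)
below_mean = sum(x < mean_amount for x in sample) / len(sample)

print(round(sample_mean, 2))  # should be close to 4
print(round(below_mean, 2))   # should be close to 0.63
```

For an exponential distribution, roughly 63% of values fall below the mean, which is one way to see that small values are more common than large ones.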

The exponential distribution is widely used in the field of reliability. Reliability deals with the amount of time a product lasts.

Example 5.7. 

Illustrates the exponential distribution: Let X = amount of time (in minutes) a postal clerk spends with his/her customer. The time is known to have an exponential distribution with the average amount of time equal to 4 minutes.

X is a continuous random variable since time is measured. It is given that μ = 4 minutes. To do any calculations, you must know m , the decay parameter.

m = 1/μ. Therefore, m = 1/4 = 0.25.

The standard deviation, σ, is the same as the mean. μ = σ

The distribution notation is X ~ Exp(m). Therefore, X ~ Exp(0.25).

The probability density function is f(X) = m⋅e^(−m⋅X). The number e = 2.71828182846… It is a number that is used often in mathematics. Scientific calculators have the key “e^x.” If you enter 1 for x, the calculator will display the value e.

The curve is:

f(X) = 0.25⋅e^(−0.25⋅X) where X is at least 0 and m = 0.25.

For example, f(5) = 0.25⋅e^(−0.25⋅5) = 0.072

The graph is as follows:

Exponential graph with increments of 2 from 0-20 on the x-axis of μ = 4 and increments of 0.05 from 0.05-0.25 on the y-axis of m = 0.25. The curved line begins at the top at point (0, 0.25) and curves down to point (20, 0). The x-axis is equal to a continuous random variable.

Notice the graph is a declining curve. When X = 0, f(X) = 0.25⋅e^(−0.25⋅0) = 0.25⋅1 = 0.25 = m.
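The height of the exponential curve at any point can be evaluated directly from f(X) = m⋅e^(−m⋅X). The sketch below does this for the postal clerk example, where the mean is 4 minutes and so m = 1/4; the function name `exp_pdf` is an assumption.

```python
import math


def exp_pdf(x, m):
    """Height of the exponential density m * e^(-m * x); zero for negative x."""
    if x < 0:
        return 0.0
    return m * math.exp(-m * x)


m = 1 / 4  # decay parameter = 1 / mean, with a mean of 4 minutes
print(round(exp_pdf(5, m), 3))  # 0.072
print(exp_pdf(0, m))            # 0.25
```

Keep in mind that these are heights of the curve, not probabilities; probabilities come from areas under it.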

Example 5.8. 

Problem 1.

Find the probability that a clerk spends four to five minutes with a randomly selected customer.

Solution

Find P(4 < X < 5) .

The cumulative distribution function (CDF) gives the area to the left.

P(X < x) = 1 – e^(−m⋅x)

P(X < 5) = 1 – e^(−0.25⋅5) = 0.7135 and P(X < 4) = 1 – e^(−0.25⋅4) = 0.6321

Exponential graph with the curved line beginning at point (0, 0.25) and curves down towards point (∞, 0). Two vertical upward lines extend from points 4 and 5 to the curved line. The probability is in the area between points 4 and 5.

Note

You can do these calculations easily on a calculator.

The probability that a postal clerk spends four to five minutes with a randomly selected customer is

P(4 < X < 5) = P(X < 5) – P(X < 4) = 0.7135 − 0.6321 = 0.0814

Note

TI-83+ and TI-84: On the home screen, enter (1-e^(-.25*5))-(1-e^(-.25*4)) or enter e^(-.25*4)-e^(-.25*5).
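The same subtraction of two "area to the left" values can be done in Python instead of on a calculator. This is a sketch using the formula P(X < x) = 1 − e^(−m⋅x); the name `exp_cdf` is illustrative.

```python
import math


def exp_cdf(x, m):
    """Area to the left of x for X ~ Exp(m): 1 - e^(-m * x)."""
    if x <= 0:
        return 0.0
    return 1 - math.exp(-m * x)


m = 0.25
p_4_to_5 = exp_cdf(5, m) - exp_cdf(4, m)
print(round(exp_cdf(5, m), 4), round(exp_cdf(4, m), 4), round(p_4_to_5, 4))
# 0.7135 0.6321 0.0814
```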



Problem 2.

Half of all customers are finished within how long? (Find the 50th percentile)

Solution

Find the 50th percentile.

Exponential graph with the curved line beginning at point (0, 0.25) and curves down towards point (∞, 0). A vertical upward line extends from point k to the curved line. The probability area from 0-k is equal to 0.50.

P(X < k) = 0.50 , k = 2.8 minutes (calculator or computer)

Half of all customers are finished within 2.8 minutes.

You can also do the calculation as follows:

P(X < k) = 0.50 and P(X < k) = 1 – e^(−0.25⋅k)

Therefore, 0.50 = 1 − e^(−0.25⋅k) and e^(−0.25⋅k) = 1 − 0.50 = 0.5

Take natural logs: ln(e^(−0.25⋅k)) = ln(0.50). So, −0.25⋅k = ln(0.50)

Solve for k: k = ln(0.50)/(−0.25) = 2.8 minutes

Note

A formula for the percentile k is k = LN(1 − AreaToTheLeft)/(−m) where LN is the natural log.

Note

TI-83+ and TI-84: On the home screen, enter LN(1-.50)/-.25. Press the (-) for the negative.
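The percentile calculation amounts to solving 1 − e^(−m⋅k) = p for k, which gives k = ln(1 − p)/(−m). A minimal sketch, with `exp_percentile` as a hypothetical helper name:

```python
import math


def exp_percentile(p, m):
    """Return k such that P(X < k) = p for X ~ Exp(m)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return math.log(1 - p) / (-m)


median = exp_percentile(0.50, 0.25)
print(round(median, 1))  # 2.8
```

The median of about 2.8 minutes is smaller than the mean of 4 minutes, as expected for a distribution with a long right tail.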



Problem 3.

Which is larger, the mean or the median?

Solution

Is the mean or median larger?

From Problem 2, the median or 50th percentile is 2.8 minutes. The theoretical mean is 4 minutes. The mean is larger.





Optional Collaborative Classroom Activity

Have each class member count the change he/she has in his/her pocket or purse. Your instructor will record the amounts in dollars and cents. Construct a histogram of the data taken by the class. Use 5 intervals. Draw a smooth curve through the bars. The graph should look approximately exponential. Then calculate the mean.

Let X = the amount of money a student in your class has in his/her pocket or purse.

The distribution for X is approximately exponential with mean, μ = _______ and m = _______. The standard deviation, σ = ________.

Draw the appropriate exponential graph. You should label the x and y axes, the decay rate, and the mean. Shade the area that represents the probability that one student has less than $.40 in his/her pocket or purse. (Shade P(X < 0.40) ).

Example 5.9. 

On the average, a certain computer part lasts 10 years. The length of time the computer part lasts is exponentially distributed.

Problem 1.

What is the probability that a computer part lasts more than 7 years?

Solution

Let X = the amount of time (in years) a computer part lasts.

μ = 10 so m = 1/μ = 1/10 = 0.1

Find P(X > 7). Draw a graph.

P(X > 7) = 1 – P(X < 7).

Since P(X < x) = 1 – e^(−m⋅x) then

P(X > 7) = e^(−0.1⋅7) = 0.4966. The probability that a computer part lasts more than 7 years is 0.4966.

Note

TI-83+ and TI-84: On the home screen, enter e^(-.1*7).
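The "area to the right" for an exponential distribution is 1 minus the area to the left, which simplifies to e^(−m⋅x). The sketch below checks this for the computer part with a mean life of 10 years; the function name `exp_right_tail` is an assumption.

```python
import math


def exp_right_tail(x, m):
    """P(X > x) for X ~ Exp(m), which equals e^(-m * x) for x >= 0."""
    if x <= 0:
        return 1.0
    return math.exp(-m * x)


m = 1 / 10  # decay parameter = 1 / mean, with a mean of 10 years
p_more_than_7 = exp_right_tail(7, m)
print(round(p_more_than_7, 4))  # 0.4966
```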

Exponential graph with the curved line beginning at point (0, 0.1) and curves down towards point (∞, 0). A vertical upward line extends from point 1 to the curved line. The probability area occurs from point 1 to the end of the curve. The x-axis is equal to the amount of time a computer part lasts.



Problem 2.

On the average, how long would 5 computer parts last if they are used one after another?

Solution

On the average, 1 computer part lasts 10 years. Therefore, 5 computer parts, if they are used one right after the other would last, on the average,

(5) (10) = 50  years.



Problem 3.

Eighty percent of computer parts last at most how long?

Solution

Find the 80th percentile. Draw a graph. Let k = the 80th percentile.

Exponential graph with the curved line beginning at point (0, 0.1) and curves down towards point (∞, 0). A vertical upward line extends from point k to the curved line. k is the 80th percentile. The probability area from 0-k is equal to 0.80.

Solve for k: k = ln(1 − 0.80)/(−0.1) = 16.1 years

Eighty percent of the computer parts last at most 16.1 years.

Note

TI-83+ and TI-84: On the home screen, enter LN(1 - .80)/-.1



Problem 4.

What is the probability that a computer part lasts between 9 and 11 years?

Solution

Find P(9 < X < 11) . Draw a graph.

Exponential graph with the curved line beginning at point (0, 0.1) and curves down towards point (∞, 0). Two vertical upward lines extend from point 9 and 11 to the curved line. The probability area occurs between point 9 and 11.

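P(9 < X < 11) = P(X < 11) − P(X < 9) = (1 − e^(−0.1⋅11)) − (1 − e^(−0.1⋅9)) = 0.6671 − 0.5934 = 0.0737. (calculator or computer)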

The probability that a computer part lasts between 9 and 11 years is 0.0737.

Note

TI-83+ and TI-84: On the home screen, enter e^(-.1*9) - e^(-.1*11).




Example 5.10. 

Suppose that the length of a phone call, in minutes, is an exponential random variable with decay parameter = 1/12. If another person arrives at a public telephone just before you, find the probability that you will have to wait more than 5 minutes. Let X = the length of a phone call, in minutes.

Problem (Go to Solution)

What is m , μ , and σ ? The probability that you must wait more than 5 minutes is _______ .


Note

A summary for exponential distribution is available in “Summary of The Uniform and Exponential Probability Distributions“.


Solutions to Exercises

Solution to Exercise (Return to Problem)

  • m = 1/12

  • μ = 12

  • σ = 12

P(X > 5) = 0.6592


Glossary

Exponential Distribution

A continuous random variable (RV) that appears when we are interested in the intervals of time between some random events, for example, the length of time between emergency arrivals at a hospital. Notation: X ~ Exp(m). The mean is μ = 1/m and the standard deviation is σ = 1/m. The probability density function is f(x) = m⋅e^(−m⋅x), x ≥ 0, and the cumulative distribution function is P(X ≤ x) = 1 − e^(−m⋅x).

5.5. Summary of the Uniform and Exponential Probability Distributions*

Formula 5.1. Uniform

X = a real number between a and b (in some instances, X can take on the values a and b ). a = smallest X ; b = largest X

X ~ U(a, b)

The mean is μ = (a + b)/2

The standard deviation is σ = √((b − a)²/12)

Probability density function: f(X) = 1/(b − a) for a ≤ X ≤ b

Area to the Left of x: P(X < x) = (base)(height)

Area to the Right of x: P(X > x) = (base)(height)

Area Between c and d: P(c < X < d) = (base)(height) = (d − c)(height).


Formula 5.2. Exponential

X ~ Exp (m)

X = a real number, 0 or larger. m = the parameter that controls the rate of decay or decline

The mean and standard deviation are the same.

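μ = 1/m and σ = 1/m

The probability density function: f(X) = m⋅e^(−m⋅X), X ≥ 0

Area to the Left of x: P(X < x) = 1 – e^(−m⋅x)

Area to the Right of x: P(X > x) = e^(−m⋅x)

Area Between c and d: P(c < X < d) = P(X < d) − P(X < c) = (1 − e^(−m⋅d)) − (1 − e^(−m⋅c)) = e^(−m⋅c) − e^(−m⋅d)

Percentile, k: k = LN(1 − AreaToTheLeft)/(−m)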


5.6. Practice 1: Uniform Distribution*

Student Learning Outcomes

  • The student will explore the properties of data with a uniform distribution.

Given

The age of cars in the staff parking lot of a suburban college is uniformly distributed from six months (0.5 years) to 9.5 years.

Describe the Data

Exercise 5.6.1. (Go to Solution)

What is being measured here?


Exercise 5.6.2. (Go to Solution)

In words, define the Random Variable X .


Exercise 5.6.3. (Go to Solution)

Are the data discrete or continuous?


Exercise 5.6.4. (Go to Solution)

The interval of values for X  is:


Exercise 5.6.5. (Go to Solution)

The distribution for X  is:


Probability Distribution

Exercise 5.6.6. (Go to Solution)

Write the probability density function.


Exercise 5.6.7. (Go to Solution)

Graph the probability distribution.

a. Sketch the graph of the probability distribution.

Figure 5.8. 

Figure (graph.png)

b. Identify the following values:
i. Lowest value for X :
ii. Highest value for X :
iii. Height of the rectangle:
iv. Label for x-axis (words):
v. Label for y-axis (words):

Random Probability

Exercise 5.6.8. (Go to Solution)

Find the probability that a randomly chosen car in the lot was less than 4 years old.

a. Sketch the graph. Shade the area of interest.

Figure 5.9. 

Blank graph with vertical and horizontal axes.

b. Find the probability. P ( X < 4 ) =

Exercise 5.6.9. (Go to Solution)

Out of just the cars less than 7.5 years old, find the probability that a randomly chosen car in the lot was less than 4 years old.

a. Sketch the graph. Shade the area of interest.

Figure 5.10. 

Figure (graph.png)

b. Find the probability. P(X < 4 | X < 7.5) =

Exercise 5.6.10. Discussion Question

What has changed in the previous two problems that made the solutions different?


Quartiles

Exercise 5.6.11. (Go to Solution)

Find the average age of the cars in the lot.


Exercise 5.6.12. (Go to Solution)

Find the third quartile of ages of cars in the lot. This means you will have to find the value such that 3/4, or 75%, of the cars are at most (less than or equal to) that age.

a. Sketch the graph. Shade the area of interest.

Figure 5.11. 

Blank graph with vertical and horizontal axes.

b. Find the value k such that P(X < k) = 0.75.
c. The third quartile is:

Solutions to Exercises

Solution to Exercise 5.6.1. (Return to Exercise)

The age of cars in the staff parking lot


Solution to Exercise 5.6.2. (Return to Exercise)

X = The age (in years) of cars in the staff parking lot


Solution to Exercise 5.6.3. (Return to Exercise)

 Continuous


Solution to Exercise 5.6.4. (Return to Exercise)

0.5 - 9.5


Solution to Exercise 5.6.5. (Return to Exercise)

X ~ U(0.5, 9.5)


Solution to Exercise 5.6.6. (Return to Exercise)

f(x) = 1/(9.5 − 0.5) = 1/9 for 0.5 ≤ x ≤ 9.5


Solution to Exercise 5.6.7. (Return to Exercise)

b.i. 0.5
b.ii. 9.5
b.iii. 1/9
b.iv. Age of Cars
b.v. f ( x )

Solution to Exercise 5.6.8. (Return to Exercise)

b. P(X < 4) = (4 − 0.5) ⋅ 1/9 = 3.5/9

Solution to Exercise 5.6.9. (Return to Exercise)

b. P(X < 4 | X < 7.5) = (4 − 0.5) ⋅ 1/(7.5 − 0.5) = 3.5/7

Solution to Exercise 5.6.11. (Return to Exercise)

μ = 5


Solution to Exercise 5.6.12. (Return to Exercise)

b. k = 7.25

5.7. Practice 2: Exponential Distribution*

Student Learning Outcomes

  • The student will explore the properties of data with an exponential distribution.

Given

Carbon-14 is a radioactive element with a half-life of about 5730 years. Carbon-14 is said to decay exponentially. The decay rate is 0.000121. We start with 1 gram of carbon-14. We are interested in the time (years) it takes to decay carbon-14.

Describe the Data

Exercise 5.7.1.

What is being measured here?


Exercise 5.7.2. (Go to Solution)

Are the data discrete or continuous?


Exercise 5.7.3. (Go to Solution)

In words, define the random variable X.


Exercise 5.7.4. (Go to Solution)

What is the decay rate ( m )?


Exercise 5.7.5. (Go to Solution)

The distribution for X is:


Probability

Exercise 5.7.6. (Go to Solution)

Find the amount (percent of 1 gram) of carbon-14 lasting less than 5730 years. This means, find P ( X < 5730 ) .

a. Sketch the graph. Shade the area of interest.

Figure 5.12. 

Blank graph with vertical and horizontal axes.

b. Find the probability. P ( X < 5730 ) =

Exercise 5.7.7. (Go to Solution)

Find the percentage of carbon-14 lasting longer than 10,000 years.

a. Sketch the graph. Shade the area of interest.

Figure 5.13. 

Blank graph with horizontal and vertical axes.

b. Find the probability. P ( X > 10000 ) =

Exercise 5.7.8. (Go to Solution)

Thirty percent (30%) of carbon-14 will decay within how many years?

a. Sketch the graph. Shade the area of interest.

Figure 5.14. 

Blank graph with vertical and horizontal axes.

b. Find the value k such that P ( X < k ) = 0.30.

Solutions to Exercises

Solution to Exercise 5.7.2. (Return to Exercise)

 Continuous


Solution to Exercise 5.7.3. (Return to Exercise)

X = Time (years) to decay carbon-14


Solution to Exercise 5.7.4. (Return to Exercise)

m = 0.000121


Solution to Exercise 5.7.5. (Return to Exercise)

X ~ Exp(0.000121)


Solution to Exercise 5.7.6. (Return to Exercise)

b. P ( X < 5730 ) = 0.5001

Solution to Exercise 5.7.7. (Return to Exercise)

b. P ( X > 10000 ) = 0.2982

Solution to Exercise 5.7.8. (Return to Exercise)

b. k = 2947.73
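
The exponential answers above can be reproduced with the cumulative distribution function P(X < x) = 1 − e^(−mx). The following is a minimal sketch that assumes the decay rate m = 0.000121 given in the problem. The function names exp_cdf and exp_percentile are illustrative, not from the text.

```python
import math

m = 0.000121  # decay rate given in the problem statement


def exp_cdf(x, rate):
    """P(X < x) for X ~ Exp(rate)."""
    return 1 - math.exp(-rate * x)


def exp_percentile(p, rate):
    """Value k such that P(X < k) = p; solves 1 - e^(-rate*k) = p for k."""
    return -math.log(1 - p) / rate


p_less_5730 = exp_cdf(5730, m)         # Exercise 5.7.6
p_more_10000 = 1 - exp_cdf(10000, m)   # Exercise 5.7.7
k_30 = exp_percentile(0.30, m)         # Exercise 5.7.8

print("P(X < 5730)  =", round(p_less_5730, 4))
print("P(X > 10000) =", round(p_more_10000, 4))
print("k            =", round(k_30, 2))
```

Note that P(X < 5730) comes out very close to one half, which agrees with 5730 years being the half-life.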

5.8. Homework*

For each probability and percentile problem, DRAW THE PICTURE!

Exercise 5.8.1.

Consider the following experiment. You are one of 100 people enlisted to take part in a study to determine the percent of nurses in America with an R.N. (registered nurse) degree. You ask nurses if they have an R.N. degree. The nurses answer “yes” or “no.” You then calculate the percentage of nurses with an R.N. degree. You give that percentage to your supervisor.

a. What part of the experiment will yield discrete data?
b. What part of the experiment will yield continuous data?

Exercise 5.8.2.

When age is rounded to the nearest year, do the data stay continuous, or do they become discrete? Why?


Exercise 5.8.3. (Go to Solution)

Births are approximately uniformly distributed between the 52 weeks of the year. They can be said to follow a Uniform Distribution from 1 – 53 (spread of 52 weeks).

a. X ~
b. Graph the probability distribution.
c. f ( x ) =
d. μ =
e. σ =
f. Find the probability that a person is born at the exact moment week 19 starts. That is, find P ( X = 19 ) =
g. P ( 2 < X < 31 ) =
h. Find the probability that a person is born after week 40.
i. P ( 12 < X < 28 ) =
j. Find the 70th percentile.
k. Find the minimum for the upper quarter.

Exercise 5.8.4.

A random number generator picks a number from 1 to 9 in a uniform manner.

a. X ~
b. Graph the probability distribution.
c. f ( x ) =
d. μ =
e. σ =
f. P ( 3.5 < X < 7.25 ) =
g. P ( X > 5.67 ) =
h. P ( X > 5 ∣ X > 3 ) =
i. Find the 90th percentile.

Exercise 5.8.5. (Go to Solution)

The time (in minutes) until the next bus departs a major bus depot follows a distribution with f ( x ) = 1/20, where x goes from 25 to 45 minutes.

a. X =
b. X ~
c. Graph the probability distribution.
d. The distribution is ______________ (name of distribution). It is _____________ (discrete or continuous).
e. μ =
f. σ =
g. Find the probability that the time is at most 30 minutes. Sketch and label a graph of the distribution. Shade the area of interest. Write the answer in a probability statement.
h. Find the probability that the time is between 30 and 40 minutes. Sketch and label a graph of the distribution. Shade the area of interest. Write the answer in a probability statement.
i. P(25 < X < 55) = _________. State this in a probability statement (similar to g and h ), draw the picture, and find the probability.
j. Find the 90th percentile. This means that 90% of the time, the time is less than _____ minutes.
k. Find the 75th percentile. In a complete sentence, state what this means. (See j.)
l. Find the probability that the time is more than 40 minutes given (or knowing that) it is at least 30 minutes.

Exercise 5.8.6.

According to a study by Dr. John McDougall of his live-in weight loss program at St. Helena Hospital, the people who follow his program lose between 6 and 15 pounds a month until they approach trim body weight. Let’s suppose that the weight loss is uniformly distributed. We are interested in the weight loss of a randomly selected individual following the program for one month. (Source: The McDougall Program for Maximum Weight Loss by John A. McDougall, M.D.)

a. X =
b. X ~
c. Graph the probability distribution.
d. f ( x ) =
e. μ =
f. σ =
g. Find the probability that the individual lost more than 10 pounds in a month.
h. Suppose it is known that the individual lost more than 10 pounds in a month. Find the probability that he lost less than 12 pounds in the month.
i. P ( 7 < X < 13 ∣ X > 9 ) = __________. State this in a probability question (similar to g and h), draw the picture, and find the probability.

Exercise 5.8.7. (Go to Solution)

A subway train on the Red Line arrives every 8 minutes during rush hour. We are interested in the length of time a commuter must wait for a train to arrive. The time follows a uniform distribution.

a. X =
b. X ~
c. Graph the probability distribution.
d. f ( x ) =
e. μ =
f. σ =
g. Find the probability that the commuter waits less than one minute.
h. Find the probability that the commuter waits between three and four minutes.
i. 60% of commuters wait more than how long for the train? State this in a probability question (similar to g and h), draw the picture, and find the probability.

Exercise 5.8.8.

The age of a first grader on September 1 at Garden Elementary School is uniformly distributed from 5.8 to 6.8 years. We randomly select one first grader from the class.

a. X =
b. X ~
c. Graph the probability distribution.
d. f ( x ) =
e. μ =
f. σ =
g. Find the probability that she is over 6.5 years.
h. Find the probability that she is between 4 and 6 years.
i. Find the 70th percentile for the age of first graders on September 1 at Garden Elementary School.

Exercise 5.8.9. (Go to Solution)

Let X ~ Exp(0.1).

a. decay rate =
b. μ =
c. Graph the probability distribution function.
d. On the above graph, shade the area corresponding to P(X < 6) and find the probability.
e. Sketch a new graph, shade the area corresponding to P(3 < X < 6) and find the probability.
f. Sketch a new graph, shade the area corresponding to P(X > 7) and find the probability.
g. Sketch a new graph, shade the area corresponding to the 40th percentile and find the value.
h. Find the average value of X.

Exercise 5.8.10.

Suppose that the length of long distance phone calls, measured in minutes, is known to have an exponential distribution with the average length of a call equal to 8 minutes.

a. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that a phone call lasts less than 9 minutes.
h. Find the probability that a phone call lasts more than 9 minutes.
i. Find the probability that a phone call lasts between 7 and 9 minutes.
j. If 25 phone calls are made one after another, on average, what would you expect the total to be? Why?

Exercise 5.8.11. (Go to Solution)

Suppose that the useful life of a particular car battery, measured in months, decays with parameter 0.025. We are interested in the life of the battery.

a. X =
b. Is X continuous or discrete?
c. X ~
d. On average, how long would you expect 1 car battery to last?
e. On average, how long would you expect 9 car batteries to last, if they are used one after another?
f. Find the probability that a car battery lasts more than 36 months.
g. 70% of the batteries last at least how long?

Exercise 5.8.12.

The percent of persons (ages 5 and older) in each state who speak a language at home other than English is approximately exponentially distributed with a mean of 9.848. Suppose we randomly pick a state. (Source: Bureau of the Census, U.S. Dept. of Commerce)

a. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that the percent is less than 12.
h. Find the probability that the percent is between 8 and 14.
i. The percent of all individuals living in the United States who speak a language at home other than English is 13.8.
    i. Why is this number different from 9.848%?
    ii. What would make this number higher than 9.848%?

Exercise 5.8.13. (Go to Solution)

The time (in years) after reaching age 60 that it takes an individual to retire is approximately exponentially distributed with a mean of about 5 years. Suppose we randomly pick one retired individual. We are interested in the time after age 60 to retirement.

a. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that the person retired after age 70.
h. Do more people retire before age 65 or after age 65?
i. In a room of 1000 people over age 80, how many do you expect will NOT have retired yet?

Exercise 5.8.14.

The cost of all maintenance for a car during its first year is approximately exponentially distributed with a mean of $150.

a. X =
b. X ~
c. μ =
d. σ =
e. Draw a graph of the probability distribution. Label the axes.
f. Find the probability that a car required over $300 for maintenance during its first year.

Try these multiple choice problems

The next three questions refer to the following information. The average lifetime of a certain new cell phone is 3 years. The manufacturer will replace any cell phone failing within 2 years of the date of purchase. The lifetime of these cell phones is known to follow an exponential distribution.

Exercise 5.8.15. (Go to Solution)

The decay rate is

A. 0.3333
B. 0.5000
C. 2.0000
D. 3.0000

Exercise 5.8.16. (Go to Solution)

What is the probability that a phone will fail within 2 years of the date of purchase?

A. 0.8647
B. 0.4866
C. 0.2212
D. 0.9997

Exercise 5.8.17. (Go to Solution)

What is the median lifetime of these phones (in years)?

A. 0.1941
B. 1.3863
C. 2.0794
D. 5.5452

The next three questions refer to the following information. The Sky Train from the terminal to the rental car and long term parking center is supposed to arrive every 8 minutes. The waiting times for the train are known to follow a uniform distribution.

Exercise 5.8.18. (Go to Solution)

What is the average waiting time (in minutes)?

A. 0.0000
B. 2.0000
C. 3.0000
D. 4.0000

Exercise 5.8.19. (Go to Solution)

Find the 30th percentile for the waiting times (in minutes).

A. 2.0000
B. 2.4000
C. 2.7500
D. 3.0000

Exercise 5.8.20. (Go to Solution)

What is the probability of waiting more than 7 minutes, given that a person has waited more than 4 minutes?

A. 0.1250
B. 0.2500
C. 0.5000
D. 0.7500

Solutions to Exercises

Solution to Exercise 5.8.3. (Return to Exercise)

a. X ~ U ( 1, 53 )
c. f ( x ) = 1/52 where 1 ≤ x ≤ 53
d. 27
e. 15.01
f. 0
g. 29/52
h. 13/52
i. 16/52
j. 37.4
k. 40

Solution to Exercise 5.8.5. (Return to Exercise)

b. X ~ U ( 25 , 45 )
d. uniform; continuous
e. 35 minutes
f. 5.8 minutes
g. 0.25
h. 0.5
i. 1
j. 43 minutes
k. 40 minutes
l. 0.3333

Solution to Exercise 5.8.7. (Return to Exercise)

b. X ~ U ( 0, 8 )
d. f ( x ) = 1/8 where 0 ≤ x ≤ 8
e. 4
f. 2.31
g. 1/8
h. 1/8
i. 3.2

Solution to Exercise 5.8.9. (Return to Exercise)

a. 0.1
b. 10
d. 0.4512
e. 0.1920
f. 0.4966
g. 5.11
h. 10

Solution to Exercise 5.8.11. (Return to Exercise)

c. X ~ Exp ( 0.025 )
d. 40 months
e. 360 months
f. 0.4066
g. 14.27

Solution to Exercise 5.8.13. (Return to Exercise)

c. X ~ Exp ( 1/5 )
d. 5
e. 5
g. 0.1353
h. Before
i. 18.3

Solution to Exercise 5.8.15. (Return to Exercise)

 A


Solution to Exercise 5.8.16. (Return to Exercise)

 B


Solution to Exercise 5.8.17. (Return to Exercise)

 C


Solution to Exercise 5.8.18. (Return to Exercise)

 D


Solution to Exercise 5.8.19. (Return to Exercise)

 B


Solution to Exercise 5.8.20. (Return to Exercise)

 B
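
The multiple choice answers can be verified with a few lines of arithmetic. This sketch assumes the cell phone lifetime is exponential with mean 3 years (so the decay rate is 1/3) and that the Sky Train wait is uniform on 0 to 8 minutes. The second assumption is our reading of "arrives every 8 minutes"; the variable names are illustrative.

```python
import math

# Cell phone lifetime: exponential with mean 3 years (Exercises 5.8.15 - 5.8.17).
mean_life = 3
decay_rate = 1 / mean_life                    # decay rate = 1 / mean
p_fail_2yr = 1 - math.exp(-decay_rate * 2)    # P(X < 2)
median_life = math.log(2) / decay_rate        # solves 1 - e^(-m*k) = 0.5

# Sky Train wait: assumed uniform on 0 to 8 minutes (Exercises 5.8.18 - 5.8.20).
a, b = 0, 8
mean_wait = (a + b) / 2
pct_30 = a + 0.30 * (b - a)                   # 30th percentile
p_gt7 = (b - 7) / (b - a)                     # P(X > 7)
p_gt4 = (b - 4) / (b - a)                     # P(X > 4)
p_gt7_given_gt4 = p_gt7 / p_gt4               # P(X > 7 | X > 4)

print("decay rate        =", round(decay_rate, 4))
print("P(fail in 2 yrs)  =", round(p_fail_2yr, 4))
print("median lifetime   =", round(median_life, 4))
print("mean wait         =", mean_wait)
print("30th percentile   =", round(pct_30, 4))
print("P(X > 7 | X > 4)  =", round(p_gt7_given_gt4, 4))
```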


5.9. Review*

Exercise 5.9.1. - Exercise 5.9.7. refer to the following study: A recent study of mothers of junior high school children in Santa Clara County reported that 76% of the mothers are employed in paid positions. Of those mothers who are employed, 64% work full-time (over 35 hours per week), and 36% work part-time. However, out of all of the mothers in the population, 49% work full-time. The population under study is made up of mothers of junior high school children in Santa Clara County.

Let E = employed, Let F = full-time employment

Exercise 5.9.1. (Go to Solution)

a. Find the percent of all mothers in the population that are NOT employed.
b. Find the percent of mothers in the population that are employed part-time.

Exercise 5.9.2. (Go to Solution)

The type of employment is considered to be what type of data?


Exercise 5.9.3. (Go to Solution)

In symbols, what does the 36% represent?


Exercise 5.9.4. (Go to Solution)

Find the probability that a randomly selected person from the population will be employed OR work full-time.


Exercise 5.9.5. (Go to Solution)

Based upon the above information, are being employed AND working part-time:

a. mutually exclusive events? Why or why not?
b. independent events? Why or why not?

Exercise 5.9.6. - Exercise 5.9.7. refer to the following: We randomly pick 10 mothers from the above population. We are interested in the number of the mothers that are employed. Let X = number of mothers that are employed.

Exercise 5.9.6. (Go to Solution)

State the distribution for X.


Exercise 5.9.7. (Go to Solution)

Find the probability that at least 6 are employed.


Exercise 5.9.8. (Go to Solution)

We expect the Statistics Discussion Board to have, on average, 14 questions posted to it per week. We are interested in the number of questions posted to it per day.

a. Define X.
b. What are the values that the random variable may take on?
c. State the distribution for X.
d. Find the probability that from 10 to 14 (inclusive) questions are posted to the Discussion Board on a randomly picked day.

Exercise 5.9.9. (Go to Solution)

A person invests $1000 in stock of a company that hopes to go public in 1 year.

  • The probability that the person will lose all his money after 1 year (i.e. his stock will be worthless) is 35%.

  • The probability that the person’s stock will still have a value of $1000 after 1 year (i.e. no profit and no loss) is 60%.

  • The probability that the person’s stock will increase in value by $10,000 after 1 year (i.e. will be worth $11,000) is 5%.

Find the expected PROFIT after 1 year.


Exercise 5.9.10. (Go to Solution)

Rachel’s piano cost $3000. The average cost for a piano is $4000 with a standard deviation of $2500. Becca’s guitar cost $550. The average cost for a guitar is $500 with a standard deviation of $200. Matt’s drums cost $600. The average cost for drums is $700 with a standard deviation of $100. Whose cost was lowest when compared to his or her own instrument? Justify your answer.


Exercise 5.9.11. (Go to Solution)

For the following data, which of the measures of central tendency would be the LEAST useful: mean, median, mode? Explain why. Which would be the MOST useful? Explain why.

4, 6, 6, 12, 18, 18, 18, 200


Exercise 5.9.12. (Go to Solution)

Horizontal boxplot with first whisker extending from 1 to 2, box from 2 to 5, line at 4, and second whisker extending from 5 to 7.

For each statement below, explain why each is either true or false.

a. 25% of the data are at most 5.
b. There is the same amount of data from 4 – 5 as there is from 5 – 7.
c. There are no data values of 3.
d. 50% of the data are 4.

Exercise 5.9.13. - Exercise 5.9.14. refer to the following: 64 faculty members were asked the number of cars they owned (including spouse and children’s cars). The results are given in the following graph:

Histogram consisting of 5 bars with number of cars, from 0-7 in increments of 1, on the x-axis, and frequency, in increments of 0.1 from 0.15-0.45, on the y-axis. No bars exist for 4, 5, or 7. Bar 0 has a frequency of 0.075, 1 has 0.15, 2 has 0.45, 3 has 0.25, and 6 has 0.075.

Exercise 5.9.13. (Go to Solution)

Find the approximate number of responses that were “3.”


Exercise 5.9.14. (Go to Solution)

Find the first, second and third quartiles. Use them to construct a box plot of the data.


Exercise 5.9.15. - Exercise 5.9.17. refer to the following study done of the Girls soccer team “Snow Leopards”:

Table 5.2.
Hair Style    Hair Color
              blond    brown    black
ponytail      3        2        5
plain         2        2        1

Suppose that one girl from the Snow Leopards is randomly selected.

Exercise 5.9.15. (Go to Solution)

Find the probability that the girl has black hair GIVEN that she wears a ponytail.


Exercise 5.9.16. (Go to Solution)

Find the probability that the girl wears her hair plain OR has brown hair.


Exercise 5.9.17. (Go to Solution)

Find the probability that the girl has blond hair AND that she wears her hair plain.


Solutions to Exercises

Solution to Exercise 5.9.1. (Return to Exercise)

a. 24%
b. 27%

Solution to Exercise 5.9.2. (Return to Exercise)

Qualitative


Solution to Exercise 5.9.3. (Return to Exercise)


Solution to Exercise 5.9.4. (Return to Exercise)

0.7336


Solution to Exercise 5.9.5. (Return to Exercise)

a. No,
b. No,

Solution to Exercise 5.9.6. (Return to Exercise)

B ( 10 , 0 . 76 )


Solution to Exercise 5.9.7. (Return to Exercise)

0.9330


Solution to Exercise 5.9.8. (Return to Exercise)

a. X = the number of questions posted to the Statistics Discussion Board per day
b. x = 0,1,2, . . .
c. X ~ P ( 2 )
d. 0
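
The binomial and Poisson answers to Exercises 5.9.7 and 5.9.8 can be checked by summing the probability mass functions directly. The sketch below assumes X ~ B(10, 0.76) and X ~ P(2) as stated in the solutions. The helper names binom_pmf and poisson_pmf are illustrative.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """P(X = k) for X ~ P(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)


# Exercise 5.9.7: at least 6 of 10 mothers are employed.
p_at_least_6 = sum(binom_pmf(k, 10, 0.76) for k in range(6, 11))

# Exercise 5.9.8: 14 questions per week is an average of 2 per day.
mu_per_day = 14 / 7
p_10_to_14 = sum(poisson_pmf(k, mu_per_day) for k in range(10, 15))

print("P(X >= 6)        =", round(p_at_least_6, 4))
print("P(10 <= X <= 14) =", round(p_10_to_14, 4))
```

The Poisson probability is not exactly zero. It is small enough that it rounds to 0 at four decimal places.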

Solution to Exercise 5.9.9. (Return to Exercise)

$150


Solution to Exercise 5.9.10. (Return to Exercise)

Matt
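
The answers to Exercises 5.9.9 and 5.9.10 follow from an expected value and from z-scores. A minimal sketch of that arithmetic, using only the numbers given in the exercises, might look like this. The variable names and the z_score helper are illustrative.

```python
# Exercise 5.9.9: expected profit = sum of (profit * probability).
# Profit is measured relative to the $1000 invested.
outcomes = [(-1000, 0.35), (0, 0.60), (10000, 0.05)]
expected_profit = sum(profit * prob for profit, prob in outcomes)


# Exercise 5.9.10: compare each cost to its own instrument with a z-score.
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


z = {
    "Rachel": z_score(3000, 4000, 2500),
    "Becca": z_score(550, 500, 200),
    "Matt": z_score(600, 700, 100),
}
lowest = min(z, key=z.get)  # the most negative z-score is the lowest relative cost

print("expected profit =", round(expected_profit, 2))
print("z-scores        =", z)
print("lowest          =", lowest)
```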


Solution to Exercise 5.9.11. (Return to Exercise)

Mean


Solution to Exercise 5.9.12. (Return to Exercise)

a. False
b. True
c. False
d. False

Solution to Exercise 5.9.13. (Return to Exercise)

 16


Solution to Exercise 5.9.14. (Return to Exercise)

 2,2,3


Solution to Exercise 5.9.15. (Return to Exercise)

5/10

Solution to Exercise 5.9.16. (Return to Exercise)

7/15

Solution to Exercise 5.9.17. (Return to Exercise)

2/15

5.10. Lab: Continuous Distribution*

Class Time:

Names:

Student Learning Outcomes:

  • The student will compare and contrast empirical data from a random number generator with the Uniform Distribution.

Collect the Data

Use a random number generator to generate 50 values between 0 and 1 (inclusive). List them below. Round the numbers to 4 decimal places or set the calculator MODE to 4 places.

  1. Complete the table:

    Table 5.3.
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________
    __________________________________________________

  2. Calculate the following:

    a. x̄ =
    b. s =
    c. 1st quartile =
    d. 3rd quartile =
    e. Median =

Organize the Data

  1. Construct a histogram of the empirical data. Make 8 bars.

    Figure 5.15. 

    Blank graph with relative frequency on the vertical axis and X on the horizontal axis.


  2. Construct a histogram of the empirical data. Make 5 bars.

    Figure 5.16. 

    Blank graph with relative frequency on the vertical axis and X on the horizontal axis.


Describe the Data

  1. Describe the shape of each graph. Use 2 – 3 complete sentences. (Keep it simple. Does the graph go straight across, does it have a V shape, does it have a hump in the middle or at either end, etc.? One way to help you determine a shape is to roughly draw a smooth curve through the top of the bars.)

  2. Describe how changing the number of bars might change the shape.

Theoretical Distribution

  1. In words, X =

  2. The theoretical distribution of X is X ~ U(0,1). Use it for this part.

  3. In theory, based upon the distribution X ~ U(0,1), complete the following.

    a. μ =
    b. σ =
    c. 1st quartile =
    d. 3rd quartile =
    e. median = __________

  4. Are the empirical values (the data) in the section titled “Collect the Data” close to the corresponding theoretical values above? Why or why not?
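
If a calculator is not at hand, the data collection and the comparison with theory can be sketched in software. The example below is one possible approach and is not part of the lab. It assumes Python's random and statistics modules stand in for the calculator's random number generator. The seed is arbitrary and is set only so that a run can be repeated.

```python
import random
import statistics

random.seed(1)  # arbitrary seed, chosen only so the run is repeatable

# Generate 50 values and round to 4 decimal places, as the lab asks.
# random.random() returns values from 0 up to (but not including) 1.
data = [round(random.random(), 4) for _ in range(50)]

# Empirical values from the section "Collect the Data".
sample_mean = statistics.mean(data)
sample_sd = statistics.stdev(data)                 # sample standard deviation s
q1, median, q3 = statistics.quantiles(data, n=4)   # quartile cut points

# Theoretical values for X ~ U(0, 1).
theo_mean = (0 + 1) / 2
theo_sd = (1 - 0) / 12 ** 0.5
theo_q1, theo_median, theo_q3 = 0.25, 0.50, 0.75

print(f"mean:   data {sample_mean:.4f}  theory {theo_mean:.4f}")
print(f"sd:     data {sample_sd:.4f}  theory {theo_sd:.4f}")
print(f"Q1:     data {q1:.4f}  theory {theo_q1:.4f}")
print(f"median: data {median:.4f}  theory {theo_median:.4f}")
print(f"Q3:     data {q3:.4f}  theory {theo_q3:.4f}")
```

Quartile conventions differ between tools, so the quartiles printed here may not match a calculator's values exactly for the same data.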

Plot the Data

  1. Construct a box plot of the data. Be sure to use a ruler to scale accurately and draw straight edges.

  2. Do you notice any potential outliers? If so, which values are they? Either way, numerically justify your answer. (Recall that any DATA that are less than Q1 – 1.5*IQR or more than Q3 + 1.5*IQR are potential outliers. IQR means interquartile range.)
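
The outlier rule can be sketched as a short function. For illustration only, it is applied here to the data set from Exercise 5.9.11 rather than to lab data; in the lab you would pass in your own 50 values. The function names are hypothetical, and the quartiles come from Python's statistics.quantiles, whose convention may differ slightly from a calculator's.

```python
import statistics


def outlier_fences(data):
    """Return (lower, upper) fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def potential_outliers(data):
    """Return the values that fall outside the fences."""
    low, high = outlier_fences(data)
    return [x for x in data if x < low or x > high]


# Illustration with the data from Exercise 5.9.11 (not lab data).
example = [4, 6, 6, 12, 18, 18, 18, 200]
low, high = outlier_fences(example)

print("fences:            ", low, high)
print("potential outliers:", potential_outliers(example))
```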

Compare the Data

  1. For each part below, use a complete sentence to comment on how the value obtained from the data compares to the theoretical value you expected from the distribution in the section titled “Theoretical Distribution.”

    a. minimum value:
    b. 1st quartile:
    c. median:
    d. third quartile:
    e. maximum value:
    f. width of IQR:
    g. overall shape:

  2. Based on your comments in the section titled “Collect the Data”, how does the box plot fit or not fit what you would expect of the distribution in the section titled “Theoretical Distribution?”

Discussion Question

  1. Suppose that the number of values generated was 500, not 50. How would that affect what you would expect the empirical data to be and the shape of its graph to look like?