# 4.1. Discrete Random Variables*

## Student Learning Objectives

By the end of this chapter, the student should be able to:

• Recognize and understand discrete probability distribution functions, in general.

• Calculate and interpret expected values.

• Recognize the binomial probability distribution and apply it appropriately.

• Recognize the Poisson probability distribution and apply it appropriately (optional).

• Recognize the geometric probability distribution and apply it appropriately (optional).

• Recognize the hypergeometric probability distribution and apply it appropriately (optional).

• Classify discrete word problems by their distributions.

## Introduction

A student takes a 10 question true-false quiz. Because the student had such a busy schedule, he or she could not study and randomly guesses at each answer. What is the probability of the student passing the test with at least a 70%?

Small companies might be interested in the number of long distance phone calls their employees make during the peak time of the day. Suppose the average is 20 calls. What is the probability that the employees make more than 20 long distance phone calls during the peak time?

These two examples illustrate two different types of probability problems involving discrete random variables. Recall that discrete data are data that you can count. A random variable describes the outcomes of a statistical experiment in words. The values of a random variable can vary with each repetition of an experiment.

In this chapter, you will study probability problems involving discrete random distributions. You will also study long-term averages associated with them.

## Random Variable Notation

Upper case letters like X or Y denote a random variable. Lower case letters like x or y denote the value of a random variable. If X is a random variable, then X is defined in words.

For example, let X = the number of heads you get when you toss three fair coins. The sample space for the toss of three fair coins is TTT; THH; HTH; HHT; HTT; THT; TTH; HHH. Then, x = 0, 1, 2, 3. X is in words and x is a number. Notice that for this example, the x values are countable outcomes. Because you can count the possible values that X can take on and the outcomes are random (the x values 0, 1, 2, 3), X is a discrete random variable.
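The three-coin example can be reproduced by listing the sample space directly. The following is a minimal Python sketch; the names `sample_space` and `heads_counts` are illustrative choices, not notation from the text.

```python
from collections import Counter
from itertools import product

# All 2^3 = 8 equally likely outcomes of tossing three fair coins.
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# X = the number of heads; count how many outcomes give each value x.
heads_counts = Counter(outcome.count("H") for outcome in sample_space)

# Print each value x, the number of outcomes producing it, and its share.
for x in sorted(heads_counts):
    print(x, heads_counts[x], heads_counts[x] / len(sample_space))
```

The values printed in the first column are the countable x values 0, 1, 2, 3 that make X a discrete random variable.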

## Optional Collaborative Classroom Activity

Toss a coin 10 times and record the number of heads. After all members of the class have completed the experiment (tossed a coin 10 times and counted the number of heads), fill in the chart using a heading like the one below. Let X = the number of heads in 10 tosses of the coin.

Table 4.1.

| X | Frequency of X | Relative Frequency of X |
|---|---|---|
| | | |

• Which value(s) of X occurred most frequently?

• If you tossed the coin 1,000 times, what values would X take on? Which value(s) of X do you think would occur most frequently?

• What does the relative frequency column sum to?

## Glossary

Random Variable (RV)

see Variable

Variable (Random Variable)

A characteristic of interest in a population being studied. Common notation for variables is upper case Latin letters X, Y, Z, …; common notation for a specific value from the domain (set of all possible values of a variable) is lower case Latin letters x, y, z, …. For example, if X is the number of children in a family, then x represents a specific integer 0, 1, 2, 3, …. Variables in statistics differ from variables in intermediate algebra in the following two ways.

• The domain of the random variable (RV) is not necessarily a numerical set; the domain may be expressed in words; for example, if X = hair color then the domain is {black, blond, gray, green, orange}.

• We can tell what specific value x of the Random Variable X takes only after performing the experiment.

# 4.2. Probability Distribution Function (PDF) for a Discrete Random Variable*

A discrete probability distribution function has two characteristics:

• Each probability is between 0 and 1, inclusive.

• The sum of the probabilities is 1.

P(X) is the notation used to represent a discrete probability distribution function.

Example 4.1.

A child psychologist is interested in the number of times a newborn baby’s crying wakes its mother after midnight. For a random sample of 50 mothers, the following information was obtained. Let X = the number of times a newborn wakes its mother after midnight. For this example, x = 0, 1, 2, 3, 4, 5.

P(X = x) = probability that X takes on a value x .

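| x | P(X = x) |
|---|---|
| 0 | 2/50 |
| 1 | 11/50 |
| 2 | 23/50 |
| 3 | 9/50 |
| 4 | 4/50 |
| 5 | 1/50 |

X takes on the values 0, 1, 2, 3, 4, 5. This is a discrete PDF because

1. Each P(X = x) is between 0 and 1, inclusive.

2. The sum of the probabilities is 1, that is, $\frac{2}{50} + \frac{11}{50} + \frac{23}{50} + \frac{9}{50} + \frac{4}{50} + \frac{1}{50} = 1$.

Example 4.2.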

Suppose Nancy has classes 3 days a week. She attends classes 3 days a week 80% of the time, 2 days 15% of the time, 1 day 4% of the time, and no days 1% of the time.

Problem 1. (Go to Solution)

Let X = the number of days Nancy ____________________ .

Problem 2. (Go to Solution)

X takes on what values?

Problem 3. (Go to Solution)

Construct a probability distribution table (called a PDF table) like the one in the previous example. The table should have two columns labeled x and P(X = x). What does the P(X = x) column sum to?

## Solutions to Exercises

Let X = the number of days Nancy attends class per week.

0, 1, 2, and 3

| x | P(X = x) |
|---|---|
| 0 | 0.01 |
| 1 | 0.04 |
| 2 | 0.15 |
| 3 | 0.80 |
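The two characteristics of a discrete PDF can be checked mechanically for Nancy's table. This is a small sketch, assuming the table is stored as a Python dictionary; the helper name `is_discrete_pdf` is hypothetical.

```python
import math

# PDF table for X = the number of days Nancy attends class per week.
nancy_pdf = {0: 0.01, 1: 0.04, 2: 0.15, 3: 0.80}

def is_discrete_pdf(pdf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pdf.values())
    sums_to_one = math.isclose(sum(pdf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

print(is_discrete_pdf(nancy_pdf))
```

A small tolerance is used because decimal probabilities are not stored exactly in floating point.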

## Glossary

Probability Distribution Function (PDF)

A mathematical description of a discrete random variable (RV), given either in the form of an equation (formula), or in the form of a table listing all the possible outcomes of an experiment and the probability associated with each outcome.

Example .

A biased coin with probability 0.7 for a head (in one toss of the coin) is tossed 5 times. We are interested in the number of heads (the RV X = the number of heads). X is binomial, so X ~ B(5, 0.7) and $P(X = x) = \binom{5}{x}(0.7)^{x}(0.3)^{5-x}$, or in the form of the table:

Table 4.4.

| x | P(X = x) |
|---|---|
| 0 | 0.0024 |
| 1 | 0.0284 |
| 2 | 0.1323 |
| 3 | 0.3087 |
| 4 | 0.3602 |
| 5 | 0.1681 |
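Table 4.4 can be regenerated from the binomial formula. The sketch below assumes the standard formula C(n, x) p^x (1 − p)^(n − x); the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Biased coin: 5 tosses, probability 0.7 of a head on each toss.
n, p = 5, 0.7
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in table.items():
    print(x, f"{prob:.4f}")
```

The printed values should agree with the table, possibly differing by one unit in the last decimal place because of rounding.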

# 4.3. Mean or Expected Value and Standard Deviation*

The expected value is often referred to as the “long-term” average or mean. This means that over the long term of doing an experiment over and over, you would expect this average.

The mean of a random variable X is μ . If we do an experiment many times (for instance, flip a fair coin, as Karl Pearson did, 24,000 times and let X = the number of heads) and record the value of X each time, the average gets closer and closer to μ as we keep repeating the experiment. This is known as the Law of Large Numbers.
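The Law of Large Numbers can be illustrated with a simulation. This is only a sketch: it scores each flip as 1 for a head and 0 for a tail, so the long-term average is 0.5, and it uses Python's pseudo-random generator with a fixed seed (an assumption made for repeatability) rather than a real coin.

```python
import random

def proportion_of_heads(num_flips, seed=0):
    """Simulate num_flips tosses of a fair coin; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# The proportion tends to settle near 0.5 as the number of flips grows.
for flips in (10, 100, 1_000, 24_000):
    print(flips, proportion_of_heads(flips))
```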

### Note

To find the expected value or long term average, μ , simply multiply each value of the random variable by its probability and add the products.

A Step-by-Step Example

A men’s soccer team plays soccer 0, 1, or 2 days a week. The probability that they play 0 days is 0.2, the probability that they play 1 day is 0.5, and the probability that they play 2 days is 0.3. Find the long-term average, μ , or expected value of the days per week the men’s soccer team plays soccer.

To do the problem, first let the random variable X = the number of days the men’s soccer team plays soccer per week. X takes on the values 0, 1, 2. Construct a PDF table, adding a column xP(X=x). In this column, you will multiply each x value by its probability.

Table 4.5. Expected Value Table
This table is called an expected value table. The table helps you calculate the expected value or long-term average.

| x | P(X=x) | xP(X=x) |
|---|---|---|
| 0 | 0.2 | (0)(0.2) = 0 |
| 1 | 0.5 | (1)(0.5) = 0.5 |
| 2 | 0.3 | (2)(0.3) = 0.6 |

Add the last column to find the long term average or expected value: (0)(0.2) + (1)(0.5) + (2)(0.3) = 0 + 0.5 + 0.6 = 1.1.

The expected value is 1.1. The men’s soccer team would, on the average, expect to play soccer 1.1 days per week. The number 1.1 is the long term average or expected value if the men’s soccer team plays soccer week after week after week. We say μ = 1.1.
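The rule "multiply each value by its probability and add the products" translates directly into code. A minimal sketch, with the PDF stored as a dictionary and a hypothetical helper named `expected_value`:

```python
# PDF for X = the number of days per week the men's soccer team plays soccer.
soccer_pdf = {0: 0.2, 1: 0.5, 2: 0.3}

def expected_value(pdf):
    """Multiply each value by its probability and add the products."""
    return sum(x * p for x, p in pdf.items())

mu = expected_value(soccer_pdf)
print(round(mu, 4))
```

The same helper applies to any PDF table in this chapter, for example the table for the number of days Nancy attends class.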

Example 4.4.

Find the expected value for the example about the number of times a newborn baby’s crying wakes its mother after midnight. The expected value is the expected number of times a newborn wakes its mother after midnight.

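Table 4.6. You expect a newborn to wake its mother after midnight 2.1 times, on the average.

| x | P(X=x) | xP(X=x) |
|---|---|---|
| 0 | 2/50 | (0)(2/50) = 0 |
| 1 | 11/50 | (1)(11/50) = 11/50 |
| 2 | 23/50 | (2)(23/50) = 46/50 |
| 3 | 9/50 | (3)(9/50) = 27/50 |
| 4 | 4/50 | (4)(4/50) = 16/50 |
| 5 | 1/50 | (5)(1/50) = 5/50 |

Add the last column to find the expected value. μ = Expected Value = 105/50 = 2.1

Problem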

Go back and calculate the expected value for the number of days Nancy attends classes a week. Construct the third column to do so.

Solution

2.74 days a week.

Example 4.5.

Suppose you play a game of chance in which you choose 5 numbers from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. You may choose a number more than once. You pay \$2 to play and could profit \$100,000 if you match all 5 numbers in order (you get your \$2 back plus \$100,000). Over the long term, what is your expected profit of playing the game?

To do this problem, set up an expected value table for the amount of money you can profit.

Let X = the amount of money you profit. The values of x are not 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Since you are interested in your profit (or loss), the values of x are 100,000 dollars and -2 dollars.

To win, you must get all 5 numbers correct, in order. The probability of choosing one correct number is $\frac{1}{10}$ because there are 10 numbers. You may choose a number more than once. The probability of choosing all 5 numbers correctly and in order is:

(4.1) $\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = 1 \times 10^{-5} = 0.00001$

Therefore, the probability of winning is 0.00001 and the probability of losing is

(4.2) 1 – 0.00001 = 0.99999

The expected value table is as follows.

| | x | P(X=x) | xP(X=x) |
|---|---|---|---|
| Loss | -2 | 0.99999 | (-2)(0.99999) = -1.99998 |
| Profit | 100,000 | 0.00001 | (100000)(0.00001) = 1 |

Add the last column: -1.99998 + 1 = -0.99998. Since -0.99998 is about -1, you would, on the average, expect to lose approximately one dollar for each game you play. However, each time you play, you either lose \$2 or profit \$100,000. The \$1 is the average or expected LOSS per game after playing this game over and over.
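The expected profit for this game can be checked with exact fractions, which avoids any rounding in the very small winning probability. The variable names below are illustrative.

```python
from fractions import Fraction

# Probability of matching all 5 digits in order; each digit has 10 choices.
p_win = Fraction(1, 10) ** 5
p_lose = 1 - p_win

# X = profit in dollars: +100,000 on a win, -2 on a loss.
expected_profit = 100_000 * p_win + (-2) * p_lose
print(expected_profit, float(expected_profit))
```

The exact result is −49999/50000 dollars per game, which is the −0.99998 obtained from the table.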

Example 4.6.

Suppose you play a game with a biased coin. You play each game by tossing the coin once. $P(\text{heads}) = \frac{2}{3}$ and $P(\text{tails}) = \frac{1}{3}$. If you toss a head, you pay \$6. If you toss a tail, you win \$10. If you play this game many times, will you come out ahead?

Problem 1. (Go to Solution)

Define a random variable X .

Problem 2. (Go to Solution)

Complete the following expected value table.

| | x | ____ | ____ |
|---|---|---|---|
| WIN | 10 | 1/3 | ____ |
| LOSE | ____ | ____ | -12/3 |

Problem 3. (Go to Solution)

What is the expected value, μ ? Do you come out ahead?

Like data, probability distributions have standard deviations. To calculate the standard deviation (σ) of a probability distribution, find each deviation from the expected value, square it, multiply it by its probability, add the products, and take the square root. To understand how to do the calculation, look at the table for the number of days per week a men’s soccer team plays soccer. To find the standard deviation, add the entries in the column labeled (x − μ)²P(X=x) and take the square root.

| x | P(X=x) | xP(X=x) | (x − μ)²P(X=x) |
|---|---|---|---|
| 0 | 0.2 | (0)(0.2) = 0 | (0 − 1.1)²(0.2) = 0.242 |
| 1 | 0.5 | (1)(0.5) = 0.5 | (1 − 1.1)²(0.5) = 0.005 |
| 2 | 0.3 | (2)(0.3) = 0.6 | (2 − 1.1)²(0.3) = 0.243 |

Add the last column in the table. 0.242 + 0.005 + 0.243 = 0.490. The standard deviation is the square root of 0.49: $\sigma = \sqrt{0.49} = 0.7$.

Generally for probability distributions, we use a calculator or a computer to calculate μ and σ to reduce roundoff error. For some probability distributions, there are short-cut formulas that calculate μ and σ.
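The standard deviation calculation for the soccer example can be carried out by computer as follows. This is a sketch with a hypothetical helper, `mean_and_sd`, that follows the recipe in the text step by step.

```python
from math import sqrt

# PDF for X = the number of days per week the men's soccer team plays soccer.
soccer_pdf = {0: 0.2, 1: 0.5, 2: 0.3}

def mean_and_sd(pdf):
    """Return (mu, sigma) for a discrete PDF given as {value: probability}."""
    mu = sum(x * p for x, p in pdf.items())
    # Square each deviation, weight it by its probability, and add.
    variance = sum((x - mu) ** 2 * p for x, p in pdf.items())
    return mu, sqrt(variance)

mu, sigma = mean_and_sd(soccer_pdf)
print(round(mu, 4), round(sigma, 4))
```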

## Solutions to Exercises

X = amount of profit

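| | x | P(X=x) | xP(X=x) |
|---|---|---|---|
| WIN | 10 | 1/3 | 10/3 |
| LOSE | -6 | 2/3 | -12/3 |

Add the last column of the table. The expected value is $\mu = \frac{10}{3} - \frac{12}{3} = -\frac{2}{3} \approx -0.67$. You lose, on average, about 67 cents each time you play the game, so you do not come out ahead.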

## Glossary

Expected Value

Expected arithmetic average when an experiment is repeated many times. (Also called the mean.) Notations: E(x), μ. For a discrete random variable (RV) with probability distribution function P(x), the definition can also be written in the form E(x) = μ = ∑ xP(X = x).

Mean

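A number that measures the central tendency. A common name for mean is ‘average.’ The term ‘mean’ is a shortened form of ‘arithmetic mean.’ By definition, the mean for a sample (denoted by $\bar{x}$) is $\bar{x} = \frac{\text{sum of all values in the sample}}{\text{number of values in the sample}}$, and the mean for a population (denoted by μ) is $\mu = \frac{\text{sum of all values in the population}}{\text{number of values in the population}}$.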

# 4.4. Common Discrete Probability Distribution Functions*

Some of the more common discrete probability functions are binomial, geometric, hypergeometric, and Poisson. Most elementary courses do not cover the geometric, hypergeometric, and Poisson. Your instructor will let you know if he or she wishes to cover these distributions.

A probability distribution function is a pattern. You try to fit a probability problem into a pattern or distribution in order to perform the necessary calculations. These distributions are tools to make solving probability problems easier. Each distribution has its own special characteristics. Learning the characteristics enables you to distinguish among the different distributions.

# 4.5. Binomial*

The characteristics of a binomial experiment are:

1. There are a fixed number of trials. Think of trials as repetitions of an experiment. The letter n denotes the number of trials.

2. There are only 2 possible outcomes, called “success” and “failure”, for each trial. The letter p denotes the probability of a success on one trial and q denotes the probability of a failure on one trial. p + q = 1.

3. The n trials are independent and are repeated using identical conditions. Because the n trials are independent, the outcome of one trial does not affect the outcome of any other trial. Another way of saying this is that for each individual trial, the probability, p, of a success and probability, q, of a failure remain the same. For example, randomly guessing at a true-false statistics question has only two outcomes. If a success is guessing correctly, then a failure is guessing incorrectly. Suppose Joe always guesses correctly on any statistics true-false question with probability p = 0.6. Then, q = 0.4. This means that for every true-false statistics question Joe answers, his probability of success (p = 0.6) and his probability of failure (q = 0.4) remain the same.

The outcomes of a binomial experiment fit a binomial probability distribution. The random variable X = the number of successes obtained in the n independent trials.

The mean, μ, and variance, σ², for the binomial probability distribution are μ = np and σ² = npq. The standard deviation, σ, is then $\sigma = \sqrt{npq}$.

Any experiment that has characteristics 2 and 3 is called a Bernoulli Trial (named after Jacob Bernoulli who, in the late 1600s, studied them extensively). A binomial experiment takes place when the number of successes is counted in one or more Bernoulli Trials.

Example 4.7.

At ABC College, the withdrawal rate from an elementary physics course is 30% for any given term. This implies that, for any given term, 70% of the students stay in the class for the entire term. A “success” could be defined as an individual who withdrew. The random variable is X = the number of students who withdraw from the elementary physics course per term.

Example 4.8.

Suppose you play a game that you can only either win or lose. The probability that you win any game is 55% and the probability that you lose is 45%. If you play the game 20 times, what is the probability that you win 15 of the 20 games? Here, if you define X = the number of wins, then X takes on the values X = 0, 1, 2, 3, …, 20. The probability of a success is p = 0.55. The probability of a failure is q = 0.45. The number of trials is n = 20. The probability question can be stated mathematically as P (X = 15).

Example 4.9.

A fair coin is flipped 15 times. What is the probability of getting more than 10 heads? Let X = the number of heads in 15 flips of the fair coin. X takes on the values x = 0, 1, 2, 3, …, 15. Since the coin is fair, p = 0.5 and q = 0.5. The number of trials is n = 15. The probability question can be stated mathematically as P( X > 10).
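The probability questions in Examples 4.8 and 4.9 can be evaluated with the standard binomial formula, P(X = x) = C(n, x) p^x q^(n − x). The sketch below is one way to do this in Python; the function name `binomial_pmf` is an illustrative choice, and exact fractions are used for the fair coin.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * q^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example 4.8: P(X = 15) with n = 20 games and p = 0.55.
p_15_wins = binomial_pmf(15, 20, 0.55)

# Example 4.9: P(X > 10) with n = 15 flips and p = 1/2, kept exact.
p_more_than_10_heads = sum(
    binomial_pmf(x, 15, Fraction(1, 2)) for x in range(11, 16)
)

print(p_15_wins)
print(p_more_than_10_heads, float(p_more_than_10_heads))
```

For the fair coin the exact answer is 1941/32768, or about 0.059.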

Example 4.10.

Approximately 70% of statistics students do their homework in time for it to be collected and graded. In a statistics class of 50 students, what is the probability that at least 40 will do their homework on time?

Problem 1. (Go to Solution)

This is a binomial problem because there is only a success or a __________, there are a definite number of trials, and the probability of a success is 0.70 for each trial.

Problem 2. (Go to Solution)

If we are interested in the number of students who do their homework, then how do we define X ?

Problem 3. (Go to Solution)

What values does X take on?

Problem 4. (Go to Solution)

What is a “failure”, in words?

The probability of a success is p = 0.70. The number of trials is n = 50.

Problem 5. (Go to Solution)

If p + q = 1, then what is q ?

Problem 6. (Go to Solution)

The words “at least” translate as what kind of inequality?

The probability question is P( X ≥ 40).

## Notation for the Binomial: B = Binomial Probability Distribution Function

X ~ B( n,p)

Read this as “X is a random variable with a binomial distribution.” The parameters are n and p: n = number of trials, p = probability of a success on each trial.

Example 4.11.

It has been stated that about 41% of adult workers have a high school diploma but do not pursue any further education. If 20 adult workers are randomly selected, find the probability that at most 12 of them have a high school diploma but do not pursue any further education. How many adult workers do you expect to have a high school diploma but do not pursue any further education?

Let X = the number of workers who have a high school diploma but do not pursue any further education.

X takes on the values 0, 1, 2, …, 20 where n = 20 and p = 0.41. q = 1 - 0.41 = 0.59. X ~ B( 20,0.41)

Find P( X ≤ 12). P(X ≤ 12) = 0.9738. (calculator or computer)

Using the TI-83+ or the TI-84 calculators, the calculations are as follows. Go into 2nd DISTR. The syntax for the instructions is as follows.

To calculate P(X = value): binompdf(n, p, number). If “number” is left out, the result is the binomial probability table.

To calculate P(X ≤ value): binomcdf(n, p, number). If “number” is left out, the result is the cumulative binomial probability table.

For this problem: After you are in 2nd DISTR, arrow down to A:binomcdf. Press ENTER. Enter 20,.41,12). The result is P(X ≤ 12) = 0.9738.

### Note

If you want to find P (X = 12), use the pdf (0:binompdf). If you want to find P (X > 12), use 1 - binomcdf(20,.41,12).

The probability that at most 12 workers have a high school diploma but do not pursue any further education is 0.9738.

[Figure: graph of X ~ B(20, 0.41). The y-axis contains the probability of X, where X = the number of workers who have only a high school diploma.]

The number of adult workers that you expect to have a high school diploma but not pursue any further education is the mean, μ = np = (20) (0.41) = 8.2.

The formula for the variance is σ² = npq. The standard deviation is $\sigma = \sqrt{npq} = \sqrt{(20)(0.41)(0.59)} \approx 2.20$.
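For readers without a TI calculator, the results of Example 4.11 can be reproduced in Python. This is a sketch: `binom_pmf` and `binom_cdf` are hypothetical helpers that play the roles of the calculator's binompdf and binomcdf, not functions from any particular library.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x), the analogue of binompdf(n, p, x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), the analogue of binomcdf(n, p, x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 20, 0.41
p_at_most_12 = binom_cdf(12, n, p)
p_more_than_12 = 1 - p_at_most_12   # complement rule from the Note
mu = n * p                          # expected number of workers
sigma = sqrt(n * p * (1 - p))       # standard deviation

print(round(p_at_most_12, 4), round(mu, 1), round(sigma, 2))
```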

Example 4.12.

The following example illustrates a problem that is not binomial. It violates the condition of independence. ABC College has a student advisory committee made up of 10 staff members and 6 students. The committee wishes to choose a chairperson and a recorder. What is the probability that the chairperson and recorder are both students? All names of the committee are put into a box and two names are drawn without replacement. The first name drawn determines the chairperson and the second name the recorder. There are two trials. However, the trials are not independent because the outcome of the first trial affects the outcome of the second trial. The probability of a student on the first draw is $\frac{6}{16}$. The probability of a student on the second draw is $\frac{5}{15}$, when the first draw produces a student. The probability is $\frac{6}{15}$ when the first draw produces a staff member. The probability of drawing a student’s name changes for each of the trials and, therefore, violates the condition of independence.
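The changing probabilities in this example, and the answer to the question it poses, can be computed with exact fractions. The sketch below applies the multiplication rule for dependent events; the variable names are illustrative.

```python
from fractions import Fraction

students, staff = 6, 10
total = students + staff

# First draw: 6 students among 16 names.
p_first_student = Fraction(students, total)

# Second draw: the box now holds 15 names.
p_second_student_given_student = Fraction(students - 1, total - 1)
p_second_student_given_staff = Fraction(students, total - 1)

# Multiplication rule for dependent events.
p_both_students = p_first_student * p_second_student_given_student

print(p_first_student, p_second_student_given_student,
      p_second_student_given_staff, p_both_students)
```

Because the second-draw probability depends on the first draw, the binomial formula does not apply here.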

## Solutions to Exercises

failure

X = the number of statistics students who do their homework on time

0, 1, 2, …, 50

Failure is a student who does not do his or her homework on time.

q = 0.30

greater than or equal to (≥)

## Glossary

Bernoulli Trials

An experiment with the following characteristics:

• There are only 2 possible outcomes called “success” and “failure” for each trial.

• The probability p of a success is the same for any trial (so the probability q = 1 – p of a failure is the same for any trial).

Binomial Distribution

A discrete random variable (RV) which arises from Bernoulli trials. There are a fixed number, n, of independent trials. “Independent” means that the result of any trial (for example, trial 1) does not affect the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV X is defined as the number of successes in n trials. The notation is: X ~ B(n, p). The mean is μ = np and the standard deviation is $\sigma = \sqrt{npq}$. The probability of exactly x successes in n trials is $P(X = x) = \binom{n}{x} p^{x} q^{n-x}$.

# 4.6. Geometric (optional)*

The characteristics of a geometric experiment are:

1. There are one or more Bernoulli trials with all failures except the last one, which is a success. In other words, you keep repeating what you are doing until the first success. Then you stop. For example, you throw a dart at a bull’s eye until you hit the bull’s eye. The first time you hit the bull’s eye is a “success” so you stop throwing the dart. It might take you 6 tries until you hit the bull’s eye. You can think of the trials as failure, failure, failure, failure, failure, success. STOP.

2. In theory, the number of trials could go on forever. There must be at least one trial.

3. The probability, p, of a success and the probability, q, of a failure are the same for each trial. p + q = 1 and q = 1 – p. For example, the probability of rolling a 3 when you throw one fair die is $\frac{1}{6}$. This is true no matter how many times you roll the die. Suppose you want to know the probability of getting the first 3 on the fifth roll. On rolls 1, 2, 3, and 4, you do not get a face with a 3. The probability for each of rolls 1, 2, 3, and 4 is $q = \frac{5}{6}$, the probability of a failure. The probability of getting a 3 on the fifth roll is $\frac{5}{6} \cdot \frac{5}{6} \cdot \frac{5}{6} \cdot \frac{5}{6} \cdot \frac{1}{6} \approx 0.0804$.

The outcomes of a geometric experiment fit a geometric probability distribution. The random variable X = the number of independent trials until the first success. The mean and variance are in the summary in this chapter.
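The "first 3 on the fifth roll" calculation is four failures followed by one success. A minimal Python sketch, using exact fractions and a hypothetical helper named `geometric_pmf`:

```python
from fractions import Fraction

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success on trial x."""
    return (1 - p) ** (x - 1) * p

p_three = Fraction(1, 6)  # probability of rolling a 3 with one fair die
p_first_three_on_fifth_roll = geometric_pmf(5, p_three)

print(p_first_three_on_fifth_roll, float(p_first_three_on_fifth_roll))
```

The exact value is 625/7776, which is about 0.0804.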

Example 4.13.

You play a game of chance that you can either win or lose (there are no other possibilities) until you lose. Your probability of losing is p = 0.57. What is the probability that it takes 5 games until you lose? Let X = the number of games you play until you lose (includes the losing game). Then X takes on the values 1, 2, 3, … (could go on indefinitely). The probability question is P(X = 5).

Example 4.14.

A safety engineer feels that 35% of all industrial accidents in her plant are caused by failure of employees to follow instructions. She decides to look at the accident reports until she finds one that shows an accident caused by failure of employees to follow instructions. On the average, how many reports would the safety engineer expect to look at until she finds a report showing an accident caused by employee failure to follow instructions? What is the probability that the safety engineer will have to examine at least 3 reports until she finds a report showing an accident caused by employee failure to follow instructions?

Let X = the number of accidents the safety engineer must examine until she finds a report showing an accident caused by employee failure to follow instructions. X takes on the values 1, 2, 3, …. The first question asks you to find the expected value or the mean. The second question asks you to find P(X ≥ 3). (“At least” translates as a “greater than or equal to” symbol).

Example 4.15.

Suppose that you are looking for a chemistry lab partner. The probability that someone agrees to be your lab partner is 0.55. Since you need a lab partner very soon, you ask every chemistry student you are acquainted with until one says that he/she will be your lab partner. What is the probability that the fourth person says yes?

This is a geometric problem because you may have a number of failures before you have the one success you desire. Also, the probability of a success stays the same each time you ask a chemistry student to be your lab partner. There is no definite number of trials (number of times you ask a chemistry student to be your partner).

Problem 1.

Let X = the number of ____________ you must ask ____________ one says yes.

Solution

Let X = the number of chemistry students you must ask until one says yes.

Problem 2. (Go to Solution)

What values does X take on?

Problem 3. (Go to Solution)

What are p and q ?

Problem 4. (Go to Solution)

The probability question is P(_______).

## Notation for the Geometric: G = Geometric Probability Distribution Function

X ~ G(p)

Read this as “ X is a random variable with a geometric distribution.” The parameter is p . p = the probability of a success for each trial.

Example 4.16.

Assume that the probability of a defective computer component is 0.02. Find the probability that the first defect is caused by the 7th component tested. How many components do you expect to test until one is found to be defective?

Let X = the number of computer components tested until the first defect is found.

X takes on the values 1, 2, 3, … where p = 0.02. X ~ G(0.02)

Find P(X = 7). P(X = 7) = 0.0177. (calculator or computer)

TI-83+ and TI-84: For a general discussion, see the binomial example (Example 4.11). The syntax is similar. The geometric parameter list is (p, number). If “number” is left out, the result is the geometric probability table. For this problem: After you are in 2nd DISTR, arrow down to D:geometpdf. Press ENTER. Enter .02,7). The result is P(X = 7) = 0.0177.

The probability that the 7th component is the first defect is 0.0177.

[Figure: graph of X ~ G(0.02). The y-axis contains the probability of X, where X = the number of computer components tested.]

The number of components that you would expect to test until you find the first defective one is the mean, μ = 50.
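The numbers in Example 4.16 can be reproduced without a calculator. The sketch below assumes the standard geometric results, mean 1/p and variance (1/p)(1/p − 1); the variable names are illustrative.

```python
from math import sqrt

p = 0.02  # probability that a component is defective

# First defect on the 7th component: 6 good components, then 1 defective.
p_first_defect_on_7th = (1 - p) ** 6 * p

mu = 1 / p                             # expected number of components tested
sigma = sqrt((1 / p) * (1 / p - 1))    # standard deviation

print(round(p_first_defect_on_7th, 4), mu, round(sigma, 1))
```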

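The formula for the mean is $\mu = \frac{1}{p} = \frac{1}{0.02} = 50$.

The formula for the variance is $\sigma^2 = \frac{1}{p}\left(\frac{1}{p} - 1\right) = \frac{1}{0.02}\left(\frac{1}{0.02} - 1\right) = 2450$.

The standard deviation is $\sigma = \sqrt{\frac{1}{p}\left(\frac{1}{p} - 1\right)} = \sqrt{2450} \approx 49.5$.

## Solutions to Exercises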

1, 2, 3, …, (total number of chemistry students)

• p = 0.55

• q = 0.45

P ( X  =  4 )

## Glossary

Geometric Distribution

A discrete random variable (RV) which arises from the Bernoulli trials. The trials are repeated until the first success. The geometric variable X is defined as the number of trials until the first success. Notation: X ~ G(p). The mean is $\mu = \frac{1}{p}$ and the standard deviation is $\sigma = \sqrt{\frac{1}{p}\left(\frac{1}{p} - 1\right)}$. The probability that the first success occurs on trial x (that is, after exactly x − 1 failures) is given by the formula: $P(X = x) = p(1 - p)^{x-1}$.

# 4.7. Hypergeometric (optional)*

The characteristics of a hypergeometric experiment are:

1. You take samples from 2 groups.

2. You are concerned with a group of interest, called the first group.

3. You sample without replacement from the combined groups. For example, you want to choose a softball team from a combined group of 11 men and 13 women. The team consists of 10 players.

4. Each pick is not independent, since sampling is without replacement. In the softball example, the probability of picking a woman first is $\frac{13}{24}$. The probability of picking a man second is $\frac{11}{23}$ if a woman was picked first. It is $\frac{10}{23}$ if a man was picked first. The probability of the second pick depends on what happened in the first pick.

5. You are not dealing with Bernoulli Trials.

The outcomes of a hypergeometric experiment fit a hypergeometric probability distribution. The random variable X = the number of items from the group of interest. The mean and variance are given in the summary.

Example 4.17.

A candy dish contains 100 jelly beans and 80 gumdrops. Fifty candies are picked at random. What is the probability that 35 of the 50 are gumdrops? The two groups are jelly beans and gumdrops. Since the probability question asks for the probability of picking gumdrops, the group of interest (first group) is gumdrops. The size of the group of interest (first group) is 80. The size of the second group is 100. The size of the sample is 50 (jelly beans or gumdrops). Let X = the number of gumdrops in the sample of 50. X takes on the values x = 0, 1, 2, …, 50. The probability question is P(X = 35).

Example 4.18.

Suppose a shipment of 100 VCRs is known to have 10 defective VCRs. An inspector chooses 12 for inspection. He is interested in determining the probability that, among the 12, at most 2 are defective. The two groups are the 90 non-defective VCRs and the 10 defective VCRs. The group of interest (first group) is the defective group because the probability question asks for the probability of at most 2 defective VCRs. The size of the sample is 12 VCRs. (They may be non-defective or defective.) Let X = the number of defective VCRs in the sample of 12. X takes on the values 0, 1, 2, …, 10. X may not take on the values 11 or 12. The sample size is 12, but there are only 10 defective VCRs. The inspector wants to know P(X ≤ 2) (“At most” means “less than or equal to”).

Example 4.19.

You are president of an on-campus special events organization. You need a committee of 7 to plan a special birthday party for the president of the college. Your organization consists of 18 women and 15 men. You are interested in the number of men on your committee. What is the probability that your committee has more than 4 men?

This is a hypergeometric problem because you are choosing your committee from two groups (men and women).

Problem 1. (Go to Solution)

Are you choosing with or without replacement?

Problem 2. (Go to Solution)

What is the group of interest?

Problem 3. (Go to Solution)

How many are in the group of interest?

Problem 4. (Go to Solution)

How many are in the other group?

Problem 5. (Go to Solution)

Let X = _________ on the committee. What values does X take on?

Problem 6. (Go to Solution)

The probability question is P(_______).

## Notation for the Hypergeometric: H = Hypergeometric Probability Distribution Function

X ~H(r, b, n)

Read this as “ X is a random variable with a hypergeometric distribution.” The parameters are r , b , and n . r = the size of the group of interest (first group), b = the size of the second group, n = the size of the chosen sample

Example 4.20.

A school site committee is to be chosen from 6 men and 5 women. If the committee consists of 4 members, what is the probability that 2 of them are men? How many men do you expect to be on the committee?

Let X = the number of men on the committee of 4. The men are the group of interest (first group).

X takes on the values 0, 1, 2, 3, 4, where r = 6 , b = 5 , and n = 4 . X ~ H(6, 5, 4)

Find P (X = 2 ). P (X = 2 ) = 0.4545 (calculator or computer)

### Note

Currently, the TI-83+ and TI-84 do not have hypergeometric probability functions. There are a number of computer packages, including Microsoft Excel, that do.

The probability that there are 2 men on the committee is about 0.45.

[Figure: graph of X ~ H(6, 5, 4). The y-axis contains the probability of X, where X = the number of men on the committee.]

You would expect μ = 2.18 (about 2) men on the committee.

The formula for the mean is $\mu = \frac{nr}{r + b} = \frac{(4)(6)}{6 + 5} \approx 2.18$. The formula for the variance is fairly complex. You will find it in the Summary of the Discrete Probability Functions Chapter.
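Since the TI-83+ and TI-84 lack a hypergeometric function, Example 4.20 is a natural case for a computer. The sketch below counts committees directly with combinations, using the standard counting formula C(r, x) C(b, n − x) / C(r + b, n); the helper name `hypergeometric_pmf` is an assumption for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(x, r, b, n):
    """P(X = x) for X ~ H(r, b, n): x chosen from the group of interest
    (size r) and n - x chosen from the other group (size b)."""
    return Fraction(comb(r, x) * comb(b, n - x), comb(r + b, n))

r, b, n = 6, 5, 4  # 6 men, 5 women, committee of 4
p_two_men = hypergeometric_pmf(2, r, b, n)
mu = Fraction(n * r, r + b)  # expected number of men

print(p_two_men, float(p_two_men))
print(mu, float(mu))
```

The exact probability is 5/11, about 0.4545, and the exact mean is 24/11, about 2.18.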

## Solutions to Exercises

Without

The men

15 men

18 women

Let X = the number of men on the committee. X = 0, 1, 2, …, 7.

P(X>4)

## Glossary

Hypergeometric Distribution

A discrete random variable (RV) that is characterized by

• A fixed number of trials.

• The probability of success is not the same from trial to trial.

We sample from two groups of items when we are interested in only one group. X is defined as the number of successes out of the total number of items chosen. Notation: X ~ H(r, b, n), where r = the number of items in the group of interest, b = the number of items in the group not of interest, and n = the number of items chosen.

# 4.8. Poisson*

Characteristics of a Poisson experiment are:

1. You are interested in the number of times something happens in a certain interval. For example, a book editor might be interested in the number of words spelled incorrectly in a particular book. It might be that, on the average, there are 5 words spelled incorrectly in 100 pages. The interval is the 100 pages.

2. The Poisson may be derived from the binomial if the probability of success is “small” (such as 0.01) and the number of trials is “large” (such as 1000). You will verify the relationship in the homework exercises. n is the number of trials and p is the probability of a “success.”

The outcomes of a Poisson experiment fit a Poisson probability distribution. The random variable X = the number of occurrences in the interval of interest. The mean and variance are given in the summary.

Example 4.21.

The average number of loaves of bread put on a shelf in a bakery in a half-hour period is 12. What is the probability that the number of loaves put on the shelf in 5 minutes is 3? Of interest is the number of loaves of bread put on the shelf in 5 minutes. The time interval of interest is 5 minutes.

Let X = the number of loaves of bread put on the shelf in 5 minutes. If the average number of loaves put on the shelf in 30 minutes (half-hour) is 12, then the average number of loaves put on the shelf in 5 minutes is $\left(\frac{5}{30}\right)(12) = 2$ loaves of bread.

The probability question asks you to find P(X = 3).
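As a rough check of Example 4.21, the sketch below rescales the mean to the 5-minute interval and evaluates P(X = 3). It assumes the standard Poisson formula P(X = x) = e^(−μ) μ^x / x!; the helper name `poisson_pmf` is illustrative only.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

mu = (5 / 30) * 12            # 12 loaves per 30 minutes, rescaled to 5 minutes
p_three = poisson_pmf(3, mu)  # probability of exactly 3 loaves in 5 minutes

print(round(p_three, 4))      # 0.1804
```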

Example 4.22.

A certain bank expects to receive 6 bad checks per day. What is the probability of the bank getting fewer than 5 bad checks on any given day? Of interest is the number of checks the bank receives in 1 day, so the time interval of interest is 1 day. Let X = the number of bad checks the bank receives in one day. If the bank expects to receive 6 bad checks per day then the average is 6 checks per day. The probability question asks for P(X < 5) .

Example 4.23.

Your math instructor expects you to complete 2 pages of written math homework every day. What is the probability that you complete more than 2 pages a day?

This is a Poisson problem because your instructor is interested in knowing the number of pages of written math homework you complete in a day.

Problem 1. (Go to Solution)

What is the interval of interest?

Problem 2. (Go to Solution)

What is the average number of pages you should do in one day?

Problem 3. (Go to Solution)

Let X = ____________. What values does X take on?

Problem 4. (Go to Solution)

The probability question is P(______).

## Notation for the Poisson: P = Poisson Probability Distribution Function

X ~ P(μ)

Read this as “ X is a random variable with a Poisson distribution.” The parameter is μ (or λ ). μ (or λ ) = the mean for the interval of interest.

Example 4.24.

Leah’s answering machine receives about 6 telephone calls between 8 a.m. and 10 a.m. What is the probability that Leah receives more than 1 call in the next 15 minutes?

Let X = the number of calls Leah receives in 15 minutes. (The interval of interest is 15 minutes or 1/4 hour.)

X takes on the values 0, 1, 2, 3, …

If Leah receives, on the average, 6 telephone calls in 2 hours, and there are eight 15-minute intervals in 2 hours, then Leah receives $\left(\frac{1}{8}\right)(6) = 0.75$ calls in 15 minutes, on the average. So, μ = 0.75 for this problem.

X ~ P(0.75)

Find P(X > 1) . P(X > 1) = 0.1734 (calculator or computer)

TI-83+ and TI-84: For a general discussion, see this example (Binomial) . The syntax is similar. The Poisson parameter list is ( μ for the interval of interest, number). For this problem:

Press 1- and then press 2nd DISTR. Arrow down to C:poissoncdf. Press ENTER. Enter .75,1). The result is P(X > 1) = 0.1734 . NOTE: The TI calculators use λ (lambda) for the mean.

The probability that Leah receives more than 1 telephone call in the next fifteen minutes is about 0.1734.

In the graph of X ~ P(0.75), the y-axis contains the probability of X, where X = the number of calls in 15 minutes.
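The calculator keystrokes in Example 4.24 compute 1 − poissoncdf(0.75, 1). The same complement step can be sketched in Python; `poisson_cdf` below is a hypothetical stand-in for the calculator function, built from the standard Poisson formula.

```python
from math import exp, factorial

def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson random variable with mean mu."""
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(k + 1))

mu = 6 / 8                                # 6 calls in 2 hours = eight 15-minute intervals
p_more_than_1 = 1 - poisson_cdf(1, mu)    # P(X > 1) = 1 - P(X <= 1)

print(round(p_more_than_1, 4))            # 0.1734
```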

## Solutions to Exercises

One day

2

Let X = the number of pages of written math homework you do per day.

P(X > 2)

## Glossary

Poisson Distribution

A discrete random variable (RV) that counts the number of times a certain event will occur in a specific interval. Characteristics of the variable:

• The probability that the event occurs in a given interval is the same for all intervals.

• The events occur with a known mean and independently of the time since the last event.

The distribution is defined by the mean μ of the event in the interval. Notation: X ~ P(μ). The mean is μ (when the Poisson is used to approximate the binomial, μ = np). The standard deviation is $\sigma = \sqrt{\mu}$. The probability of having exactly x occurrences in the interval is $P(X = x) = \dfrac{e^{-\mu}\mu^{x}}{x!}$. The Poisson distribution is often used to approximate the binomial distribution when n is “large” and p is “small” (a general rule is that n should be greater than or equal to 20 and p should be less than or equal to 0.05).

# 4.9. Summary of Functions*

Formula 4.1. Binomial

X ~ B(n,p)

X = the number of successes in n independent trials

n = the number of independent trials

X takes on the values x = 0,1, 2, 3, …, n

p = the probability of a success for any trial

q = the probability of a failure for any trial

The mean is μ = np. The standard deviation is $\sigma = \sqrt{npq}$.
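To see the binomial summary in action, the sketch below applies it to the 10-question true-false quiz from the chapter introduction (n = 10, p = 0.5, passing means at least 7 correct). It assumes the standard formulas P(X = x) = C(n, x) p^x q^(n − x), μ = np, and σ = √(npq); the function name is illustrative.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5                     # 10 true-false questions, random guessing
mean = n * p                       # mu = np
std_dev = sqrt(n * p * (1 - p))    # sigma = sqrt(npq)
p_at_least_7 = sum(binomial_pmf(x, n, p) for x in range(7, n + 1))

print(mean, round(std_dev, 4), round(p_at_least_7, 4))  # 5.0 1.5811 0.1719
```

A guessing student therefore passes only about 17% of the time.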

Formula 4.2. Geometric

X ~ G (p)

X = the number of independent trials until the first success (count the failures and the first success)

X takes on the values x = 1, 2, 3, …

p = the probability of a success for any trial

q = the probability of a failure for any trial

p + q = 1

q = 1 – p

The mean is $\mu = \dfrac{1}{p}$.

The standard deviation is $\sigma = \sqrt{\dfrac{1}{p}\left(\dfrac{1}{p} - 1\right)}$.

Formula 4.3. Hypergeometric

X ~ H (r, b, n)

X = the number of items from the group of interest that are in the chosen sample.

X may take on the values x = 0, 1, …, up to the size of the group of interest. (The minimum value for X may be larger than 0 in some instances.)

r = the size of the group of interest (first group)

b = the size of the second group

n = the size of the chosen sample.

n ≤ r + b

The mean is: $\mu = \dfrac{nr}{r + b}$

The standard deviation is: $\sigma = \sqrt{\dfrac{rbn(r + b - n)}{(r + b)^{2}(r + b - 1)}}$

Formula 4.4. Poisson

X ~ P(μ)

X = the number of occurrences in the interval of interest

X takes on the values x = 0, 1, 2, 3, …

The mean μ is typically given. (λ is often used as the mean instead of μ.) When the Poisson is used to approximate the binomial, we use the binomial mean μ = np, where n is the binomial number of trials and p = the probability of a success for each trial. This approximation is valid when n is “large” and p is “small” (a general rule is that n should be greater than or equal to 20 and p should be less than or equal to 0.05). If n is large enough and p is small enough, then the Poisson approximates the binomial very well. The standard deviation is $\sigma = \sqrt{\mu}$.
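The claim that the Poisson approximates the binomial for large n and small p can be explored numerically. The sketch below uses n = 1000 and p = 0.01 (the sample sizes mentioned in the Poisson section) and compares the two probability functions side by side. Both helper functions are illustrative and assume the standard binomial and Poisson formulas.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """P(X = x) for X ~ P(mu)."""
    return exp(-mu) * mu**x / factorial(x)

n, p = 1000, 0.01
mu = n * p                          # binomial mean used as the Poisson mean

# Largest pointwise difference between the two distributions for x = 0..30
largest_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(31)
)

for x in (5, 10, 15):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, mu), 4))
print(largest_gap < 0.005)          # the two columns stay close together
```

Repeating the comparison with a smaller n or a larger p should show the gap widening, which is the behavior the general rule is meant to guard against.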

# 4.10. Practice 1: Discrete Distribution*

## Student Learning Objectives

• The student will investigate the properties of a discrete distribution.

## Given:

A ballet instructor is interested in knowing what percent of each year’s class will continue on to the next, so that she can plan what classes to offer. Over the years, she has established the following probability distribution.

• Let X = the number of years a student will study ballet with the teacher.

• Let P(X = x) = the probability that a student will study ballet x years.

## Organize the Data

Complete the table below using the data provided.

Table 4.11.

| x | P(X=x) | x*P(X=x) |
|---|---|---|
| 1 | 0.10 | |
| 2 | 0.05 | |
| 3 | 0.10 | |
| 4 | | |
| 5 | 0.30 | |
| 6 | 0.20 | |
| 7 | 0.10 | |
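The bookkeeping behind a table like this can be sketched in a few lines. To avoid giving away the exercise, the example below uses a different distribution: X = the number of heads in three fair coin tosses, from the start of the chapter. The variable names are illustrative.

```python
# x -> P(X = x) for X = number of heads in three fair coin tosses
pdf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

total_probability = sum(pdf.values())            # sum of the P(X=x) column
products = {x: x * p for x, p in pdf.items()}    # the x*P(X=x) column
expected_value = sum(products.values())          # long-term average of X

print(total_probability)  # 1.0
print(expected_value)     # 1.5
```

The same three steps apply to the ballet table once the missing probability has been filled in.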

Exercise 4.10.1.

In words, define the Random Variable X .

Exercise 4.10.2.

P(X = 4) =

Exercise 4.10.3.

P ( X < 4 ) =

Exercise 4.10.4.

On average, how many years would you expect a child to study ballet with this teacher?

## Discussion Question

Exercise 4.10.5.

What does the column “P(X=x)” sum to and why?

Exercise 4.10.6.

What does the column “ x * P(X=x)” sum to and why?

# 4.11. Practice 2: Binomial Distribution*

## Student Learning Outcomes

• The student will practice constructing Binomial Distributions.

## Given

The Higher Education Research Institute at UCLA surveyed more than 263,000 incoming freshmen from 385 colleges. 36.7% of first-generation college students expected to work full-time while in college. (Source: Eric Hoover, The Chronicle of Higher Education, 2/3/2006). Suppose that you randomly pick 8 first-generation college freshmen from the survey. You are interested in the number who expect to work full-time while in college.

## Interpret the Data

Exercise 4.11.1. (Go to Solution)

In words, define the random Variable X.

Exercise 4.11.2. (Go to Solution)

X ~___________

Exercise 4.11.3. (Go to Solution)

What values does X take on?

Exercise 4.11.4.

Construct the probability distribution function (PDF) for X .

 x P(X=x)

Exercise 4.11.5. (Go to Solution)

On average (μ), how many would you expect to answer yes?

Exercise 4.11.6. (Go to Solution)

What is the standard deviation (σ)?

Exercise 4.11.7. (Go to Solution)

What is the probability that at most 5 of the freshmen expect to work full-time?

Exercise 4.11.8. (Go to Solution)

What is the probability that at least 2 of the freshmen expect to work full-time?

Exercise 4.11.9.

Construct a histogram or plot a line graph. Label the horizontal and vertical axes with words. Include numerical scaling.

## Solutions to Exercises

X = the number that expect to work full-time.

B (8,0.367)

0,1,2,3,4,5,6,7,8

2.94

1.36

0.9677

0.8547

# 4.12. Practice 3: Poisson Distribution*

## Student Learning Objectives

• The student will investigate the properties of a Poisson distribution.

## Given

On average, ten teens are killed in the U.S. in teen-driven autos per day (USA Today, 3/1/2005). As a result, states across the country are debating raising the driving age.

## Interpret the Data

Exercise 4.12.1.

In words, define the Random Variable X .

Exercise 4.12.2. (Go to Solution)

X ~______________

Exercise 4.12.3. (Go to Solution)

What values does X take on?

Exercise 4.12.4.

For the given values of X , fill in the corresponding probabilities.

Table 4.13.

| x | P(X=x) |
|---|---|
| 0 | |
| 4 | |
| 8 | |
| 10 | |
| 11 | |
| 15 | |

Exercise 4.12.5. (Go to Solution)

Is it likely that there will be no teens killed in the U.S. in teen-driven autos on any given day? Numerically, why?

Exercise 4.12.6. (Go to Solution)

Is it likely that there will be more than 20 teens killed in the U.S. in teen-driven autos on any given day? Numerically, why?

## Solutions to Exercises

P(10)

0,1,2,3,4,…

No

No

# 4.13. Practice 4: Geometric Distribution*

## Student Learning Objectives

• The student will investigate the properties of a geometric distribution.

## Given:

Use the information from the Binomial Distribution Practice. Suppose that you will randomly select one freshman from the study until you find one who expects to work full-time while in college. You are interested in the number of freshmen you must ask.
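A set-up like this one, where you keep sampling until the first success, can be sketched as follows. The success probability p = 0.2 is a hypothetical value chosen so that the exercise answers are not given away. The sketch assumes the standard geometric formulas P(X = x) = q^(x − 1) p and μ = 1/p, and the function name is illustrative.

```python
def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

p = 0.2                                                      # hypothetical success probability
p_fewer_than_3 = sum(geometric_pmf(x, p) for x in (1, 2))    # P(X < 3)
mean_trials = 1 / p                                          # expected number of trials

print(round(p_fewer_than_3, 4))  # 0.36
print(mean_trials)               # 5.0
```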

## Interpret the Data

Exercise 4.13.1.

In words, define the Random Variable X .

Exercise 4.13.2. (Go to Solution)

X  ~

Exercise 4.13.3. (Go to Solution)

What values does X take on?

Exercise 4.13.4.

Construct the probability distribution function (PDF) for X . Stop at X = 6.

Table 4.14.

| x | P(X=x) |
|---|---|
| 0 | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |

Exercise 4.13.5. (Go to Solution)

On average( μ ), how many freshmen would you expect to have to ask until you found one who expects to work full-time while in college?

Exercise 4.13.6. (Go to Solution)

What is the probability that you will need to ask fewer than 3 freshmen?

Exercise 4.13.7.

Construct a histogram or plot a line graph. Label the horizontal and vertical axes with words. Include numerical scaling.

## Solutions to Exercises

G(0.367)

1, 2, 3, …

2.72

0.5993

# 4.14. Practice 5: Hypergeometric Distribution*

## Student Learning Objectives

• The student will investigate the properties of a hypergeometric distribution.

## Given

Suppose that a group of statistics students is divided into two groups: business majors and non-business majors. There are 16 business majors in the group and 7 non-business majors in the group. A random sample of 9 students is taken. We are interested in the number of business majors in the sample.

## Interpret the Data

Exercise 4.14.1.

In words, define the Random Variable X .

Exercise 4.14.2. (Go to Solution)

X  ~

Exercise 4.14.3. (Go to Solution)

What values does X take on?

Exercise 4.14.4.

Construct the probability distribution function (PDF) for X .

Table 4.15.
x P(X=x)

Exercise 4.14.5. (Go to Solution)

On average( μ ), how many would you expect to be business majors?

## Solutions to Exercises

H(16,7,9)

2,3,4,5,6,7,8,9

6.26

# 4.15. Homework*

Exercise 4.15.1. (Go to Solution)

1. Complete the PDF and answer the questions.

| x | P(X = x) | x ⋅ P(X = x) |
|---|---|---|
| 0 | 0.3 | |
| 1 | 0.2 | |
| 2 | | |
| 3 | 0.4 | |

 a. Find the probability that X = 2. b. Find the expected value.

Exercise 4.15.2.

Suppose that you are offered the following “deal.” You roll a die. If you roll a 6, you win \$10. If you roll a 4 or 5, you win \$5. If you roll a 1, 2, or 3, you pay \$6.

 a. What are you ultimately interested in here (the value of the roll or the money you win)? b. In words, define the Random Variable X . c. List the values that X may take on. d. Construct a PDF. e. Over the long run of playing this game, what are your expected average winnings per game? f. Based on numerical values, should you take the deal? Explain your decision in complete sentences.

Exercise 4.15.3. (Go to Solution)

A venture capitalist, willing to invest \$1,000,000, has three investments to choose from. The first investment, a software company, has a 10% chance of returning \$5,000,000 profit, a 30% chance of returning \$1,000,000 profit, and a 60% chance of losing the million dollars. The second company, a hardware company, has a 20% chance of returning \$3,000,000 profit, a 40% chance of returning \$1,000,000 profit, and a 40% chance of losing the million dollars. The third company, a biotech firm, has a 10% chance of returning \$6,000,000 profit, a 70% of no profit or loss, and a 20% chance of losing the million dollars.

 a. Construct a PDF for each investment. b. Find the expected value for each investment. c. Which is the safest investment? Why do you think so? d. Which is the riskiest investment? Why do you think so? e. Which investment has the highest expected return, on average?

Exercise 4.15.4.

A theater group holds a fund-raiser. It sells 100 raffle tickets for \$5 apiece. Suppose you purchase 4 tickets. The prize is 2 passes to a Broadway show, worth a total of \$150.

 a. What are you interested in here? b. In words, define the Random Variable X . c. List the values that X may take on. d. Construct a PDF. e. If this fund-raiser is repeated often and you always purchase 4 tickets, what would be your expected average winnings per game?

Exercise 4.15.5. (Go to Solution)

Suppose that 20,000 married adults in the United States were randomly surveyed as to the number of children they have. The results are compiled and are used as theoretical probabilities. Let X = the number of children.

| x | P(X = x) | x ⋅ P(X = x) |
|---|---|---|
| 0 | 0.10 | |
| 1 | 0.20 | |
| 2 | 0.30 | |
| 3 | | |
| 4 | 0.10 | |
| 5 | 0.05 | |
| 6 (or more) | 0.05 | |

 a. Find the probability that a married adult has 3 children. b. In words, what does the expected value in this example represent? c. Find the expected value. d. Is it more likely that a married adult will have 2 – 3 children or 4 – 6 children? How do you know?

Exercise 4.15.6.

Suppose that the PDF for the number of years it takes to earn a Bachelor of Science (B.S.) degree is given below.

| x | P(X = x) |
|---|---|
| 3 | 0.05 |
| 4 | 0.40 |
| 5 | 0.30 |
| 6 | 0.15 |
| 7 | 0.10 |

 a. In words, define the Random Variable X . b. What does it mean that the values 0, 1, and 2 are not included for X on the PDF? c. On average, how many years do you expect it to take for an individual to earn a B.S.?

## For each problem:

 a. In words, define the Random Variable X . b. List the values that X may take on. c. Give the distribution of X . X ~

Then, answer the questions specific to each individual problem.

Exercise 4.15.7. (Go to Solution)

Six different colored dice are rolled. Of interest is the number of dice that show a “1.”

d. On average, how many dice would you expect to show a “1”? e. Find the probability that all six dice show a “1.” f. Is it more likely that 3 or that 4 dice will show a “1”? Justify your answer numerically.

Exercise 4.15.8.

According to a 2003 publication by Waits and Lewis (source: http://nces.ed.gov/pubs2003/2003017.pdf ), by the end of 2002, 92% of U.S. public two-year colleges offered distance learning courses. Suppose you randomly pick 13 U.S. public two-year colleges. We are interested in the number that offer distance learning courses.

d. On average, how many schools would you expect to offer such courses? e. Find the probability that at most 6 offer such courses. f. Is it more likely that 0 or that 13 will offer such courses? Justify your answer numerically and answer in a complete sentence.

Exercise 4.15.9. (Go to Solution)

A school newspaper reporter decides to randomly survey 12 students to see if they will attend Tet festivities this year. Based on past years, she knows that 18% of students attend Tet festivities. We are interested in the number of students who will attend the festivities.

 d. How many of the 12 students do we expect to attend the festivities? e. Find the probability that at most 4 students will attend. f. Find the probability that more than 2 students will attend.

Exercise 4.15.10.

 d. How many are expected to attend their graduation? e. Find the probability that 17 or 18 attend. f. Based on numerical values, would you be surprised if all 22 attended graduation? Justify your answer numerically.

Exercise 4.15.11. (Go to Solution)

At The Fencing Center, 60% of the fencers use the foil as their main weapon. We randomly survey 25 fencers at The Fencing Center. We are interested in the number who do not use the foil as their main weapon.

 d. How many are expected to not use the foil as their main weapon? e. Find the probability that six do not use the foil as their main weapon. f. Based on numerical values, would you be surprised if all 25 did not use foil as their main weapon? Justify your answer numerically.

Exercise 4.15.12.

Approximately 8% of students at a local high school participate in after-school sports all four years of high school. A group of 60 seniors is randomly chosen. Of interest is the number that participated in after-school sports all four years of high school.

 d. How many seniors are expected to have participated in after-school sports all four years of high school? e. Based on numerical values, would you be surprised if none of the seniors participated in after-school sports all four years of high school? Justify your answer numerically. f. Based upon numerical values, is it more likely that 4 or that 5 of the seniors participated in after-school sports all four years of high school? Justify your answer numerically.

Exercise 4.15.13. (Go to Solution)

The chance of having an extra fortune in a fortune cookie is about 3%. Given a bag of 144 fortune cookies, we are interested in the number of cookies with an extra fortune. Two distributions may be used to solve this problem. Use one distribution to solve the problem.

 d. How many cookies do we expect to have an extra fortune? e. Find the probability that none of the cookies have an extra fortune. f. Find the probability that more than 3 have an extra fortune. g. As n increases, what happens involving the probabilities using the two distributions? Explain in complete sentences.

Exercise 4.15.14.

There are two games played for Chinese New Year and Vietnamese New Year. They are almost identical. In the Chinese version, fair dice with numbers 1, 2, 3, 4, 5, and 6 are used, along with a board with those numbers. In the Vietnamese version, fair dice with pictures of a gourd, fish, rooster, crab, crayfish, and deer are used. The board has those six objects on it, also. We will play with bets being \$1. The player places a bet on a number or object. The “house” rolls three dice. If none of the dice show the number or object that was bet, the house keeps the \$1 bet. If one of the dice shows the number or object bet (and the other two do not show it), the player gets back his \$1 bet, plus \$1 profit. If two of the dice show the number or object bet (and the third die does not show it), the player gets back his \$1 bet, plus \$2 profit. If all three dice show the number or object bet, the player gets back his \$1 bet, plus \$3 profit.

Let X = number of matches and Y = profit per game.

 d. List the values that Y may take on. Then, construct one PDF table that includes both X & Y and their probabilities. e. Calculate the average expected matches over the long run of playing this game for the player. f. Calculate the average expected earnings over the long run of playing this game for the player. g. Determine who has the advantage, the player or the house.

Exercise 4.15.15. (Go to Solution)

According to the South Carolina Department of Mental Health web site, for every 200 U.S. women, the average number who suffer from anorexia is one ( http://www.state.sc.us/dmh/anorexia/statistics.htm ). Out of a randomly chosen group of 600 U.S. women:

 d. How many are expected to suffer from anorexia? e. Find the probability that no one suffers from anorexia. f. Find the probability that more than four suffer from anorexia.

Exercise 4.15.16.

The average number of children of middle-aged Japanese couples is 2.09 (Source: The Yomiuri Shimbun, June 28, 2006). Suppose that one middle-aged Japanese couple is randomly chosen.

 d. Find the probability that they have no children. e. Find the probability that they have fewer children than the Japanese average. f. Find the probability that they have more children than the Japanese average .

Exercise 4.15.17. (Go to Solution)

The average number of children per Spanish couple was 1.34 in 2005. Suppose that one Spanish couple is randomly chosen. (Source: http://www.typicallyspanish.com/news/publish/article_4897.shtml , June 16, 2006).

 d. Find the probability that they have no children. e. Find the probability that they have fewer children than the Spanish average. f. Find the probability that they have more children than the Spanish average .

Exercise 4.15.18.

Fertile (female) cats produce an average of 3 litters per year. (Source: The Humane Society of the United States). Suppose that one fertile, female cat is randomly chosen. In one year, find the probability she produces:

 d. No litters. e. At least 2 litters. f. Exactly 3 litters.

Exercise 4.15.19. (Go to Solution)

A consumer looking to buy a used red Miata car will call dealerships until she finds a dealership that carries the car. She estimates that the probability that any independent dealership will have the car is 28%. We are interested in the number of dealerships she must call.

 d. On average, how many dealerships would we expect her to have to call until she finds one that has the car? e. Find the probability that she must call at most 4 dealerships. f. Find the probability that she must call 3 or 4 dealerships.

Exercise 4.15.20.

Suppose that the probability that an adult in America will watch the Super Bowl is 40%. Each person is considered independent. We are interested in the number of adults in America we must survey until we find one who will watch the Super Bowl.

 d. How many adults in America do you expect to survey until you find one who will watch the Super Bowl? e. Find the probability that you must ask 7 people. f. Find the probability that you must ask 3 or 4 people.

Exercise 4.15.21. (Go to Solution)

A group of Martial Arts students is planning on participating in an upcoming demonstration. 6 are students of Tae Kwon Do; 7 are students of Shotokan Karate. Suppose that 8 students are randomly picked to be in the first demonstration. We are interested in the number of Shotokan Karate students in that first demonstration. Hint: Use the Hypergeometric distribution. Look in the Formulas section of 4: Discrete Distributions and in the Appendix Formulas.

d. How many Shotokan Karate students do we expect to be in that first demonstration? e. Find the probability that 4 students of Shotokan Karate are picked for the first demonstration. f. Suppose that we are interested in the Tae Kwon Do students that are picked for the first demonstration. Find the probability that all 6 students of Tae Kwon Do are picked for the first demonstration.

Exercise 4.15.22.

The chance of an IRS audit for a tax return with over \$25,000 in income is about 2% per year. We are interested in the expected number of audits a person with that income has in a 20-year period. Assume each year is independent.

 d. How many audits are expected in a 20 year period? e. Find the probability that a person is not audited at all. f. Find the probability that a person is audited more than twice.

Exercise 4.15.23. (Go to Solution)

Refer to the previous problem. Suppose that 100 people with tax returns over \$25,000 are randomly picked. We are interested in the number of people audited in 1 year. One way to solve this problem is by using the Binomial Distribution. Since n is large and p is small, another discrete distribution could be used to solve the following problems. Solve the following questions (d-f) using that distribution.

 d. How many are expected to be audited? e. Find the probability that no one was audited. f. Find the probability that more than 2 were audited.

Exercise 4.15.24.

Suppose that a technology task force is being formed to study technology awareness among instructors. Assume that 10 people will be randomly chosen to be on the committee from a group of 28 volunteers, 20 who are technically proficient and 8 who are not. We are interested in the number on the committee who are not technically proficient.

 d. How many instructors do you expect on the committee who are not technically proficient? e. Find the probability that at least 5 on the committee are not technically proficient. f. Find the probability that at most 3 on the committee are not technically proficient.

Exercise 4.15.25. (Go to Solution)

Refer back to Exercise 4.15.12. Solve this problem again, using a different, though still acceptable, distribution.

Exercise 4.15.26.

Suppose that 9 Massachusetts athletes are scheduled to appear at a charity benefit. The 9 are randomly chosen from 8 volunteers from the Boston Celtics and 4 volunteers from the New England Patriots. We are interested in the number of Patriots picked.

d. Is it more likely that there will be 2 Patriots or 3 Patriots picked? e. What is the probability that all of the volunteers will be from the Celtics? f. Is it more likely that more of the volunteers will be from the Patriots or from the Celtics? How do you know?

Exercise 4.15.27. (Go to Solution)

On average, Pierre, an amateur chef, drops 3 pieces of egg shell into every 2 batters of cake he makes. Suppose that you buy one of his cakes.

 d. On average, how many pieces of egg shell do you expect to be in the cake? e. What is the probability that there will not be any pieces of egg shell in the cake? f. Let’s say that you buy one of Pierre’s cakes each week for 6 weeks. What is the probability that there will not be any egg shell in any of the cakes? g. Based upon the average given for Pierre, is it possible for there to be 7 pieces of shell in the cake? Why?

Exercise 4.15.28.

It has been estimated that only about 30% of California residents have adequate earthquake supplies. Suppose we are interested in the number of California residents we must survey until we find a resident who does not have adequate earthquake supplies.

 d. What is the probability that we must survey just 1 or 2 residents until we find a California resident who does not have adequate earthquake supplies? e. What is the probability that we must survey at least 3 California residents until we find a California resident who does not have adequate earthquake supplies? f. How many California residents do you expect to need to survey until you find a California resident who does not have adequate earthquake supplies? g. How many California residents do you expect to need to survey until you find a California resident who does have adequate earthquake supplies?

Exercise 4.15.29. (Go to Solution)

Refer to the above problem. Suppose you randomly survey 11 California residents. We are interested in the number who have adequate earthquake supplies.

 d. What is the probability that at least 8 have adequate earthquake supplies? e. Is it more likely that none or that all of the residents surveyed will have adequate earthquake supplies? Why? f. How many residents do you expect will have adequate earthquake supplies?

The next 3 questions refer to the following: In one of its Spring catalogs, L.L. Bean® advertised footwear on 29 of its 192 catalog pages.

Exercise 4.15.30.

Suppose we randomly survey 20 pages. We are interested in the number of pages that advertise footwear. Each page may be picked at most once.

 d. How many pages do you expect to advertise footwear on them? e. Is it probable that all 20 will advertise footwear on them? Why or why not? f. What is the probability that less than 10 will advertise footwear on them?

Exercise 4.15.31. (Go to Solution)

Suppose we randomly survey 20 pages. We are interested in the number of pages that advertise footwear. This time, each page may be picked more than once.

 d. How many pages do you expect to advertise footwear on them? e. Is it probable that all 20 will advertise footwear on them? Why or why not? f. What is the probability that less than 10 will advertise footwear on them? g. Suppose that a page may be picked more than once. We are interested in the number of pages that we must randomly survey until we find one that has footwear advertised on it. Define the random variable X and give its distribution. h. Do you expect to survey more than 10 pages in order to find one that advertises footwear on it? Why? i. What is the probability that you only need to survey at most 3 pages in order to find one that advertises footwear on it? j. How many pages do you expect to need to survey in order to find one that advertises footwear?

Exercise 4.15.32.

Suppose that you roll a fair die until each face has appeared at least once. It does not matter in what order the numbers appear. Find the expected number of rolls you must make until each face has appeared at least once.

## Try these multiple choice problems.

For the next three problems: The probability that the San Jose Sharks will win any given game is 0.3694, based on their 13-year win history of 382 wins out of 1034 games played (as of a certain date). Their 2005 schedule for November contains 12 games. Let X = the number of games won in November 2005.

Exercise 4.15.33. (Go to Solution)

The expected number of wins for the month of November 2005 is:

 A. 1.67 B. 12 C. D. 4.43

Exercise 4.15.34. (Go to Solution)

What is the probability that the San Jose Sharks win 6 games in November?

 A. 0.1476 B. 0.2336 C. 0.7664 D. 0.8903

Exercise 4.15.35. (Go to Solution)

Find the probability that the San Jose Sharks win at least 5 games in November.

 A. 0.3694 B. 0.5266 C. 0.4734 D. 0.2305

For the next two questions: The average number of times per week that Mrs. Plum’s cats wake her up at night because they want to play is 10. We are interested in the number of times her cats wake her up each week.

Exercise 4.15.36. (Go to Solution)

In words, the random variable X  =

 A. The number of times Mrs. Plum’s cats wake her up each week B. The number of times Mrs. Plum’s cats wake her up each hour C. The number of times Mrs. Plum’s cats wake her up each night D. The number of times Mrs. Plum’s cats wake her up

Exercise 4.15.37. (Go to Solution)

Find the probability that her cats will wake her up no more than 5 times next week.

 A. 0.5000 B. 0.9329 C. 0.0378 D. 0.0671

Exercise 4.15.38. (Go to Solution)

People visiting video rental stores often rent more than one DVD at a time. The probability distribution for DVD rentals per customer at Video To Go is given below. There is a 5-DVD limit per customer at this store, so nobody ever rents more than 5 DVDs.

 X 0 1 2 3 4 5 P(X) 0.03 0.5 0.24 ? 0.07 0.04
 A. Describe the random variable X in words. B. Find the probability that a customer rents three DVDs. C. Find the probability that a customer rents at least 4 DVDs. Write your answer using proper notation. D. Find the probability that a customer rents at most 2 DVDs. Write your answer using proper notation.

Another shop, Entertainment Headquarters, rents DVDs and videogames. The probability distribution for DVD rentals per customer at this shop is given below. They also have a 5 DVD limit per customer.

| X | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| P(X) | 0.35 | 0.25 | 0.2 | 0.1 | 0.05 | 0.05 |

 E. At which store is the expected number of DVDs rented per customer higher? F. If Video to Go estimates that they will have 300 customers next week, how many DVDs do they expect to rent next week? Answer in sentence form. G. If Video to Go expects 300 customers next week and Entertainment HQ projects that they will have 420 customers, for which store is the expected number of DVD rentals for next week higher? Explain. H. Which of the two video stores experiences more variation in the number of DVD rentals per customer? How do you know that?

Exercise 4.15.39. (Go to Solution)

A game involves selecting a card from a deck of cards and tossing a coin. The deck has 52 cards, and 12 of them are “face cards” (Jack, Queen, or King). The coin is fair and is equally likely to land on Heads or Tails.

• If the card is a face card and the coin lands on Heads, you win \$6

• If the card is a face card and the coin lands on Tails, you win \$2

• If the card is not a face card, you lose \$2, no matter what the coin shows.

 A. Find the expected value for this game (expected net gain or loss). B. Explain what your calculations indicate about your long-term average profits and losses on this game. C. Should you play this game to win money?

Exercise 4.15.40. (Go to Solution)

You buy a ticket for a lottery that costs \$10 per ticket. There are only 100 tickets available to be sold in this lottery. In this lottery there is one \$500 prize, 2 \$100 prizes and 4 \$25 prizes. Find your expected gain or loss.

Exercise 4.15.41. (Go to Solution)

A student takes a 10 question true-false quiz, but did not study and randomly guesses each answer. Find the probability that the student passes the quiz with a grade of at least 70% of the questions correct.

Exercise 4.15.42. (Go to Solution)

A student takes a 32 question multiple choice exam, but did not study and randomly guesses each answer. Each question has 3 possible choices for the answer. Find the probability that the student guesses more than 75% of the questions correctly.

Exercise 4.15.43. (Go to Solution)

Suppose that you are performing the probability experiment of rolling one die. Let F be the event of rolling a “4” or a “5”. You are interested in how many times you need to roll the die in order to obtain the first “4 or 5” as the outcome.

• p = probability of success (event F occurs)

• q = probability of failure (event F does not occur)

 A. Write the description of the random variable X. What are the values that X can take on? Find the values of p and q. What is the appropriate probability distribution for X? B. Find the probability that the first occurrence of event F (“4” or “5”) is on the first or second trial. C. Find the probability that more than 4 trials are needed to obtain the first “4” or “5” when rolling the die.

**Exercises 38 - 43 contributed by Roberta Bloom

## Solutions to Exercises

 a. 0.1 b. 1.6

 b. \$200,000;\$600,000;\$400,000 c. third investment d. first investment e. second investment

 a. 0.2 c. 2.35 d. 2-3 children

 a. X = the number of dice that show a 1 b. 0,1,2,3,4,5,6 c. X ~ d. 1 e. 0.00002 f. 3 dice

 a. X = the number of students that will attend Tet. b. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 c. X ~B(12,0.18) d. 2.16 e. 0.9511 f. 0.3702

 a. X = the number of fencers that do not use foil as their main weapon b. 0, 1, 2, 3,… 25 c. X ~B(25,0.40) d. 10 e. 0.0442 f. Yes

 a. X = the number of fortune cookies that have an extra fortune b. 0, 1, 2, 3,… 144 c. X ~B(144, 0.03) or P(4.32) d. 4.32 e. 0.0124 or 0.0133 f. 0.6300 or 0.6264

 a. X = the number of women that suffer from anorexia b. 0, 1, 2, 3,… 600 (can leave off 600) c. X ~P(3) d. 3 e. 0.0498 f. 0.1847

 a. X = the number of children for a Spanish couple b. 0, 1, 2, 3,… c. X ~P(1.34) d. 0.2618 e. 0.6127 f. 0.3873

 a. X = the number of dealers she calls until she finds one with a used red Miata b. 0, 1, 2, 3,… c. X ~G(0.28) d. 3.57 e. 0.7313 f. 0.2497

 d. 4.31 e. 0.4079 f. 0.0163

 d. 2 e. 0.1353 f. 0.3233

 a. X = the number of seniors that participated in after-school sports all 4 years of high school b. 0, 1, 2, 3,… 60 c. X ~P(4.8) d. 4.8 e. Yes f. 4

 a. X = the number of shell pieces in one cake b. 0, 1, 2, 3,… c. X ~P(1.5) d. 1.5 e. 0.2231 f. 0.0001 g. Yes

 d. 0.0043 e. none f. 3.3

 d. 3.02 e. No f. 0.9997 h. 0.2291 i. 0.3881 j. 6.6207 pages

D: 4.43

A: 0.1476

C: 0.4734

A: The number of times Mrs. Plum’s cats wake her up each week

D: 0.0671
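The multiple-choice answers above for the Sharks and Mrs. Plum problems can be checked numerically. The following is a minimal sketch, assuming X ~ B(12, 0.3694) for the Sharks and X ~ P(10) for the cats; the helper names `binom_pmf`, `binom_cdf` and `poisson_cdf` are illustrative and not part of the text.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ P(mu)."""
    return sum(exp(-mu) * mu**i / factorial(i) for i in range(k + 1))

n, p = 12, 0.3694                        # 12 November games, win probability 0.3694
expected_wins = n * p                    # the mean of B(n, p) is n * p
p_exactly_6 = binom_pmf(6, n, p)
p_at_least_5 = 1 - binom_cdf(4, n, p)    # complement of "at most 4"
p_cats_at_most_5 = poisson_cdf(5, 10)    # average of 10 wake-ups per week

print(round(expected_wins, 2), round(p_exactly_6, 4),
      round(p_at_least_5, 4), round(p_cats_at_most_5, 4))
```

The four printed values should agree with answers D, A, C and D above, up to rounding.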

Contact your instructor.

The variable of interest is X = net gain or loss, in dollars

The face cards are J, Q, K (Jack, Queen, King). There are (3)(4) = 12 face cards and 52 – 12 = 40 cards that are not face cards.

We first need to construct the probability distribution for X. We use the card and coin events to determine the probability for each outcome, but we use the monetary value of X to determine the expected value.

| Card Event | X = net gain or loss (\$) | P(X) |
|---|---|---|
| Face Card and Heads | 6 | (12/52)(1/2) = 6/52 |
| Face Card and Tails | 2 | (12/52)(1/2) = 6/52 |
| (Not Face Card) and (H or T) | –2 | (40/52)(1) = 40/52 |

• Expected value = (6)(6/52) + (2)(6/52) + (–2)(40/52) = –32/52

• Expected value = –\$0.62, rounded to the nearest cent

• If you play this game repeatedly, over a large number of games, you would expect to lose 62 cents per game, on average.

• You should not play this game to win money because the expected value indicates an expected average loss.
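The expected value computed above can be reproduced with exact fractions. This is only a sketch of the same arithmetic; the list name `outcomes` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction

# (net gain in dollars, probability) for each outcome of the card-and-coin game
outcomes = [
    (6, Fraction(12, 52) * Fraction(1, 2)),   # face card and Heads
    (2, Fraction(12, 52) * Fraction(1, 2)),   # face card and Tails
    (-2, Fraction(40, 52)),                   # not a face card, either coin result
]

# A valid probability distribution must total 1
total_probability = sum(p for _, p in outcomes)

# Expected value: sum of (value times probability) over all outcomes
expected_value = sum(x * p for x, p in outcomes)

print(total_probability, expected_value, round(float(expected_value), 2))
```

Working in fractions avoids rounding until the last step, so –32/52 appears in lowest terms as –8/13 before it is rounded to –0.62.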

Start by writing the probability distribution. X is net gain or loss = prize (if any) less \$10 cost of ticket

| X = net gain or loss (\$) | P(X) |
|---|---|
| \$500 – \$10 = \$490 | 1/100 |
| \$100 – \$10 = \$90 | 2/100 |
| \$25 – \$10 = \$15 | 4/100 |
| \$0 – \$10 = –\$10 | 93/100 |

Expected Value = (490)(1/100) + (90)(2/100) + (15)(4/100) + (–10) (93/100) = –\$2. There is an expected loss of \$2 per ticket, on average.
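The same calculation can be sketched starting from the ticket counts rather than from the finished table. The variable names below are illustrative assumptions.

```python
from fractions import Fraction

TICKETS = 100
COST = 10
prizes = {500: 1, 100: 2, 25: 4}              # prize amount -> number of winning tickets
prizes[0] = TICKETS - sum(prizes.values())    # the remaining tickets win nothing

# Net gain = prize minus ticket cost, weighted by the share of tickets with that prize
expected_gain = sum(Fraction(count, TICKETS) * (prize - COST)
                    for prize, count in prizes.items())

print(prizes[0], expected_gain)
```

Deriving the 93 losing tickets in code makes it harder to forget the most likely outcome, which is the one that drives the expected loss.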

• X = number of questions answered correctly

• X~B(10, 0.5)

• We are interested in AT LEAST 70% of 10 questions correct. 70% of 10 is 7. We want to find the probability that X is greater than or equal to 7. The event “at least 7” is the complement of “less than or equal to 6”.

• Using your calculator’s distribution menu: 1 – binomcdf(10, .5, 6) gives 0.171875

• The probability of getting at least 70% of the 10 questions correct when randomly guessing is approximately 0.172
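For readers without a calculator distribution menu, the complement step above can be sketched in Python. The function below is a hypothetical stand-in written to mirror the argument order of the calculator's binomcdf; it is not a library function.

```python
from math import comb

def binomcdf(n, p, k):
    """P(X <= k) for X ~ B(n, p); argument order mirrors the calculator call."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# P(X >= 7) = 1 - P(X <= 6) for X ~ B(10, 0.5)
p_pass = 1 - binomcdf(10, 0.5, 6)
print(p_pass)
```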

• X = number of questions answered correctly

• X~B(32, 1/3)

• We are interested in MORE THAN 75% of 32 questions correct. 75% of 32 is 24. We want to find P(X>24). The event “more than 24” is the complement of “less than or equal to 24”.

• P(X>24) = 0.00000026761

• The probability of getting more than 75% of the 32 questions correct when randomly guessing is very small and practically zero.
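Because this probability is so small, floating-point rounding is a reasonable concern. One way to check the value is to add the upper-tail terms P(X = 25) through P(X = 32) directly with exact rational arithmetic, as in this sketch (variable names are illustrative).

```python
from fractions import Fraction
from math import comb

n, p = 32, Fraction(1, 3)

# P(X > 24) = P(X = 25) + ... + P(X = 32), summed exactly
p_more_than_24 = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                     for k in range(25, n + 1))

print(float(p_more_than_24))
```

Summing the eight tail terms directly gives the same probability as 1 – P(X ≤ 24), without subtracting two nearly equal numbers.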

Contact your instructor.

# 4.16. Review*

The next two questions refer to the following:

A recent poll concerning credit cards found that 35 percent of respondents use a credit card that gives them a mile of air travel for every dollar they charge. Thirty percent of the respondents charge more than \$2000 per month. Of those respondents who charge more than \$2000, 80 percent use a credit card that gives them a mile of air travel for every dollar they charge.

Exercise 4.16.1. (Go to Solution)

What is the probability that a randomly selected respondent charges more than \$2000 per month AND uses a credit card that gives them a mile of air travel for every dollar they charge?

 A. (0.30)(0.35) B. (0.80)(0.35) C. (0.80)(0.30) D. (0.80)

Exercise 4.16.2. (Go to Solution)

Based upon the above information, are using a credit card that gives a mile of air travel for each dollar spent AND charging more than \$2000 per month independent events?

 A. Yes B. No, and they are not mutually exclusive either C. No, but they are mutually exclusive D. Not enough information given to determine the answer

Exercise 4.16.3. (Go to Solution)

A sociologist wants to know the opinions of employed adult women about government funding for day care. She obtains a list of 520 members of a local business and professional women’s club and mails a questionnaire to 100 of these women selected at random. 68 questionnaires are returned. What is the population in this study?

 A. All employed adult women B. All the members of a local business and professional women’s club C. The 100 women who received the questionnaire D. All employed women with children

The next two questions refer to the following: An article from The San Jose Mercury News was concerned with the racial mix of the 1500 students at Prospect High School in Saratoga, CA. The table summarizes the results. (Male and female values are approximate.)

Table 4.23.

| Gender / Ethnic Group | White | Asian | Hispanic | Black | American Indian |
|---|---|---|---|---|---|
| Male | 400 | 168 | 115 | 35 | 16 |
| Female | 440 | 132 | 140 | 40 | 14 |

Exercise 4.16.4. (Go to Solution)

Find the probability that a student is Asian or Male.

Exercise 4.16.5. (Go to Solution)

Find the probability that a student is Black given that the student is Female.

Exercise 4.16.6. (Go to Solution)

A sample of pounds lost, in a certain month, by individual members of a weight reducing clinic produced the following statistics:

• Mean = 5 lbs.

• Median = 4.5 lbs.

• Mode = 4 lbs.

• Standard deviation = 3.8 lbs.

• First quartile = 2 lbs.

• Third quartile = 8.5 lbs.

The correct statement is:

 A. One fourth of the members lost exactly 2 pounds. B. The middle fifty percent of the members lost from 2 to 8.5 lbs. C. Most people lost 3.5 to 4.5 lbs. D. All of the choices above are correct.

Exercise 4.16.7. (Go to Solution)

What does it mean when a data set has a standard deviation equal to zero?

 A. All values of the data appear with the same frequency. B. The mean of the data is also zero. C. All of the data have the same value. D. There are no data to begin with.

Exercise 4.16.8. (Go to Solution)

The statement that best describes the illustration below is:

Figure 4.1.

 A. The mean is equal to the median. B. There is no first quartile. C. The lowest data value is the median. D. The median equals

Exercise 4.16.9. (Go to Solution)

According to a recent article (San Jose Mercury News) the average number of babies born with significant hearing loss (deafness) is approximately 2 per 1000 babies in a healthy baby nursery. The number climbs to an average of 30 per 1000 babies in an intensive care nursery.

Suppose that 1000 babies from healthy baby nurseries were surveyed. Find the probability that exactly 2 babies were born deaf.

Exercise 4.16.10. (Go to Solution)

A “friend” offers you the following “deal.” For a \$10 fee, you may pick an envelope from a box containing 100 seemingly identical envelopes. However, each envelope contains a coupon for a free gift.

• 10 of the coupons are for a free gift worth \$6.

• 80 of the coupons are for a free gift worth \$8.

• 6 of the coupons are for a free gift worth \$12.

• 4 of the coupons are for a free gift worth \$40.

Based upon the financial gain or loss over the long run, should you play the game?

 A. Yes, I expect to come out ahead in money. B. No, I expect to come out behind in money. C. It doesn’t matter. I expect to break even.

The next four questions refer to the following: Recently, a nurse commented that when a patient calls the medical advice line claiming to have the flu, the chance that he/she truly has the flu (and not just a nasty cold) is only about 4%. Of the next 25 patients calling in claiming to have the flu, we are interested in how many actually have the flu.

Exercise 4.16.11. (Go to Solution)

Define the Random Variable and list its possible values.

Exercise 4.16.12. (Go to Solution)

State the distribution of X.

Exercise 4.16.13. (Go to Solution)

Find the probability that at least 4 of the 25 patients actually have the flu.

Exercise 4.16.14. (Go to Solution)

On average, for every 25 patients calling in, how many do you expect to have the flu?

The next two questions refer to the following: Different types of writing can sometimes be distinguished by the number of letters in the words used. A student interested in this fact wants to study the number of letters of words used by Tom Clancy in his novels. She opens a Clancy novel at random and records the number of letters of the first 250 words on the page.

Exercise 4.16.15. (Go to Solution)

What kind of data was collected?

 A. qualitative B. quantitative - continuous C. quantitative – discrete

Exercise 4.16.16. (Go to Solution)

What is the population under study?

## Solutions to Exercises

C

B
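The reasoning behind the first two answers (C and B) can be sketched with the multiplication rule and the definition of independence. The variable names are illustrative.

```python
p_miles = 0.35              # P(uses an air-miles card)
p_over_2000 = 0.30          # P(charges more than $2000 per month)
p_miles_given_over = 0.80   # P(air-miles card | charges more than $2000)

# Multiplication rule: P(A AND B) = P(A | B) * P(B)
p_both = p_miles_given_over * p_over_2000

# The events are independent only if conditioning on B leaves P(A) unchanged
independent = abs(p_miles_given_over - p_miles) < 1e-9

# The events are mutually exclusive only if they can never occur together
mutually_exclusive = p_both == 0

print(round(p_both, 2), independent, mutually_exclusive)
```

Since P(A | B) = 0.80 differs from P(A) = 0.35, the events are not independent, and since P(A AND B) is positive, they are not mutually exclusive either.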

A

0.5773

0.0522
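The two probabilities above come from the counts in Table 4.23. A short sketch of the counting, with the table entered as a nested dictionary (an illustrative layout):

```python
# Counts from Table 4.23 (outer key: gender, inner key: ethnic group)
counts = {
    "Male":   {"White": 400, "Asian": 168, "Hispanic": 115, "Black": 35, "American Indian": 16},
    "Female": {"White": 440, "Asian": 132, "Hispanic": 140, "Black": 40, "American Indian": 14},
}
total = sum(sum(row.values()) for row in counts.values())

# P(Asian OR Male): every male, plus the Asian females not already counted
p_asian_or_male = (sum(counts["Male"].values()) + counts["Female"]["Asian"]) / total

# P(Black | Female): restrict the sample space to female students
p_black_given_female = counts["Female"]["Black"] / sum(counts["Female"].values())

print(total, round(p_asian_or_male, 4), round(p_black_given_female, 4))
```

Counting all males and then adding only the Asian females avoids double counting the Asian males, which is the same as P(Asian) + P(Male) – P(Asian AND Male).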

B

C

C

0.2709
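The value 0.2709 is consistent with treating the 1000 babies as a binomial experiment with p = 2/1000. A Poisson model with the same mean gives a very close value, which this sketch compares (variable names are illustrative, and the choice of model is an assumption, since the text does not say which was used).

```python
from math import comb, exp, factorial

n, rate = 1000, 2 / 1000      # 1000 babies, about 2 per 1000 with hearing loss

# Binomial model: each baby independently has probability `rate`
p_binomial = comb(n, 2) * rate**2 * (1 - rate)**(n - 2)

# Poisson model with the same mean, mu = n * rate = 2
mu = n * rate
p_poisson = exp(-mu) * mu**2 / factorial(2)

print(round(p_binomial, 4), round(p_poisson, 4))
```

With n large and p small, the two models agree to about three decimal places here.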

B

X = the number of patients calling in claiming to have the flu, who actually have the flu. X = 0, 1, 2, …25

X ~ B(25, 0.04)

0.0165

1

C

All words used by Tom Clancy in his novels

# 4.17. Lab 1: Discrete Distribution (Playing Card Experiment)*

Class Time:

Names:

## Student Learning Outcomes:

• The student will compare empirical data and a theoretical distribution to determine if an everyday experiment fits a discrete distribution.

• The student will demonstrate an understanding of long-term probabilities.

## Supplies:

• One full deck of playing cards

## Procedure

The experiment procedure is to pick one card from a deck of shuffled cards.

1. The theoretical probability of picking a diamond from a deck is: _________

2. Shuffle a deck of cards.

3. Pick one card from it.

4. Record whether it was a diamond or not a diamond.

5. Put the card back and reshuffle.

6. Do this a total of 10 times.

7. Record the number of diamonds picked.

8. Let X = number of diamonds. Theoretically, X ~ B(_____,_____)

## Organize the Data

1. Record the number of diamonds picked for your class in the chart below. Then calculate the relative frequency.

Table 4.24.

| X | Frequency | Relative Frequency |
|---|---|---|
| 0 | __________ | __________ |
| 1 | __________ | __________ |
| 2 | __________ | __________ |
| 3 | __________ | __________ |
| 4 | __________ | __________ |
| 5 | __________ | __________ |
| 6 | __________ | __________ |
| 7 | __________ | __________ |
| 8 | __________ | __________ |
| 9 | __________ | __________ |
| 10 | __________ | __________ |

2. Calculate the following:

 a. x̄ = b. s =

3. Construct a histogram of the empirical data.
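Step 2 above asks for the sample mean and sample standard deviation of data stored as a frequency table. The sketch below shows one way to compute them; the frequencies are HYPOTHETICAL and stand in for your class data.

```python
from math import sqrt

# HYPOTHETICAL class results: x (diamonds in 10 draws) -> number of students with that count
freq = {0: 2, 1: 6, 2: 9, 3: 8, 4: 4, 5: 1}

n = sum(freq.values())
rel_freq = {x: f / n for x, f in freq.items()}    # relative frequency column

# Weighted mean: each x counts once per student who observed it
x_bar = sum(x * f for x, f in freq.items()) / n

# Sample standard deviation: divide the squared deviations by n - 1
s = sqrt(sum(f * (x - x_bar) ** 2 for x, f in freq.items()) / (n - 1))

print(n, round(x_bar, 4), round(s, 4))
```

Replace the `freq` dictionary with the counts recorded in your own table before using the results.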

Figure 4.2.

## Theoretical Distribution

1. Build the theoretical PDF chart for X based on the distribution in the Procedure section above.

| x | P(X = x) |
|---|---|
| 0 | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
| 8 | |
| 9 | |
| 10 | |

2. Calculate the following:

 a. μ = ____________ b. σ = ____________

3. Construct a histogram of the theoretical distribution.
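The theoretical PDF chart, μ and σ in steps 1 and 2 above can be generated for any binomial distribution. The sketch below uses PLACEHOLDER parameters (n = 4, p = 0.5) so that it does not give away the lab's blanks; the function name `binomial_pdf_table` is illustrative.

```python
from math import comb, sqrt

def binomial_pdf_table(n, p):
    """Return {x: P(X = x)} for X ~ B(n, p)."""
    return {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# PLACEHOLDER parameters, not the lab's answer: substitute the n and p from the Procedure
n, p = 4, 0.5
pdf = binomial_pdf_table(n, p)

mu = sum(x * px for x, px in pdf.items())                         # equals n * p
sigma = sqrt(sum((x - mu) ** 2 * px for x, px in pdf.items()))    # equals sqrt(n * p * (1 - p))

for x, px in pdf.items():
    print(x, round(px, 4))
print(mu, sigma)
```

Computing μ and σ from the table and comparing them with n·p and the square root of n·p·(1 – p) is a quick consistency check on the chart.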

Figure 4.3.

## Using the Data

Calculate the following, rounding to 4 decimal places:

### Note

RF = relative frequency

Use the table from the section titled “Theoretical Distribution” here:

• P(X = 3) =

• P(1 < X < 4) =

• P(X ≥ 8) =

Use the data from the section titled “Organize the Data” here:

• RF(X = 3) =

• RF(1 < X < 4) =

• RF(X ≥ 8) =

## Discussion Questions

For questions 1. and 2., think about the shapes of the two graphs, the probabilities and the relative frequencies, the means, and the standard deviations.

1. Knowing that data vary, describe three similarities between the graphs and distributions of the theoretical and empirical distributions. Use complete sentences. (Note: These answers may vary and still be correct.)

2. Describe the three most significant differences between the graphs or distributions of the theoretical and empirical distributions. (Note: These answers may vary and still be correct.)

3. Using your answers from the two previous questions, does it appear that the data fit the theoretical distribution? In 1 - 3 complete sentences, explain why or why not.

4. Suppose that the experiment had been repeated 500 times. Which table (from “Organize the Data” or “Theoretical Distribution”) would you expect to change (and how would it change)? Why? Why wouldn’t the other table change?

# 4.18. Lab 2: Discrete Distribution (Lucky Dice Experiment)*

Class Time:

Names:

## Student Learning Outcomes:

• The student will compare empirical data and a theoretical distribution to determine if a Tet gambling game fits a discrete distribution.

• The student will demonstrate an understanding of long-term probabilities.

## Supplies:

• 1 game “Lucky Dice” or 3 regular dice

### Note

For a detailed game description, refer to Problem #14 in the Discrete Random Variables Homework.

### Note

Round relative frequencies and probabilities to four decimal places.

## The Procedure

1. The experiment procedure is to bet on one object. Then, roll 3 Lucky Dice and count the number of matches. The number of matches will decide your profit.

2. What is the theoretical probability of 1 die matching the object? _________

3. Choose one object to place a bet on. Roll the 3 Lucky Dice. Count the number of matches.

4. Let X = number of matches. Theoretically, X ~ B(______,______)

5. Let Y = profit per game.

## Organize the Data

In the chart below, fill in the Y value that corresponds to each X value. Next, record the number of matches picked for your class. Then, calculate the relative frequency.

1. Complete the table.

Table 4.26.

| x | y | Frequency | Relative Frequency |
|---|---|---|---|
| 0 | | | |
| 1 | | | |
| 2 | | | |
| 3 | | | |

2. Calculate the Following:

 a. x̄ = b. s_x = c. ȳ = d. s_y =

3. Explain what x̄ represents.

4. Explain what ȳ represents.

5. Based upon the experiment:

 a. What was the average profit per game? b. Did this represent an average win or loss per game? c. How do you know? Answer in complete sentences.

6. Construct a histogram of the empirical data

Figure 4.4.

## Theoretical Distribution

Build the theoretical PDF chart for X and Y based on the distribution from the section titled “The Procedure”.

1. Table 4.27.

| x | y | P(X = x) = P(Y = y) |
|---|---|---|
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | | |

2. Calculate the following

 a. μ_x = b. σ_x = c. μ_y =

3. Explain what μ_x represents.

4. Explain what μ_y represents.

5. Based upon theory:

 a. What was the expected profit per game? b. Did the expected profit represent an average win or loss per game? c. How do you know? Answer in complete sentences.

6. Construct a histogram of the theoretical distribution.
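Because the profit Y is determined by the number of matches X, the expected profit in steps 2 and 5 above is a probability-weighted sum of the y values. The sketch below uses HYPOTHETICAL probabilities and HYPOTHETICAL payoffs for illustration only; they are not the Lucky Dice values, which come from your own table.

```python
# HYPOTHETICAL values for illustration only; use your own table from this lab instead
pmf_x = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}   # x -> P(X = x)
profit = {0: -1, 1: 1, 2: 2, 3: 3}             # x -> y, the profit when X = x

# Expected number of matches
mu_x = sum(x * p for x, p in pmf_x.items())

# Y is a function of X, so P(Y = y) = P(X = x) for the matching row
mu_y = sum(profit[x] * p for x, p in pmf_x.items())

# The sign of the expected profit tells whether the long-run average is a win or a loss
verdict = "win" if mu_y > 0 else "loss" if mu_y < 0 else "break even"

print(round(mu_x, 4), round(mu_y, 4), verdict)
```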

Figure 4.5.

## Use the Data

Calculate the following (rounded to 4 decimal places):

### Note

RF = relative frequency

Use the data from the section titled “Theoretical Distribution” here:

1. P(X = 3) = ____________

2. P(0 < X < 3) = ____________

3. P(X ≥ 2) = ____________

Use the data from the section titled “Organize the Data” here:

1. RF(X = 3) = ____________

2. RF(0 < X < 3) = ____________

3. RF(X ≥ 2) = ____________

## Discussion Question

For questions 1. and 2., consider the graphs, the probabilities and relative frequencies, the means and the standard deviations.

1. Knowing that data vary, describe three similarities between the graphs and distributions of the theoretical and empirical distributions. Use complete sentences. (Note: these answers may vary and still be correct.)

2. Describe the three most significant differences between the graphs or distributions of the theoretical and empirical distributions. (Note: these answers may vary and still be correct.)

3. Thinking about your answers to 1. and 2., does it appear that the data fit the theoretical distribution? In 1 - 3 complete sentences, explain why or why not.

4. Suppose that the experiment had been repeated 500 times. Which table (from “Organize the Data” or “Theoretical Distribution”) would you expect to change? Why? How might the table change?