Monday, June 2, 2014

Distributions Related To the Binomial Distribution

This is one of the following four articles on the Binomial Distribution in Excel:

Overview of the Binomial Distribution in Excel 2010 and Excel 2013

Solving Problems With the Binomial Distribution in Excel 2010 and Excel 2013

Normal Approximation of the Binomial Distribution in Excel 2010 and Excel 2013

Distributions Related to the Binomial Distribution

Binomial Distribution

– Calculates the probability of k successes in n Bernoulli trials with replacement. Sampling with replacement means that each sample is placed back into the population before another sample is taken.

Hypergeometric Distribution

– Calculates the probability of k successes in n trials without replacement. Sampling without replacement means that each sample taken is not placed back into the population, so the probability of success changes from one trial to the next. This is not a binomial experiment because a binomial experiment requires that the probability of success be constant on every trial.

Multinomial distribution

– This is a generalization of the binomial distribution. Each multinomial trial has two or more possible outcomes. A binomial trial is a multinomial trial with only two possible outcomes. The multinomial distribution calculates the probability of a specific combination of outcomes occurring in n multinomial trials. Sampling is done with replacement during a multinomial experiment.

Negative Binomial Distribution

– Calculates the probability of a specific number of failures occurring before a specified number of successes. Each trial is a Bernoulli trial. Sampling is done with replacement during a negative binomial experiment. Like the binomial distribution, the probability of success on every trial is the same and the trials are independent, i.e., the outcome of one trial does not affect the outcome of any other trial.

Geometric Distribution

– Calculates the probability that a success will occur for the first time on the nth Bernoulli trial. This is a special case of the negative binomial distribution in which the specified number of successes is equal to one. Sampling is done with replacement during a geometric experiment. Like the binomial distribution, the probability of success on every trial is the same and the trials are independent, i.e., the outcome of one trial does not affect the outcome of any other trial.

More information about each of these distributions is presented below. Where available, the information about each distribution includes its PDF, CDF, Excel formulas, and a brief problem solved with the PDF of that distribution in Excel.

Binomial Distribution

Calculates the probability of k successes in n Bernoulli trials with replacement. Sampling with replacement means that each sample is placed back into the population before another sample is taken.

The binomial distribution’s PDF is given by the following:

f(k;n,p) = Pr(X = k) = C(n,k) * p^k * (1 - p)^(n-k)

for k = 0, 1, 2, …, n,

where

C(n,k) = n! / (k! * (n - k)!)

is the binomial coefficient and is sometimes referred to as “n choose k.”

f(k;n,p) = Pr(X = k) is the probability that X, the number of successes, equals k for n independent, successive Bernoulli trials each having the probability p of success.

The Excel formula to calculate the binomial distribution’s PDF is the following:

f(k;n,p) = Pr(X = k) = BINOM.DIST(k, n, p , FALSE)

FALSE indicates that this Excel formula is calculating the binomial distribution’s PDF and not the CDF for this k, n, and p. The final argument answers the question of whether the calculation is cumulative (TRUE, which is the case when calculating the CDF – Cumulative Distribution Function) or not cumulative (FALSE, which is the case when calculating the PDF – Probability Density Function). For a discrete distribution such as the binomial, the PDF is more precisely called a probability mass function.

Excel 2010 and later provide the function BINOM.DIST, which is equivalent to the BINOMDIST function used by earlier versions of Excel.

The binomial distribution’s CDF is as follows:

F(k;n,p) = Pr(X ≤ k). This is the probability that X, the number of successes in n Bernoulli trials each having the probability p of a successful outcome, is at most k.

The binomial distribution’s CDF is given by the following:

F(k;n,p) = Pr(X ≤ k) = Σ C(n,i) * p^i * (1 - p)^(n-i), summed over i = 0, 1, …, ⌊k⌋

where ⌊k⌋ is the “floor” under k, i.e., the greatest integer less than or equal to k.

The Excel formula to calculate the binomial distribution’s CDF is the following:

F(k;n,p) = Pr(X ≤ k) = BINOM.DIST(k, n, p , TRUE)

A basic problem solved by the binomial distribution’s PDF in Excel is as follows:

What is the probability of getting exactly 4 “fives” if a single die is rolled 7 times?

The probability of getting a five on any roll of a die is one sixth.

p = 1/6 = 0.1667

n = 7 trials (rolls of the die)

k = 4

This requires the PDF to solve for Pr(X = k)

f(k;n,p) = Pr(X = k) = BINOM.DIST(k, n, p , FALSE)

f(4;7,0.1667) = BINOM.DIST(4,7,0.1667,FALSE) = 0.0156

There is a 1.56 percent probability of getting exactly 4 “fives” if a single die is rolled 7 times.
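
As a cross-check outside Excel, the die example can be reproduced with a short Python sketch. The names binom_pdf and binom_cdf are hypothetical stand-ins for BINOM.DIST with the cumulative argument set to FALSE and TRUE. They are not part of Excel or of any library, and the sketch assumes only the standard-library function math.comb.

```python
from math import comb

def binom_pdf(k, n, p):
    """Pr(X = k): counterpart of BINOM.DIST(k, n, p, FALSE)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Pr(X <= k): counterpart of BINOM.DIST(k, n, p, TRUE).

    Built by summing the PDF for 0, 1, ..., k successes.
    """
    return sum(binom_pdf(i, n, p) for i in range(k + 1))

# Die example: exactly 4 "fives" in 7 rolls, p = 1/6
print(round(binom_pdf(4, 7, 1/6), 4))   # 0.0156

# Probability of at most 4 "fives" in 7 rolls
print(binom_cdf(4, 7, 1/6))
```

Using the exact fraction 1/6 instead of the rounded 0.1667 gives the same answer to four decimal places.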

Hypergeometric Distribution

Calculates the probability of k successes in n trials without replacement. Sampling without replacement means that each sample taken is not placed back into the population, so the probability of success changes from one trial to the next. This is not a binomial experiment because a binomial experiment requires that the probability of success be constant on every trial.

The hypergeometric distribution’s PDF is given by the following:

k = targeted number of successes in n trials

K = total number of successes in the population

n = number of trials

N = population size

Pr(X = k) = C(K,k) * C(N - K, n - k) / C(N,n)

where

C(n,k) = n! / (k! * (n - k)!)

is the binomial coefficient and is sometimes referred to as “n choose k.”

The Excel formula to calculate the hypergeometric distribution’s PDF is the following:

Pr(X = k) = HYPGEOM.DIST(k, n, K, N, FALSE)

Excel 2010 and later provide the function HYPGEOM.DIST, which is similar but not exactly equivalent to the HYPGEOMDIST function used by earlier versions of Excel. The earlier function, HYPGEOMDIST(), calculates only the PDF.

HYPGEOMDIST(k, n, K, N) = HYPGEOM.DIST(k, n, K, N, FALSE)

The hypergeometric distribution’s CDF is as follows:

Pr(X ≤ k) = Σ C(K,i) * C(N - K, n - i) / C(N,n), summed over i = 0, 1, …, k

Before Excel 2010 there was no Excel formula to calculate the hypergeometric distribution’s CDF. In those versions the CDF has to be calculated by summing successive PDF calculations as follows:

Pr(X ≤ k) = HYPGEOMDIST(0, n, K, N) + HYPGEOMDIST(1, n, K, N) + … + HYPGEOMDIST(k, n, K, N)

Because the hypergeometric distribution is discrete, its CDF is simply the sum of the individual point probabilities that make up its PDF.

In Excel 2010 and beyond, the CDF of hypergeometric distribution can be calculated as follows:

Pr(X≤k) = HYPGEOM.DIST(k, n, K, N, TRUE)

A basic problem solved by the hypergeometric distribution’s PDF in Excel is as follows:

What is the probability of getting exactly two kings in ten successive draws from a deck of 52 cards? There are four kings in a deck of cards. Each card drawn is not placed back into the deck.

k = 2

K = 4

n = 10

N = 52

Pr(X = k) = HYPGEOM.DIST(k, n, K, N , FALSE)

Pr(X = 2) = HYPGEOM.DIST(2, 10, 4, 52 , FALSE) = 0.1431

There is a 14.31 percent probability of getting exactly two kings in 10 successive draws from a 52-card deck that initially contains four kings if each card is discarded after being drawn.
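
The card example can be checked the same way in Python. In this minimal sketch, hypgeom_pdf and hypgeom_cdf are hypothetical helper names that mirror the argument order of HYPGEOM.DIST(k, n, K, N, cumulative). The CDF is built by summing the PDF, which is the same approach needed in versions of Excel earlier than 2010.

```python
from math import comb

def hypgeom_pdf(k, n, K, N):
    """Pr(X = k): counterpart of HYPGEOM.DIST(k, n, K, N, FALSE).

    k successes in n draws without replacement from a population
    of N items that contains K successes.
    """
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypgeom_cdf(k, n, K, N):
    """Pr(X <= k): sum of the PDF for 0, 1, ..., k successes."""
    return sum(hypgeom_pdf(i, n, K, N) for i in range(k + 1))

# Exactly two kings in ten draws from a 52-card deck holding four kings
print(round(hypgeom_pdf(2, 10, 4, 52), 4))   # 0.1431

# At most two kings in the same ten draws
print(hypgeom_cdf(2, 10, 4, 52))
```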

Multinomial Distribution

This is a generalization of the binomial distribution. Each multinomial trial has two or more possible outcomes. A binomial trial is a multinomial trial with only two possible outcomes. The multinomial distribution calculates the probability of a specific combination of outcomes occurring in n multinomial trials. Sampling is done with replacement during a multinomial experiment.

The multinomial distribution’s PDF is given by the following:

n = number of trials

m = number of possible outcomes in each trial

i = index of the ith possible outcome, with i running from 1 to m. For example, if the three colors of marbles that can be drawn are red, blue, and green, then red is the 1st possible outcome (i = 1), blue is the 2nd possible outcome (i = 2), and green is the 3rd possible outcome (i = 3).

k_i = the targeted number of successes for the ith possible outcome. For example, if the targeted number of blue marbles is 2, then k₂ = 2.

p_i = the probability of each trial producing a success for the ith possible outcome. For example, if 5 out of 20 total marbles are blue, then p₂ = 5/20 = 0.25. These probabilities remain the same because each sample is replaced before the next trial.

Pr(X₁ = k₁, X₂ = k₂, …, X_m = k_m) = n! / (k₁! * k₂! * … * k_m!) * p₁^k₁ * p₂^k₂ * … * p_m^k_m

where k₁ + k₂ + … + k_m = n.

Excel has no built-in function that returns multinomial probabilities directly, although MULTINOMIAL(k₁, k₂, …, k_m) returns the coefficient n! / (k₁! * k₂! * … * k_m!).

This formula probably appears more onerous than it actually is. The following example will hopefully simplify it.

A bag contains 21 marbles whose colors are distributed as follows:

6 marbles are red

5 marbles are blue

10 marbles are green

Nine marbles are drawn one at a time. Each marble is evaluated and then placed back into the bag before the next marble is drawn. Calculate the probability that the 9 trials will produce exactly the following results:

4 red marbles

2 blue marbles

3 green marbles

n = 9 trials

m = 3 possible outcomes (red, blue, green)

p₁ = 6/21 = probability of getting outcome 1 (red) in each trial

p₂ = 5/21 = probability of getting outcome 2 (blue) in each trial

p₃ = 10/21 = probability of getting outcome 3 (green) in each trial

k₁ = 4 = the targeted number of successes of outcome 1 (red) in n trials

k₂ = 2 = the targeted number of successes of outcome 2 (blue) in n trials

k₃ = 3 = the targeted number of successes of outcome 3 (green) in n trials

Pr(X₁=4, X₂=2, X₃=3)= 9! / (4!*2!*3!) * (6/21)⁴ * (5/21)² * (10/21)³

Pr(X₁=4, X₂=2, X₃=3)= 0.0514

There is a 5.14 percent probability of obtaining 4 red marbles, 2 blue marbles, and 3 green marbles in 9 successive random samples taken with replacement from a set of 6 red marbles, 5 blue marbles, and 10 green marbles.
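
Because the multinomial probability has to be built up by hand, it can help to verify the arithmetic in a few lines of Python. The function name multinomial_pdf below is a hypothetical helper, not an Excel or library function, and the sketch assumes only the standard-library functions math.factorial and math.prod.

```python
from math import factorial, prod

def multinomial_pdf(counts, probs):
    """Probability of seeing counts[i] occurrences of outcome i.

    counts: targeted number of each outcome (k_1, ..., k_m)
    probs:  per-trial probability of each outcome (p_1, ..., p_m)
    The number of trials n is the sum of the counts.
    """
    n = sum(counts)
    # Multinomial coefficient n! / (k_1! * k_2! * ... * k_m!)
    coefficient = factorial(n)
    for k in counts:
        coefficient //= factorial(k)
    return coefficient * prod(p**k for p, k in zip(probs, counts))

# 4 red, 2 blue, 3 green in 9 draws with replacement
# from a bag of 6 red, 5 blue, and 10 green marbles
print(round(multinomial_pdf([4, 2, 3], [6/21, 5/21, 10/21]), 4))   # 0.0514
```

With only two outcomes the same function reduces to the binomial PDF, which is one way to see that the multinomial distribution generalizes the binomial distribution.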

The CDF of the multinomial distribution can be calculated but its formula quickly becomes unmanageably large as the number of outcomes and targeted number of each outcome grows.
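
To illustrate why the CDF grows so quickly, the sketch below computes it by brute force: it enumerates every combination of counts that stays within the targeted upper limits and adds up the PDF of each combination that totals n. The names multinomial_pdf and multinomial_cdf are hypothetical helpers, and this enumeration is only one straightforward way to do the calculation, not a method taken from Excel.

```python
from itertools import product
from math import factorial, prod

def multinomial_pdf(counts, probs):
    """Probability of seeing counts[i] occurrences of outcome i."""
    coefficient = factorial(sum(counts))
    for k in counts:
        coefficient //= factorial(k)
    return coefficient * prod(p**k for p, k in zip(probs, counts))

def multinomial_cdf(upper, probs, n):
    """Pr(X_1 <= upper[0], ..., X_m <= upper[m-1]) in n trials."""
    total = 0.0
    # Every candidate count vector with 0 <= k_i <= upper[i]
    for counts in product(*(range(u + 1) for u in upper)):
        if sum(counts) == n:          # only vectors that use all n trials
            total += multinomial_pdf(counts, probs)
    return total

probs = [6/21, 5/21, 10/21]
# At most 5 red, at most 3 blue, and at most 6 green in 9 draws
print(multinomial_cdf([5, 3, 6], probs, 9))
```

The loop visits (upper₁ + 1) * (upper₂ + 1) * … * (upper_m + 1) candidate vectors, so the work multiplies with every additional outcome.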

Negative Binomial Distribution

Calculates the probability that a specific number of failures will occur before a specified number of successes occurs. Each trial is a Bernoulli trial with the same probability of a successful outcome. Sampling is done with replacement during a negative binomial experiment. Like the binomial distribution, the probability of success on every trial is the same and the trials are independent, i.e., the outcome of one trial does not affect the outcome of any other trial.

The negative binomial distribution calculates the probability of a specific number of trials to obtain a fixed number of successes. The number of trials is varied and the number of successes is fixed.

The binomial distribution calculates the probability of a specific number of successes in a fixed number of trials. The number of successes is varied and the number of trials is fixed.

The negative binomial distribution’s PDF is given by the following:

x = targeted number of failures occurring before the rth success

r = total number of successes that occur. The rth success occurs on trial x + r, after exactly x failures.

p = the probability of success on each trial

Pr(X = x) = C(x + r - 1, r - 1) * p^r * (1 - p)^x

where

C(n,k) = n! / (k! * (n - k)!)

is the binomial coefficient and is sometimes referred to as “n choose k.”

The Excel formula to calculate the negative binomial distribution’s PDF is the following:

Prior to Excel 2010: Pr(X = x) = NEGBINOMDIST(x, r, p)

Excel 2010 and beyond: Pr(X = x) = NEGBINOM.DIST(x, r, p, FALSE)

NEGBINOM.DIST can calculate both the PDF and the CDF. The earlier version of this formula prior to Excel 2010, NEGBINOMDIST, will only calculate the PDF.

The negative binomial distribution’s CDF is as follows:

Pr(X ≤ x) = Σ C(i + r - 1, r - 1) * p^r * (1 - p)^i, summed over i = 0, 1, …, x

Only Excel 2010 and later versions have a formula to calculate the negative binomial distribution’s CDF. It is calculated as follows:

Pr(X ≤ x) = NEGBINOM.DIST(x, r, p, TRUE)

Calculating the negative binomial distribution’s CDF in versions of Excel earlier than 2010 requires summing the PDF as i goes from 0 to x as follows:

Pr(X ≤ x) = NEGBINOMDIST(0, r, p) + NEGBINOMDIST(1, r, p) + … + NEGBINOMDIST(x, r, p)

A basic problem solved by the negative binomial distribution’s PDF in Excel is as follows:

A fair coin is continuously flipped. What is the probability that exactly 10 tails will occur before the 4th head? A “head” will be counted as a success.

x = 10 = targeted number of failures (tails) occurring before the rth success (heads)

r = 4 = total number of successes (heads) that occur. The rth success occurs after exactly x failures (tails).

p = 0.5 = the probability of success on each trial

Pr(X = x) = NEGBINOM.DIST(x, r, p, FALSE)

Pr(X = 10) = NEGBINOM.DIST(10, 4, 0.5, FALSE) = 0.0175

There is a 1.75 percent probability that exactly 10 tails will occur before the 4th head.
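
The coin example can be reproduced with a short Python sketch. The names negbinom_pdf and negbinom_cdf are hypothetical helpers that follow the argument order of NEGBINOM.DIST(x, r, p, cumulative). The sketch assumes the usual definition in which X counts the failures that occur before the rth success.

```python
from math import comb

def negbinom_pdf(x, r, p):
    """Pr(X = x): exactly x failures before the r-th success.

    Counterpart of NEGBINOM.DIST(x, r, p, FALSE).
    """
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

def negbinom_cdf(x, r, p):
    """Pr(X <= x): sum of the PDF for 0, 1, ..., x failures.

    Counterpart of NEGBINOM.DIST(x, r, p, TRUE).
    """
    return sum(negbinom_pdf(i, r, p) for i in range(x + 1))

# Exactly 10 tails before the 4th head with a fair coin
print(round(negbinom_pdf(10, 4, 0.5), 5))   # 0.01746

# At most 10 tails before the 4th head
print(negbinom_cdf(10, 4, 0.5))
```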

Geometric Distribution

Calculates the probability that a success will occur for the first time on the nth Bernoulli trial. This is a special case of the negative binomial distribution in which the specified number of successes is equal to one. Sampling is done with replacement during a geometric experiment. Like the binomial distribution, the probability of success on every trial is the same and the trials are independent, i.e., the outcome of one trial does not affect the outcome of any other trial.

As mentioned, the geometric distribution is equivalent to the negative binomial distribution with r (the number of successes) set to one. With r = 1 the negative binomial PDF reduces to Pr(X = x) = p * (1 - p)^x, where x is the number of failures before the first success.

A basic problem solved by the geometric distribution’s PDF in Excel is as follows:

A fair coin is continuously flipped. What is the probability that the 1st head will occur immediately following the 3rd tail? A “head” will be counted as a success.

x = 3 = targeted number of failures (tails) occurring before the rth success (heads)

r = 1 = total number of successes (heads) that occur. The 1st success occurs immediately following the xth failure (tails).

p = 0.5 = the probability of success on each trial

Pr(X = x) = NEGBINOM.DIST(x, r, p, FALSE)

Pr(X = 3) = NEGBINOM.DIST(3, 1, 0.5, FALSE) = 0.0625

There is a 6.25 percent probability that the coin flip following the 3rd tail will produce the 1st head.
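
The equivalence between the geometric distribution and the negative binomial distribution with r = 1 can be confirmed numerically. In this minimal Python sketch, geometric_pdf and negbinom_pdf are hypothetical helper names, and x is assumed to count the failures that occur before the first success.

```python
from math import comb

def geometric_pdf(x, p):
    """Pr(first success comes right after exactly x failures)."""
    return (1 - p)**x * p

def negbinom_pdf(x, r, p):
    """Pr(exactly x failures occur before the r-th success)."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Three tails and then the first head with a fair coin
print(geometric_pdf(3, 0.5))        # 0.0625

# The negative binomial PDF with r = 1 gives the same value
print(negbinom_pdf(3, 1, 0.5))      # 0.0625
```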
