Thursday, November 25, 2010

Using the Confidence

Interval in Excel To

Find Customer Preferences

This article will show you exactly how to calculate a 95% Confidence Interval in Excel of the proportion of customers on a commercial web site who prefer to pay with a credit card instead of Paypal. The 95% Confidence Interval is the interval for which we are 95% sure that the true proportion of credit card-preferring customers lies within.

The larger the sample size, the more accurate and smaller the confidence interval will be. We are taking a random sample of 1,000 Internet shoppers on this e-commerce web site.

The advantages of statistical analysis in Excel to solve business statistics problems is that most problems can be solved in just one or two steps and there is no more need to look anything up on Normal Distribution tables.

The 95% Confidence Interval calculated here is a Confidence Interval for Proportion, not Means. A Confidence Interval for Means uses an entirely different set of statistical formulas. Another article in this blog shows exactly how to calculate a Confidence Interval Means. Here is the problem:

Problem: A random sample of 1,000 Internet shoppers on one web site was taken. 70% preferred to pay with a credit card. 30% preferred to pay with PayPal. Determine the 95% Confidence Interval for the proportion of the general population of shoppers on that web site that prefers to pay with a credit card.

Before we begin solving this problem, we need to know about creating Confidence Intervals in Excel. Here is a brief lesson:

Confidence Interval of a Population Proportion

Creating a Confidence Interval for a population's proportion is very similar to creating a Confidence Interval for a population's mean. The only real difference is how the standard error is calculated. Everything else is the same.

First, the difference between using sampling to estimate a population mean and using sampling to estimate a population proportion will be explained below:

Mean Sampling vs. Proportion Sampling

What determines whether a mean is being estimated or a proportion is being estimated is the number of possible outcomes of each sample taken.

Proportion samples have only two possible outcomes. For example, if you are comparing the proportion of click-through visitors in two different PPC ads who converted (purchased), each sample has only two possible values; the visitor sampled either converted (purchased) or they didn't.

Mean samples have multiple possible outcomes. For example, if you are comparing the mean sales of salespeople in two different cities, each salesperson's sales sampled can have numerous values.

Below is a description of how to calculate a Confidence Interval for a population's proportion. Note that everything is almost the same as the calculation of the Confidence Interval for a mean, except sample standard error.

Levels of Confidence and Significance

Level of Significance, α ("alpha"), equals the maximum allowed percent of error. If the maximum allowed error is 5%, then α = 0.05.

Level of Confidence is selected by the user. A 95% Level is the most common. A 95% Confidence Level would correspond to a 95% Confidence Interval of the Proportion.

This would state that the actual population Proportion has a 95% probability of lying within the calculated interval. A 95% Confidence Level corresponds to a 5% Level of Significance, or α = 0.05. The Confidence Level therefore equals 1 - α.

Population Proportion vs. Sample Proportion

Population Proportion = µp = p (This is what we are trying to estimate)

Sample Proportion = pavg

Standard Deviation and Standard Error

Standard Deviation is not calculated during the creation of Confidence Interval for a population proportion.

Standard Error is an estimate of population Standard Deviation from data taken from a sample. Sample Standard Error will be an estimate taken from the sample proportion, pavg, and sample size, n. This is the major difference between calculating a Confidence Interval for a proportion and for a mean. Binomial distribution rules apply to proportions because a proportion sample has only two possible outcomes, just like a binomial variable.

Sample Standard Error of a Proportion
= σpavg = SQRT(p * q / n) ≈ spavg

= spavg = SQRT ( pavg * qavg / n )

p = Population proportion - This is the unknown that will be estimated with a Confidence Interval

q = 1 - p

n = Sample size

pavg = Sample proportion

qavg = 1 - pavg

Region of Certainty vs. Region of Uncertainty

Region of Certainty is the area under the Normal curve that corresponds to the required Level of Confidence. If a 95% percent Level of Confidence is required, then the Region of Certainty will contain 95% of the area under the Normal curve. The outer boundaries of the Region of Certainty will be the outer boundaries of the Confidence Interval.

The Region of Certainty, and therefore the Confidence Interval, will be centered about the mean. Half of the Confidence Interval is on one side of the mean and half on the other side.

Region of Uncertainty is the area under the Normal curve that is outside of the Region of Certainty. Half of the Region of Uncertainty will exist in the right outer tail of the Normal curve and the other half in the left outer tail. This is similar to the concept of the "two-tailed test" that is used in Hypothesis testing in further sections of this course. The concepts of one and two-tailed testing are not used when calculating Confidence Intervals. Just remember that the Region of Certainty, and therefore the Confidence Interval, are always centered about the mean on the Normal curve.

Relationship Between Region of Certainty, Uncertainty, and Alpha

The Region of Uncertainty corresponds to α ("alpha"). If α = 0.05, then that Region of Uncertainty contains 5% of the area under the Normal curve. Half of that area (2.5%) is in each outer tail. The 95% area centered about the mean will be the Region of Certainty. The outer boundaries of this Region of Certainty will be the outer boundaries of the 95% Confidence Interval.

The Level of Confidence is 95% and the Level of Significance, or maximum error allowed, is 5%.

Z Score

Z Score is the number of Standard Errors from the mean to outer right boundary of the Region of Certainty (and therefore to the outer right boundary of the Confidence Interval).

Standard Errors are used and not Standard Deviations because sample data is being used to calculate the Confidence Interval.

Z Score is calculated by the following Excel function:

Z Score(1-α) = NORMSINV (1 - α/2) - This will be discussed shortly.

Excel Functions Used When Calculating Confidence Interval for a Population Proportion

Note that Excel functions STDEV and AVERAGE are not used when working with proportions. The CONFIDENCE function is not used either.

Excel functions that are used are:

COUNT (Highlighted block of cells) = Sample size = n
This counts the number of cells in highlighted block

NORMSINV (1 - α/2) = Z Score(1 - α)

Number of Standard Errors from mean to boundary of Confidence Interval. Note that (1 - α/2)

= the entire area in the Normal curve to the left of outer right boundary of the Region of Certainty, or Confidence Interval. This includes the entire Region of Certainty and the half of the Region of Uncertainty that exists in the left tail.

For example:

Level of Confidence = 95% for a 95% Confidence Interval

Level of Significance = 5% (α = 0.05)

1 - α = 0.95 = 95%

Z Score95% = NORMSINV (1 - α/2) = NORMSINV (1 - .05/2) = NORMSINV(1 - 0.025)

Z Score95% = NORMSINV (0.975) = 1.96

The outer right boundary of the 95% Confidence Interval, and the Region of Certainty, is 1.96 Standard Errors from the mean. The left boundary is the same distance from the mean because the Confidence Interval is centered about the mean.

Formula for Calculating Confidence Interval Boundaries from Sample Data for a Population Proportion

Confidence Interval Boundaries = Sample proportion +/- Z Score(1-α) * Sample Standard Error

Confidence Interval Boundaries = pavg +/- Z Score(1-α) * spavg

Sample Proportion = pavg

Z Score(1 - α) = NORMSINV (1 - α/2)

Sample Standard Error of a Proportion = σpavg ≈ spavg = SQRT ( pavg * qavg / n )

Sample size = n = COUNT (Highlighted block of cells containing samples)

Confidence Interval Boundaries = pavg +/- Z Score(1-α) * spavg

***************************************************
Before we use any statistical test that relies upon the underlying population data (from which samples are drawn) to be normally distributed, we must first test the data for normality. The Confidence Interval is an interval in the Normal curve. This requires that the underlying population data be Normally distributed.
***************************************************

Always Test for Normality First

Normality tests should be performed on the independent random sample of Internet shoppers. This blog has numerous articles about how to perform normality testing and nonparametric testing if the data is not normally distributed.

***************************************************

Now we're ready to solve the problem and create a Confidence Interval in Excel.

Problem: Determine Confidence Interval of Internet Shoppers Who Prefer to Pay By Credit Card Based Upon Sample Data

A random sample of 1,000 Internet shoppers on one web site was taken. 70% preferred to pay with a credit card. 30% preferred to pay with PayPal. Determine the 95% Confidence Interval for the proportion of the general population of Internet shoppers on that web site that prefers to pay with a credit card.

Level of Confidence = 95% = 1 - α

Level of Significance = α = 0.05

Sample Size = n = 1,000

Sample Proportion = pavg = 0.70

qavg = 1 - pavg = 0.30

Sample Standard Error of a Proportion = σpavg ≈ spavg = SQRT ( pavg * qavg / n )

spavg = SQRT ( 0.70 * 0.30 / 1,000 ) = 0.014

Z Score(1 - α) = Z Score95% = NORMSINV (1 - α/2)

= NORMSINV (1 - 0.025) = NORMSINV (0.975) = 1.96

Width of Half the Confidence Interval = Z Score(1-α) * spavg

= 1.96 * 0.014 = 0.0274

Confidence Interval Boundaries = pavg +/- Z Score(1-α) * spavg

= 0.70 +/- (1.96) * (0.014)

0.70 +/- 0.0274 = 0.6726 to 0.7274 = 67.26% to 72.74%

We can be 95% certain that the percentage of Internet shoppers on this e-commerce web site who prefer to pay with a credit card instead of Paypal is between 67.26% and 72.74%.

Click Image To See Larger Version

*****************************************

Here is a link to this article if you wish to link to it:

Confidence Interval in Excel To Find Customer Preferences

If You Like This, Then Share It...

Hypothesis Test in Excel to Statistical Test Your Headlines

Excel Master Series Blog Directory

Statistical Topics and Articles In Each Topic

Become an Excel Statistical Master

Excel Master Series - MBA-level statistics - Over 1,100+ Pages of Easy-To-Follow Instructions in Excel

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

Step-By-Step Optimization With Excel Solver

What's In It?

For anyone who wants to be operating at a high level with the Excel Solver quickly, this is the book for you. Step-By-Step Optimization With Excel Solver is a 200+ page .pdf e-manual of simple yet thorough explanations on how to use the Excel Solver to solve today’s most widely known optimization problems. Loaded with screen shots that are coupled with easy-to-follow instructions, this book will simplify many difficult optimization problems and make you a master of the Excel Solver almost immediately.

Here are just some of the Solver optimization problems that are solved completely with simple-to-understand instructions and screen shots in this e-manual:

• The famous “Traveling Salesman” problem using Solver’s Alldifferent constraint and the Solver’s Evolutionary method to find the shortest path to reach all customers. This also provides an advanced use of the Excel INDEX function.

• The well-known “Knapsack Problem” which shows how optimize the use of limited space while satisfying numerous other criteria.

• How to perform nonlinear regression and curve-fitting on the Solver using the Solver’s GRG Nonlinear solving method.

• How to solve the “Cutting Stock Problem” faced by many manufacturing companies who are trying to determine the optimal way to cut sheets of material to minimize waste while satisfying customer orders.

• Portfolio optimization to maximize return or minimize risk.

• Venture capital investment selection using the Solver’s Binary constraint to maximize Net Present Value of selected cash flows at year 0. Clever use of the If-Then-Else statements makes this a simple problem.

• How use Solver to minimize the total cost of purchasing and shipping goods from multiple suppliers to multiple locations.

• How to optimize the selection of different production machine to minimize cost while fulfilling an order.

• How to optimally allocate a marketing budget to generate the greatest reach and frequency or number of inbound leads at the lowest cost.

Step-By-Step Optimization With Excel Solver has complete instructions and numerous tips on every aspect of operating the Excel Solver. You’ll fully understand the reports and know exactly how to tweek all of the Solver’s settings for total custom use. This e-manual also provides lots of inside advice and guidance on setting up the model in Excel so that it will be as simple and intuitive as possible to work with.

All of the optimization problems in this book are solved step-by-step using a 6-step process that works every time. In addition to detailed screen shots and easy-to-follow explanations on how to solve every optimization problem in the book, a link is provided to download an Excel workbook that has all problems completed exactly as they are in this e-manual.

Step-By-Step Optimization With Excel Solver is exactly the e-manual you need if you want to be optimizing at an advanced level with the Excel Solver quickly.

*******************

Become an Excel Statistical Master

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

Become an Excel Statistical Master

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

Become an Excel Statistical Master

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

Immediate, Absolute, No-Questions-Asked, Money-Back Guarantee If Not TOTALLY, 100% Satisfied. In Other Words, If Any Excel Master Series eManual That You've Purchased Here Does Not Provide Instructions That Are CRYSTAL CLEAR and EASY TO UNDERSTAND, You Get All Of Your Money Back Immediately and Keep the eManual. Guaranteed!

Meet The Author

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

It's a Full
Easy-To-Follow
MBA Course in Business Statistics

ALL IN EXCEL

&

MUCH Clearer

Than Your Text

Book

Download the
1,100+ Page Excel Statistical Master now

More Easy-To-

Follow eManuals

That You Will

Master Quickly

*******************

1 comment:

erikafrommOctober 4, 2024 at 1:14 AM
Greetings! Non-violent action training equips community members with essential skills for impactful activism. These sessions focus on peaceful protest and advocacy, promoting constructive dialogue and collaborative engagement. The Resistance Center https://theresistancecenter.org/ fosters non-violence, cultivating a respectful and understanding environment where individuals can express their beliefs while upholding community unity. This approach amplifies individual voices and strengthens solidarity across the community.

Thursday, November 25, 2010

Confidence Interval in Excel To Find Customer Preferences

Using the Confidence Interval in Excel To Find Customer Preferences

1 comment:

Blog Directory

Become an Excel Statistical Master

Become an Excel Statistical Master

Become an Excel Statistical Master

Advanced Statistical and Solver Optimization Consulting Service - Fast and Completed in Excel

Become an Excel Statistical Master

Excel Master Series Blog Directory

Using the Confidence

Interval in Excel To

Find Customer Preferences