Monday, June 2, 2014

1-Sample z-Test in 4 Steps in Excel 2010 and Excel 2013

This is one of the following four articles on z-Tests in Excel

Overview of Hypothesis Tests Using the Normal Distribution in Excel 2010 and Excel 2013

One-Sample z-Test in 4 Steps in Excel 2010 and Excel 2013

2-Sample Unpooled z-Test in 4 Steps in Excel 2010 and Excel 2013

Overview of the Paired (Two-Dependent-Sample) z-Test in 4 Steps in Excel 2010 and Excel 2013

One-Sample z-Test in 4

Steps in Excel 2010 and

Excel 2013

This hypothesis test determines whether the mean of the population from which the sample was taken is equal to (two-tailed test) or else greater than or less than (one-tailed test) than a constant. This constant is often the known mean of a population from which the sample may have come from. The constant is the constant on the right side of the Null Hypothesis.

x_bar = Observed Sample Mean

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics
(Click On Image To See a Larger Version)

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

Null Hypothesis H0: x_bar = Constant

The Null Hypothesis is rejected if any of the following equivalent conditions are shown to exist:

1) The observed x_bar is beyond the Critical Value.

2) The z Score (the Test Statistic) is farther from zero than the Critical z Value.

3) The p value is smaller than α for a one-tailed test or α/2 for a two-tailed test.

 

Example of a One-Sample, Two-

Tailed z-Test in Excel

This problem is very similar to the problem solved in the t-test section for a one-sample, one-tailed t-test. Similar problems were used in each of these sections to show the similarities and also contrast the differences between the one-sample z-Test and t-test as easily as possible.

This problem compares average monthly sales from one fast food chain’s retail stores in one region with the average monthly sales of all of the fast food chain’s retails in the entire country. The region being evaluated has more than 1,000 very similar stores. The national mean monthly retail store sales equals $186,000. The standard deviation of the monthly sales for the entire population of stores is $30,000.

Determine with at least 95% certainty whether the average monthly sales of all of the fast food chain’s stores in the one region is different than the national average monthly sales of all of the fast food chain’s stores.

The data sample of sales for the month for a random sample of 40 retail stores in a region is as follows:

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics
(Click On Image To See a Larger Version)

 

Summary of Problem Information

x_bar = sample mean = AVERAGE() = 200,000

µ = national (population) mean = 186,000

α = 1-Level of Certainty Required = 1 – 0.95 = 0.05

s = sample standard deviation = Not Known and not needed for a z-Test

σ (Greek letter “sigma”) = population standard deviation = 30,000

n = sample size = COUNT() = 40

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics

SE = Standard Error = σ / SQRT(n) = 30,000 / SQRT(40) = 4,743

Note that this calculation of the Standard Error using the population standard deviation, σ, is the true Standard Error. If the sample standard error, s, were used in place of σ, the Standard Error calculated would be an estimate of the true Standard Error. The z-Test requires the population standard deviation but the t-test uses the sample standard deviation as an estimate of the population standard deviation.

As with all Hypothesis Tests of Mean, we must satisfactorily answer these two questions and then proceed to the four-step method of solving the hypothesis test that follows.

 

The Initial Two Questions That Must be Answered Satisfactorily

What Type of Test Should Be Done?

Have All of the Required Assumptions For This Test Been Met?

 

The Four-Step Method For Solving All Hypothesis Tests of Mean

Step 1) Create the Null Hypothesis and the Alternative Hypothesis

Step 2 – Map the Normal or t-Distribution Curve Based on the Null Hypothesis

Step 3 – Map the Regions of Acceptance and Rejection

Step 4 – Perform the Critical Value Test, the p Value Test, or the Critical t Value Test

 

The Initial Two Questions That Need To Be Answered Before Performing the Four-Step Hypothesis Test of Mean are as follows:

 

Question1) Type of Test?

a) Hypothesis Test of Mean or Proportion?

This is a Hypothesis Test of Mean because each individual observation (each sampled monthly retail store sales figure) within the sample can have a wide range of values. Data points for Hypothesis Tests of Proportion are binary: they can take only one of two possible values.

 

b) One-Sample or a Two-Sample Test?

This is a one-sample hypothesis test because only one sample containing monthly sales figures from forty stores has been taken and is being compared to the national monthly retail store average for the same month.

 

c) Independent (Unpaired) Test or a Dependent (Paired) Test?

It is neither. The designation of “paired” or “unpaired” applies only for two-sample hypothesis tests.

 

d) One-Tailed or Two-Tailed Hypothesis?

The problem asks to determine whether the forty-store monthly average is simply different than the national average. This is a non-directional inequality making this hypothesis test a two-tailed test. If the problem asked whether the forty-store average was either higher or was lower, the inequality would be directional and the resulting hypothesis test would be a one-tailed test. A two-tailed test is more stringent than a one-tailed test.

 

e) t-Test or z-Test?

A hypothesis test of means can be performed if the distribution of the Test Statistic under the Null Hypothesis can be approximated by either the normal distribution or the t-Distribution.

A z-Test is a statistical test in which the distribution of the Test Statistic under the Null Hypothesis can be approximated by the normal distribution. A t-test is a statistical test in which the distribution of the Test Statistic under the Null Hypothesis can be approximated by the t-Distribution.

This hypothesis test of mean can be performed as z-Test because sample size is large (n = 40) and the population standard deviation (σ = 30,000) is known. The large sample size and known population standard deviation ensure that the distribution of the sample mean (and therefore Test Statistic, which is derived from the sample mean) can be approximated by the normal distribution under the Null Hypothesis.

It should be noted that a one-sample t-Test can always be used in place of a one-sample z-Test. All z-Tests can be replaced be their equivalent t-Tests. As a result, some major commercial statistical software packages including the well-known SPSS provide only t-Tests and no direct z-Tests.

This hypothesis test is a t-Test that is one-sample, two-tailed hypothesis test of mean as long as all required assumptions have been met.

 

Question 2) Test Requirements Met?

a) Normal Distribution of Test Statistic

The normal distribution can be used to map the distribution of the sample mean (and therefore the Test Statistic, which is derived from the sample mean) only if the following conditions exist:

1) Population standard deviation, σ, is known

Population standard deviation, σ, is one of the two required parameters needed to fully describe a unique normal distribution curve and must therefore be known in order to perform a z-Test (which uses the normal distribution).

and

2) Sample size is large (n > 30)

The Central Limit Theorem states that if a number of large, random samples of the same size were taken from the same population, the means of the samples would be normal-distributed. If the sample mean is normal-distributed, the Test Statistic, which equals (sample mean – Constant ) / SE, will also be normal-distributed because it is derived from the sample mean.

If sample size is large, neither the normality of the population nor the normality of the sample data has to be confirmed.

The sample man and therefore the Test Statistic are normal-distributed because sample size is large (n = 40) and the population standard deviation (σ = 30,000) is known.

 

We now proceed to complete the four-step method for solving all Hypothesis Tests of Mean. These four steps are as follows:

Step 1) Create the Null Hypothesis and the Alternative Hypothesis

Step 2 – Map the Normal or t-Distribution Curve Based on the Null Hypothesis

Step 3 – Map the Regions of Acceptance and Rejection

Step 4 – Determine Whether to Accept or Reject theNull Hypothesis By Performing the Critical Value Test, the p Value Test, or the Critical t Value Test

Proceeding through the four steps is done is follows:

 

Step 1 – Create the Null and Alternative Hypotheses

The Null Hypothesis is always an equality that states that the items being compared are the same. In this case, the Null Hypothesis would state that the average monthly sales of all stores in the region (the population from which the forty-store sample was taken) is not different than the national monthly store average sales, µ, which is $186,000. We will use the variable x_bar to represent the sample mean of the forty stores. The Null Hypothesis is as follows:

H0: x_bar = Constant = 186,000

The Constant is quite often the known population mean, µ, to which the sample mean is being compared.

The Alternative Hypothesis is always in inequality and states that the two items being compared are different. This hypothesis test is trying to determine whether the average monthly sales of all stores in the region (the population from which the forty-store sample was taken) is merely different than the national monthly store average sales, µ, which is $186,000.

The Alternative Hypothesis is as follows:

H1: x_bar ≠ Constant

H1: x_bar ≠ 186,000

The Alternative Hypothesis is non-directional (“not equal” instead of “greater than” or “less than”) and the hypothesis test is therefore a two-tailed test. It should be noted that a two-tailed test is more rigorous (requires a greater differences between the two entities being compared before the test shows that there is a difference) than a one-tailed test.

It is important to note that the Null and Alternative Hypotheses refer to the means of the populations from which the samples were taken. A one-sample t-Test determines whether to reject or fail to reject the Null Hypothesis that states that that population from which the sample was taken (the entire region) has a mean equal to the Constant. The Constant in this case is equal to known national average.

Parameters necessary to map the distributed variable, x_bar, are the following:

σ (Greek letter “sigma”) = population standard deviation = 30,000

n = sample size = COUNT() = 40

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics

SE = Standard Error = σ / SQRT(n) = 30,000 / SQRT(40) = 4,743

These parameters used to map the distributed variable, x_bar, to the normal distribution are as follows:

 

Step 2 – Map the Distributed Variable to Normal Distribution

A z-Test can be performed if the sample mean, and the Test Statistic (the z Score) are distributed according to the normal Distribution. If the sample size is large and the population standard deviation is known, the sample mean and closely-related Test Statistic are distributed according to the normal Distribution.

The sample mean x_bar is distributed according to the normal distribution. The distributed variable would in this case be the sample mean, x_bar.

Mapping this distributed variable to a normal Distribution curve is shown as follows:

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

This non-standardized normal Distribution curve has its mean set to equal the Constant taken from the Null Hypothesis, which is:

H0: x_bar = Constant = 186,000

This non-standardized normal Distribution curve is constructed from the following parameters:

Mean = 186,000

Standard Error = 4,743

Distributed Variable = x_bar

 

Step 3 – Map the Regions of Acceptance and Rejection

The goal of a hypothesis test is to determine whether to reject or fail to reject the Null Hypothesis at a given level of certainty. If the two things being compared are far enough apart from each other, the Null Hypothesis (which states that the two things are not different) can be rejected. In this case we are trying to show graphically how different the sample mean, x_bar = $200.000, is from the national average of $186,000.

The non-standardized normal distribution curve can be divided up into two types of regions: the Region of Acceptance and the Regions of Rejection. A two-tailed test has the Region of Rejection split between the two outer tails. A boundary between a Region of Acceptance and a Region of Rejection is called a Critical Value.

If the sample mean’s value of x_bar = 200,000 falls into a Region of Rejection, the Null Hypothesis is rejected. If the sample mean’s value of x_bar = 200,000 falls into a Region of Acceptance, the Null Hypothesis is not rejected.

The total size of the Region of Rejection is equal to Alpha. In this case Alpha, α, is equal to 0.05. This means that the Region of Rejection will take up 5 percent of the total area under this t-Distribution curve.

This 5 percent is divided up between the two outer tails. Each outer tail contains 2.5 percent of the curve that is the Region of Rejection.

The boundary between Regions of Acceptance and Regions of Rejection are called Critical Values. The locations of these Critical values need to be calculated.

 

Calculate Critical Values

A Critical Value is the boundary between a Region of Acceptance and a Region of Rejection. In the case of a two-tailed test, the Region of rejection is split between two outer tails. There are therefore two Critical Values.

The Critical Value is the boundary on either side of the curve beyond which 2.5 percent of the total area under the curve exists. In this case both Critical Values can be found by the following:

Critical Values = Mean ± (Number of Standard Errors from Mean to Region of Rejection) * SE

Critical Values = Mean ± NORM.S.INV(1-α/2) * SE

Critical Values = 186,000 ± NORM.S.INV(1 - 0.05/2) * 4,743

Critical Values = 186,000 ± NORM.S.INV(0.975) * 4,743

Critical Values = 186,000 ± 9,296

Critical Values = 176,703 and 195,297

The Region of Rejection is therefore everything that is to the right of 195,297 and everything to the left of 176,703.

An Excel-generated distribution curve with the blue Region of Acceptance and the yellow Regions of Rejection is shown is as follows:

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

 

Step 4 – Determine Whether to Reject Null Hypothesis

The object of a hypothesis test is to determine whether to accept or reject the Null Hypothesis. There are three equivalent-Tests that determine whether to accept or reject the Null Hypothesis. Only one of these tests needs to be performed because all three provide equivalent information. The three tests are as follows:

 

1) Compare x-bar With Critical Value

Reject the Null Hypothesis if the sample mean, x_bar = 200,000, falls into the Region of Rejection.

Equivalently, reject the Null Hypothesis if the sample mean, x_bar, is further from the curve’s mean of 186,000 than the Critical Value.

The Critical Values have been calculated to be 176,703 on the left and 195,297 on the right. X_bar (200,000) is further from the curve mean (186,000) than the right Critical Value (195,297). The Null Hypothesis would therefore be rejected.

 

2) Compare the z Score With Critical z Value

The z Score corresponds to the standardized value of the sample mean, x_bar = 200,000. The z Score is the number of Standard Errors that x_bar is from the curve’s mean of 186,000.

The Critical z Value is the number of Standard Errors that the Critical Value is from the curve’s mean.

Reject the Null Hypothesis if the z Score is farther from the standardized mean of zero than the Critical z Value.

Equivalently, reject the Null Hypothesis if the z Score is farther from the standardized mean of zero than the Critical z Value.

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

The Constant is the Constant from the Null Hypothesis (H0: x_bar = Constant = 186,000)

Z Score (Test Statistic) = (200,000 – 186,000)/4,743

Z Score (Test Statistic) = 2.951

This means that the sample mean, x_bar, is 2.951 standard errors from the curve mean (186,000).

Two-tailed Critical z Values = ±NORM.S.INV(1-α/2)

Two-tailed Critical z Values = ±NORM.S.INV(1-0.05/2)

Two-tailed = ±NORM.S.INV(0.975) = ±1.9599

This means that the Region of Rejection for this two-tailed hypothesis test in either tail begins at 1.9599 standard errors from (to the left of and to the right of) the standardized mean of zero.

This means that the boundaries of the Region of Rejection are 1.9599 standard errors from the curve mean (186,000) on each side since this is a two-tailed test.

The Null Hypothesis is rejected because the z Score (+2.951) is further from the standardized mean of zero than the Critical z Values (±1.9599). This is another indication that x_bar (200,000) is in the Region of Rejection.

 

3) Compare the p Value With Alpha

The p Value is the percent of the curve that is beyond x_bar (200,000). If the p Value is smaller than Alpha/2, the Null Hypothesis is rejected.

p Value =MIN(NORM.S.DIST(z Score,TRUE),1-NORM.S.DIST(z Score,TRUE))

p Value =MIN(NORM.S.DIST(2.951,TRUE),1-NORM.S.DIST(2.951,TRUE))

p Value = 0.0016

The p Value (0.0016) is smaller than Alpha/2 (0.025) Region of Rejection in the right tail and we therefore reject the Null Hypothesis. The following Excel-generated graph shows that the red p Value (the curve area beyond x_bar) is smaller than the yellow Alpha, which is the 5 percent Region of Rejection split between both outer tails.

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

 

Excel Formula Shortcut to Performing a One-Sample z-Test

This problem could also be quickly solved with the following Excel z-Test formula:

p Value =MIN(Z.TEST(array,Constant,σ),1-Z.TEST(array,Constant,σ))

It should be noted that when the Constant is positive, the p Value Excel formula is

p Value = Z.TEST(array,Constant,σ),

When the Constant is negative, the p Value Excel formula is

p Value = 1-Z.TEST(array, Constant,σ))

The Constant is taken from the Null Hypothesis and is equal to 0.

The Null Hypothesis is as follows:

H0: x_bar = Constant = 186,000

The Excel z-Test formula produces the p Value as follows:

z test, z-test power, normal distribution, t test, t-test, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

Note that the array can be spread across two columns as is done here. The array does not have to be entirely contained in a single column in this case.

 

Excel Master Series Blog Directory

 

Click Here To See a List Of All

Statistical Topics And Articles In

This Blog

 

You Will Become an Excel Statistical Master!

No comments:

Post a Comment