## Tuesday, May 27, 2014

### Sign Test in Excel As A Paired t-Test Alternative

This is one of the following seven articles on Paired (2-Sample Dependent) t-Tests in Excel

Paired t-Test in 4 Steps in Excel 2010 and Excel 2013

Excel Normality Testing of Paired t-Test Data

Paired t-Test Excel Calculations, Formulas, and Tools

Paired t-Test – Effect Size in Excel 2010, and Excel 2013

Paired t-Test – Test Power With G-Power Utility

Wilcoxon Signed-Rank Test in 8 Steps As a Paired t-Test Alternative

Sign Test in Excel As A Paired t-Test Alternative

# The Sign Test in Excel As an Alternative to the Paired t-Test

The Sign Test and the Wilcoxon Signed-Rank Test are nonparametric alternatives to the paired t-Test when the normality of the difference sample cannot be verified and the sample size is small.

The Wilcoxon Signed-Rank Test is significantly more powerful than the Sign Test but requires that the difference sample data (the sample points minus the Constant of the Null Hypothesis) be distributed reasonably symmetrically about a median. The Wilcoxon Signed-Rank Test is based upon a normal approximation of its Test Statistic’s distribution.

The Sign Test has no requirements regarding the distribution of data but, as mentioned, is significantly less powerful than the Wilcoxon Signed-Rank Test.

The Sign Test counts the number of positive and negative non-zero differences between difference sample data and the Constant from the Null Hypothesis in the paired t-Test. In this case that Constant = 0 because the Null Hypothesis of the one-tailed t-Test is as follows:

H0: x_bardiff = Constant = 0

The after-minus-before difference sample for this problem is calculated as follows:

The final difference sample is created by subtracting the Constant from the Null Hypothesis, 0, from the after-minus-before difference as follows:

A count of positive and negative differences in this sample is taken as follows:

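The smaller of the two counts (positive or negative non-zero differences) is designated as the Test Statistic W for the Sign Test. The letter W appears to be carried over from the Wilcoxon Signed-Rank Test, whose Test Statistic is named after Frank Wilcoxon; the Sign Test itself was not developed by Wilcoxon.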
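The counting step can also be sketched outside Excel. The following Python snippet is only an illustration: the before and after values are made up (the original data appear only in the images) and were chosen so that the counts match the 11 negative and 6 positive differences reported later in this article. The variable names are likewise assumptions.

```python
# Hypothetical before/after clerical-error counts (NOT the article's data).
before = [12, 9, 15, 7, 10, 14, 8, 11, 13, 6, 9, 10, 12, 7, 8, 11, 9, 10]
after = [10, 7, 16, 5, 8, 15, 6, 9, 14, 4, 10, 8, 13, 5, 9, 9, 7, 10]

constant = 0  # Constant from the Null Hypothesis

# Final difference sample: (after - before) - Constant
diffs = [a - b - constant for a, b in zip(after, before)]

# Count only the non-zero differences; zero differences are dropped.
n_pos = sum(d > 0 for d in diffs)
n_neg = sum(d < 0 for d in diffs)
n_nonzero = n_pos + n_neg

# Test Statistic W is the smaller of the two counts.
w = min(n_pos, n_neg)

print(n_pos, n_neg, n_nonzero, w)  # 6 11 17 6
```

Note that the one zero difference in this made-up sample is excluded, so N is 17 even though there are 18 pairs.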

The objective of the one-tailed, paired t-Test was to determine whether to reject or fail to reject the Null Hypothesis that states that the mean difference between the number of clerical errors before and after the training for all employees is equal to 0.

If the differences are centered on 0 (strictly speaking, if the median difference is equal to 0), then the probability of a non-zero difference being positive (greater than zero) is the same as the probability of its being negative (less than zero). This probability is 50 percent.

The Null Hypothesis for this one-tailed Sign Test states that the probability of a difference being positive (p) is 50 percent. This can be expressed as follows:

H0: p=0.5

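Because the test is one-tailed and a successful training program would produce fewer positive than negative after-minus-before differences, the Alternative Hypothesis would state the following:

H1: p<0.5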

Each non-zero difference is classified as either positive or negative. This is a binary event because the classification of each difference has only two possible outcomes: the non-zero difference is either positive or negative.

The distribution of the outcomes of this binary event can be described by the binomial distribution as long as the following three conditions exist:

1) Each binary trial is independent.

2) The data from which the differences are derived are at least ordinal. The data can be ratio, interval, or ordinal, but not nominal. The relations “less than” and “greater than” must be meaningful even if the amount of difference is not, as is the case with ordinal data but not with nominal data.

3) Each binary trial has the same probability of a positive outcome.

All of these conditions are met because of the following:

1) Each sample taken is independent of any other sample.

2) The differences are derived from continuous (either ratio or interval) data.

3) The proportion of positive differences versus negative differences is assumed to be constant in the population from which the sample of differences was derived.

The counts of the positive and negative differences both follow the binomial distribution. The lowest count, whether it is the count of positive differences or the count of negative differences, is designated as W, the Test Statistic. This Test Statistic follows the binomial distribution because W represents the count of positive or negative outcomes of independent binary events that all have the same probability of a positive outcome.

As stated, the Null Hypothesis of this one-tailed, paired Sign Test is the following:

H0: p=0.5

If the training program was successful, there would be a reduction in the number of clerical errors. If the number of clerical errors were reduced, there would be a larger number of negative after-minus-before differences than positive differences.

The difference count indicates that there are 11 negative differences and 6 positive differences. These counts are distributed according to the binomial distribution that has a probability of a positive outcome, p, equal to 0.5 and a total number of trials, N, equal to 17. An Excel-generated graph of this binomial distribution is shown as follows:

This test evaluates whether a count of 11 negative differences out of 17 is significant at an alpha level of 0.05. The area under the PDF curve at and beyond 11 differences is equal to the probability of observing 11 or more negative differences by chance alone if the Null Hypothesis is true. This is the p Value for this test.

The Null Hypothesis would be rejected if the p Value calculated from this test is less than alpha, which is customarily set at 0.05.

Because p = 0.5, this binomial distribution is symmetric, and the curve area in the right tail at and beyond 11 differences is the same as the curve area in the left tail at and below 6 differences.

Test Statistic W in the Sign Test is always set to the lower count. The tail area at and below the lower count is equal to the tail area at and above the upper count. That area is the p Value for the one-tailed Sign Test.
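The equality of the two tail areas can be checked numerically. The sketch below uses only Python's standard library; the helper name `binom_pmf` is an assumption made for this illustration, and both tails are taken inclusively (6 or fewer, 11 or more).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k positive outcomes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 17, 0.5

# Left tail: 6 or fewer outcomes. Right tail: 11 or more outcomes.
left_tail = sum(binom_pmf(k, n, p) for k in range(0, 7))
right_tail = sum(binom_pmf(k, n, p) for k in range(11, n + 1))

print(round(left_tail, 4), round(right_tail, 4))  # 0.1662 0.1662
```

The two areas match only because p = 0.5; for any other p the binomial distribution is skewed and the tails would differ.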

That p Value is equal to the probability that the number of positive outcomes is less than or equal to W = 6 if the total number of non-zero differences = N = 17 and every binary trial has the same probability of a positive outcome = p = 0.5.

Given that variable x is binomially distributed, the CDF (Cumulative Distribution Function), which gives the probability that x ≤ X, is calculated in Excel as follows:

p Value = F(X;n,p) = BINOM.DIST(X, n, p, 1)

This calculates the probability that up to X number of positive outcomes will occur in n total binary trials if the probability of a positive outcome is p for every trial. “1” specifies that the Excel formula will calculate the CDF and not the PDF.
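For readers who want to see what the fourth argument does, here is a rough Python stand-in for the formula. The function name `binom_dist` is hypothetical and simply mirrors the argument order of the Excel function; it is a sketch, not a reimplementation of Excel's internals. A small toy case (4 trials) is used so the results are easy to verify by hand.

```python
from math import comb

def binom_dist(x, n, p, cumulative):
    """Rough stand-in for BINOM.DIST(x, n, p, cumulative)."""
    def pmf(k):
        return comb(n, k) * p**k * (1 - p)**(n - k)

    if cumulative:
        # CDF: probability of x or fewer positive outcomes
        return sum(pmf(k) for k in range(0, x + 1))
    # PDF: probability of exactly x positive outcomes
    return pmf(x)

# Toy case: 4 trials, p = 0.5
print(binom_dist(2, 4, 0.5, True))   # 0.6875 = (1 + 4 + 6) / 16
print(binom_dist(2, 4, 0.5, False))  # 0.375  = 6 / 16
```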

Therefore the following can be calculated:

p Value = Pr (No. of Positive Differences ≤ W | p = 0.5, N = Total No. of Non-Zero Differences)

p Value = BINOM.DIST(W, N, p, 1)

p Value = BINOM.DIST(6, 17, 0.5, 1) ≈ 0.166

This is shown in the following Excel-generated graph of the PDF of the binomial distribution for this Sign Test. The parameters of this binomial distribution are Total Trials = N = 17 and the Probability of a Positive Outcome of Each Trial = p = 0.5.

The p Value (approximately 0.166) is larger than alpha (set at 0.05). The Null Hypothesis is therefore not rejected at this alpha level. In other words, there is not enough evidence to conclude that the training program reduced clerical errors.
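The whole procedure, from the two counts to the decision, can be summarized in a few lines of Python. This is a minimal sketch under the article's convention that W is the lower count and p = 0.5; the function name `sign_test_one_tailed` is an assumption, and the sketch presumes the observed direction matches the direction of the Alternative Hypothesis.

```python
from math import comb

def sign_test_one_tailed(n_pos, n_neg, alpha=0.05):
    """Return (W, p Value, reject?) for a one-tailed Sign Test with p = 0.5."""
    n = n_pos + n_neg      # total number of non-zero differences
    w = min(n_pos, n_neg)  # Test Statistic W is the lower count
    # Binomial CDF at W with p = 0.5: sum of C(n, k) for k <= W, over 2^n
    p_value = sum(comb(n, k) for k in range(w + 1)) / 2**n
    return w, p_value, p_value < alpha

w, p_value, reject = sign_test_one_tailed(n_pos=6, n_neg=11)
print(w, round(p_value, 4), reject)  # 6 0.1662 False
```

With 6 positive and 11 negative differences the function reports that the Null Hypothesis is not rejected at alpha = 0.05.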

This example demonstrates how much less powerful the Sign Test is than the paired t-Test or the Wilcoxon Signed-Rank Test. At the same alpha level, the other two tests detected a difference, while the Sign Test did not come close to doing so.
