This is one of the following seven articles on Logistic Regression in Excel
Logistic Regression in 7 Steps in Excel 2010 and Excel 2013
R Square For Logistic Regression Overview
Excel R Square Tests: Nagelkerke, Cox and Snell, and Log-Linear Ratio in Excel 2010 and Excel 2013
Likelihood Ratio in Excel
Is More Effective Than
Wald Statistic To
Determine if the Variable
Coefficients Are Significant
An essential part of linear regression analysis is the determination of whether the calculated coefficients of input variables are statistically significant. A variable coefficient is considered to be statistically significant if the probability that it has a nonzero value is less than the specified level of significance. This probability is shown in the Excel linear regression output as the P-value next to the coefficient. The normal level of significance is α = 0.05. A P-value of less than 0.05 indicates that the variable coefficient is statistically significant if α = 0.05.
The significance of the variable coefficients b1, b2 , …, bk in logistic regression is calculated by different methods.
The Wald Statistic
Until recently the most common metric used to evaluate the significance of a variable coefficient in binary logistic regression was the Wald Statistic. The Wald Statistic for each coefficient is calculated by the following formula:
Wald Statistic = (Coefficient)2 / (Standard Error of Coefficient)2
The Standard errors of the coefficients are equal to the square roots of the diagonals of the covariance matrix of the coefficients. This requires a bit of matrix work to compute. The Wald Statistic is approximately distributed according to the Chi-Square distribution with one degree of freedom. The p Value for the Wald Statistic is calculated as follows:
p Value = CHIDIST(Wald Statistic,1)
In Excel 2010 and later, this formula can be replaced by the following:
p Value = CHISQ.DIST.RT(Wald Statistic, 1)
The coefficient is considered statistically significant is less than the specified level of significance, which is commonly set at 0.05.
The 95 percent confidence interval for the coefficient is calculated in Excel as follows:
95 percent C.I. = Coefficient ± S.E.* NORM.S.INV(1 – α/2)
Problems With the Wald Statistic
The Wald Statistic is currently the main logistic regression metric of variable coefficient significance calculated by well-known statistical packages such as SAS and SPSS. The reliability of the Wald Statistic is, however, considered questionable. If the case that a large coefficient is produced, the standard error of the coefficient can be inflated. This will result in an undersized Wald Statistic, which could lead to a conclusion that a significant coefficient was not significant. This is a false negative, which is a Type 2 Error. A false positive is a Type 1 Error. An additional reliability issue occurs with the Wald Statistic when sample size is small. The Wald Statistic is often biased for small sample sizes.
Due to the reliability issues associated with the Wald Statistic, the preferred method to evaluate the significance of logistic regression variable coefficients is the Likelihood Ratio calculated for each coefficient.
The Likelihood Ratio
The Likelihood Ratio is a statistical test that compares the likelihood of obtaining the data using a full model with the likelihood of obtaining the same data with a model that is missing the coefficient being evaluated. The Likelihood Ratio for logistic regression is a Chi-Square test that compares the goodness of fit of two models when one of the models is a subset of the other.
The general formula for the Likelihood Ratio is as follows:
Likelihood RatioReduced_Model = -2*MLLReduced_Model + 2*MLLFull_Model
MLLFull_Model is equal to the MLLm that was initially calculated to determine the values of all coefficients b1, b2, …, bk that create the most accurate P(X). This has already been calculated to equal the following:
MLLm = -6.6545
MLLReduced_Model is simply the calculation of MLL with the coefficient being evaluated set to zero.
For example, MLLb1=0 would be the MLL calculated with coefficient b1 set to zero.
MLLb2=0 would be the MLL calculated with coefficient b\2 set to zero.
The Likelihood Ratio is approximately distributed according to the Chi-Square distribution with the degrees of freedom equal to number of coefficients that have been set to zero. This will be one.
The p value of the Likelihood Ratio determines whether the removal of the coefficient made a real difference. If the p value is lower than the specified level of significance (usually 0.05) the coefficient is considered significant.
The p value of the Likelihood Ratio is calculated with the following Excel formula:
p Value = CHISQ.DIST.RT(MLLReduced_Model,1)
Calculating the Likelihood Ratio to Determine Whether Coefficient b1 Is Significant With Excel Solver
The Solver will be used to calculate MLLb1=0. The p Value of MLLb1=0 (CHISQ.DIST.RT(MLLb1=0,1) will determine whether coefficient b1 is significant. Setting the value of coefficient b1 to zero before calculating MLLb1=0 with the Solver is done as follows:
(Click On Image To See a Larger Version)
The Solver dialogue is configured as follows to calculate MLLb1=0. Note that b1 (cell C3) is no longer a Solver Decision Variable.
(Click On Image To See a Larger Version)
Running the Solver produces the following calculation of MLLb1=0.
(Click On Image To See a Larger Version)
MLLm = MLLFull_Model = -6.6546
MLLb1=0 = MLLReduced_Model = -10.9104
Likelihood RatioReduced_Model = -2*MLLReduced_Model + 2*MLLFull_Model
Likelihood Ratio b1 = LR b1 = -2*MLLb1=0 + 2*MLLm
LR b1 = 8.5117
This statistic is distributed according to the Chi-Square distribution with its degrees of freedom equal to the difference between the number of variables in the full model and the reduced model. In this case that difference is one variable so df = 1.
p Value = CHIDIST(LR b1,1) = CHIDIST(8.5117,1) = 0.0035
The very low p Value indicates that LR b1 is statistically significant. Variable b1 is therefore significant.
Calculating the Likelihood Ratio to Determine Whether Coefficient b2 Is Significant With Excel Solver
The Solver will be used to calculate MLLb2=0. The p Value of MLLb2=0 (CHISQ.DIST.RT(MLLb2=0,1) will determine whether coefficient b2 is significant. Setting the value of coefficient b\2 to zero before calculating MLLb2=0 with the Solver is done as follows:
(Click On Image To See a Larger Version)
The Solver dialogue is configured as follows to calculate MLLb2=0. Note that b2 cell (C4) is no longer a Solver Decision Variable.
(Click On Image To See a Larger Version)
Running the Solver produces the following calculation of MLLb2=0.
(Click On Image To See a Larger Version)
MLLm = MLLFull_Model = -6.6546
MLLb2=0 = MLLReduced_Model = -9.5059
Likelihood RatioReduced_Model = -2*MLLReduced_Model + 2*MLLFull_Model
Likelihood Ratio b1 = LR b2 = -2*MLLb2=0 + 2*MLLm
LR b2 = 5.7025
This statistic is distributed approximately according to the Chi-Square distribution with its degrees of freedom equal to the difference between the number of variables in the full model and the reduced model. In this case that difference is one variable so df = 1.
p Value = CHISQ.DIST.RT(LR b2,1) = CHISQ.DIST.RT(5.7025,1) = 0.0169
The very low p Value indicates that LR b2 is statistically significant. Variable b2 is therefore significant.
http://excelmasterseries.com/ClickBank/Thank_You_New_Manual_Order/Thank-You-For-Your-Order-Optimization.php
Need help with statistics? Assignmentwriting.services experts are here to provide you with professional assistance. We can do your homework, solve your class assignments and provide you with the most relevant answers.
ReplyDelete