Sunday, June 1, 2014

R Square For Logistic Regression Overview

This is one of the following seven articles on Logistic Regression in Excel

Logistic Regression Overview

Logistic Regression in 7 Steps in Excel 2010 and Excel 2013

R Square For Logistic Regression Overview

Excel R Square Tests: Nagelkerke, Cox and Snell, and Log-Linear Ratio in Excel 2010 and Excel 2013

Likelihood Ratio Is Better Than Wald Statistic To Determine if the Variable Coefficients Are Significant For Excel 2010 and Excel 2013

Excel Classification Table: Logistic Regression’s Percentage Correct of Predicted Results in Excel 2010 and Excel 2013

Hosmer- Lemeshow Test in Excel – Logistic Regression Goodness-of-Fit Test in Excel 2010 and Excel 2013

 

R Square For Logistic

Regression Overview

A reliable goodness-of-fit calculation is essential for any model. The measures of goodness-of-fit for linear regression are R Square and the related Adjusted R Square. These metrics calculated the percentage of total variance can be explained by the combined variance of the input variables since variances can added.

R Square is calculated for binary logistic regression in a different way. R Square in this case is based upon the difference in predictive ability of the logistic regression equation with and without the independent variables. This is sometimes referred to as pseudo R Square.

 

R Square For Logistic Regression

With Excel Solver Overview

 

Step 1) Calculate the Maximum Log-Likelihood for Full Model

The Maximum Log-Likelihood Function, MLL, is calculated for the full model. This has already been done by the Excel Solver in order to determine the constants b0, b1, b2, …, bk that create the most accurate P(X) equation. MLL for the full model is designated as MLLm. This has already been calculated to be the following:

MLLm = Maximum Log-Likelihood for Full Model

MLLm = -6.6545

 

Step 2) Calculate the Maximum Log-Likelihood for the Model With No Explanatory Variables

Calculating the Maximum Lob-Likelihood Function for the model with no explanatory variables is done by setting all constants (Solver Decision Variables) except b0 to zero before calculating the MLL.

The Maximum Log-Likelihood for the model with no explanatory variables (b1 = b2 = … = bk = 0) designated as MLL0.

The constant b0 is the Y Intercept of regression equation. This is the only constant that will be included in the calculation of MLL0. The other constants, b1, b2, …, bk, are the coefficients of the input variables X1, X2, … , Xk. Setting the constants b1, b2, …, bk to zero removes all explanatory variables X1, X2, … , Xk. The terms b1*X1, b2*X2, …, bk*Xk will now all equal to zero in the Logit (and therefore the logistic equation P(X)) no matter what the values of the input variables X1, X2, … , Xk are.

Constants b1 and b2 are set to zero as follows before running the Excel Solver to calculate MLL0:

logistic regression, regression, r square, excel, excel 2010, excel 2013, statistics
(Click On Image To See a Larger Version)

Below is the Solver dialogue box to calculate MLL0. Note that there is only one Solver Decision Variable (b0 in cell C2) that will be adjusted to find MLL0.

logistic regression, regression, r square, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

Running the Solver produced the following MLL0:

logistic regression, regression, r square, excel, excel 2010, excel 2013, statistics (Click On Image To See a Larger Version)

MLL0 = Maximum Log-Likelihood for Model With Only Intercept and No Explanatory Variables (b1 = b2 = … = bk = 0)

MLL0 = MLLb1=b2= ... =bk=0 = -13.8629

Calculating MML for the full model produced the following:

MLLm = Maximum Log-Likelihood for Full Model

MLLm = -6.6545

The three different ways to calculate R Square for logistic regression as performed in Excel in the following blog article. These three methods are Nagelkerke, Cox and Snell, and the Log-Linear Ratio.

 

Excel Master Series Blog Directory

Statistical Topics and Articles In Each Topic

 

No comments:

Post a Comment