Using Dummy Independent Variable Regression in Excel in 7 Steps To Perform Basic Conjoint Analysis
Overview of Dummy Independent Variable Regression
Dummy independent variable regression is a technique that allows linear regression to be performed when one or more of the input independent variables are categorical. Categorical variables cannot act as the input independent variables in a linear regression analysis in their current form as nominal variables. Nominal variables are simply categorical labels that provide no indication of relative value or importance.
The categorical variables can be used as inputs to a linear regression analysis if each categorical variable is converted to dummy variables that are binary, i.e., can only take the value of either 1 or 0. The number of binary variables created for each choice category will equal the number of choices available for that category.
One dummy variable from each choice category must be discarded as an input for the linear regression analysis. The values of independent variables of a regression should not be predictable based upon the values of other independent variables. An error called multicollinearity occurs if the values of any independent variables can be predicted from the values of any other independent variables. If every dummy variable of a category were kept, exactly one of them would equal 1 in each data record, so the value of any one of them could be calculated from the others.
If one level of each attribute is removed, it is no longer possible to predict the values of the remaining dummy variables of that attribute from one another. It does not matter which dummy variable from each choice category is removed. Removing one level of each attribute does not affect the accuracy of the regression analysis, as will be demonstrated at the end of this article.
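The redundancy described above can be checked directly. The following is a minimal Python sketch, assuming the three brand levels (A, B, C) that appear later in this article. The function name is illustrative and is not part of any Excel tool.

```python
# Sketch: why keeping every dummy variable of a category is redundant.
# BRANDS and full_dummies are illustrative names, not from the article.
BRANDS = ["A", "B", "C"]

def full_dummies(brand):
    """Return one 0/1 dummy per brand level, with no level dropped."""
    return [1 if brand == level else 0 for level in BRANDS]

for brand in BRANDS:
    d_a, d_b, d_c = full_dummies(brand)
    # Exactly one dummy is 1 in every record, so any single dummy
    # can be computed from the other two.
    predicted_a = 1 - d_b - d_c
    print(brand, [d_a, d_b, d_c], "Brand A dummy predicted from B and C:", predicted_a)
```

Because the Brand A dummy is always recoverable from the other two, keeping all three would give the regression perfectly predictable inputs. Dropping any one of them removes that dependency.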
The independent variables in a linear regression analysis can be both binary dummy variables and continuous variables. The number of choices for each category should be relatively few, or the regression analysis will quickly become unmanageably large because of the number of dummy variables that would be needed.
Dummy Dependent Variables
Linear regression can be performed if the independent variables are categorical by applying the dummy variable conversion described in this article. Linear regression cannot be performed if the dependent (Y) variable is categorical.
The simplest case of a categorical dependent variable is a binary dependent variable. An example might be an attempt to use independent variables to predict the outcome of a binary event, such as a potential customer making a purchase or not. The technique to be applied in this circumstance is called Binary Logistic Regression. Here is a link to a series of articles in this blog which explain how this technique can be performed in Excel:
http://blog.excelmasterseries.com/2014/06/logisticregressionoverview.html
Overview of Conjoint Marketing Analysis
Conjoint analysis is a statistical technique employed by market researchers to create an equation that can be used to predict the degree of preference that people have for different combinations of product attributes. Conjoint analysis also enables market researchers to determine the relative level of importance that consumers place on attribute choice categories and on the individual choices available in each category.
A product can be described by the attribute choices available to the consumer. At its most basic level, conjoint analysis requires that a test subject assign a preference rating to every possible combination of attribute choices available for a product. The preference rating scale goes from 1 (lowest preference) to 10 (highest preference).
The information obtained from this consumer test can be directly analyzed with linear regression if the categorical choices are converted to binary dummy variables. The resulting binary dummy variables can be used as part of the set of input independent variables.
The output of this linear regression analysis is a regression equation that can be used to predict the test respondent’s preference rating for any combination of attribute choices. The coefficients of the regression equation indicate the relative degree of importance that the test respondent places on each of the attribute choices.
The following describes the 7-step process of using dummy independent variable regression to perform a very basic conjoint analysis:
Step 1 – List All Attributes
List all of the available choices that a consumer has for one product. Start by listing all of the overall attribute categories. In this case the attribute categories are brand, color, and price. Then list all of the available choices within each attribute category as follows:
Brand: Brand A, Brand B, Brand C
Color: Red, Blue
Price: $50, $100, $150
Step 2 – List All Possible Combinations of Attributes
Every possible combination of attributes should be listed. In actual conjoint analysis, each unique combination of attributes is placed on a separate card.
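Listing every combination can be sketched in a few lines of Python. The attribute levels below are taken from the regression equation that appears later in this article. The variable and function names are illustrative assumptions.

```python
from itertools import product

# Attribute levels as they appear in the article's regression equation.
ATTRIBUTES = {
    "brand": ["A", "B", "C"],
    "color": ["Red", "Blue"],
    "price": ["$50", "$100", "$150"],
}

def all_profiles(attributes):
    """Return every combination of one level per attribute as a dict."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]

profiles = all_profiles(ATTRIBUTES)
# 3 brands x 2 colors x 3 prices = 18 cards for the test subject to rate.
print(len(profiles))
for profile in profiles[:3]:
    print(profile)
```

The card count is the product of the number of levels per attribute, which is why adding levels or attributes quickly makes a full rating exercise impractical.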
Step 3 – Rate All Combinations
The test subject will then rate each combination on a scale of preference from 1 to 10, with 10 being the most desirable. Placing each unique combination on a separate card facilitates the rating process.
Step 4 – Create Dummy Variables
In this step the categorical variables are converted to binary variables that can now serve as inputs to a linear regression analysis. Each level of each attribute will have its own binary dummy variable as shown below. The number of binary dummy variables for each attribute category will equal the number of choices available for that category. For example, there are three choices of brands, with each choice being assigned to a single binary dummy variable.
One dummy variable from each attribute category should be removed from the analysis. The values of independent variables of a regression should not be predictable based upon the values of other independent variables. An error called multicollinearity occurs if the values of any independent variables can be predicted from the values of any other independent variables.
If one level of each attribute is removed, it is no longer possible to predict the values of the remaining dummy variables of that attribute from one another. It does not matter which dummy variable from each choice category is removed. Removing one level of each attribute does not affect the accuracy of the regression analysis, as will be demonstrated at the end of this article.
The following is the listing of binary dummy variables for each of the attribute choice categories.
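The conversion in this step can be sketched in Python as follows. This is a minimal illustration, assuming the attribute levels from the article's regression equation and assuming that the first level of each attribute (Brand A, Red, $50) is the one dropped, which matches the levels missing from that equation. The function and key names are hypothetical.

```python
# Attribute levels from the article; the first level of each is the dropped baseline.
ATTRIBUTES = {
    "brand": ["A", "B", "C"],
    "color": ["Red", "Blue"],
    "price": ["$50", "$100", "$150"],
}

def dummy_code(profile, attributes, drop_first=True):
    """Convert one card (profile) into 0/1 dummy variables.

    With drop_first=True the first level of each attribute is omitted,
    leaving 2 + 1 + 2 = 5 dummy variables instead of 3 + 2 + 3 = 8.
    """
    row = {}
    for name, levels in attributes.items():
        kept = levels[1:] if drop_first else levels
        for level in kept:
            row[f"{name}={level}"] = 1 if profile[name] == level else 0
    return row

card = {"brand": "C", "color": "Blue", "price": "$50"}
print(dummy_code(card, ATTRIBUTES))
# {'brand=C': 1, 'color=Blue': 1, and every other dummy 0}
```

A card that uses the baseline level of an attribute, such as the $50 price here, simply has 0 in every remaining dummy variable of that attribute.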
Step 5 – Arrange Data For Regression Analysis
The remaining dummy variables are input into the regression analysis as the independent variables while the preference rating is input as the dependent variable. Each record of data includes the binary dummy variables and preference rating from one of the cards. The data is arranged as follows:
Step 6 – Perform Regression in Excel
The Excel Regression dialogue box is then completed as follows:
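For readers who want to see what the regression tool is doing with this layout, here is a rough Python sketch of an ordinary least-squares fit on the same design: an intercept plus the five retained dummy variables. The ratings used here are SYNTHETIC. They are generated from the regression equation published in Step 7, not taken from the test subject's actual cards, so the sketch only shows that the fit recovers known coefficients. All function names are illustrative.

```python
from itertools import product

BRANDS, COLORS, PRICES = ["A", "B", "C"], ["Red", "Blue"], [50, 100, 150]

def design_row(brand, color, price):
    """Intercept plus the five retained dummies (Brand A, Red, $50 dropped)."""
    return [1.0,
            float(brand == "B"), float(brand == "C"),
            float(color == "Blue"),
            float(price == 100), float(price == 150)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)

# SYNTHETIC, noise-free ratings built from the article's published coefficients.
PUBLISHED = [5.61, 1.67, 3.5, 1.33, -2.17, -4.17]
X = [design_row(b, c, p) for b, c, p in product(BRANDS, COLORS, PRICES)]
y = [sum(coef * value for coef, value in zip(PUBLISHED, row)) for row in X]

coefs = ols(X, y)
print([round(c, 2) for c in coefs])
```

With one dummy dropped per attribute the system has a unique solution. If all eight dummy variables were kept alongside the intercept, the normal equations would be singular and this solver would fail with a division by zero.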
Step 7 – Analyze Regression Output
The Excel regression output appears as follows:
The most important parts of the output are highlighted in the output and described as follows:

The regression equation is calculated to be the following:
Preference Rating = 5.61 + 1.67*(Brand B) + 3.5*(Brand C) + 1.33*(Blue) – 2.17*($100) – 4.17*($150)
The value of each of the dummy variables is either 1 or 0 from the input data for each data record.
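The equation above can be applied to any card with a short Python sketch. The coefficients are copied from the equation; the function name and the way a card is represented (a set of level labels) are assumptions made for illustration.

```python
# Coefficients copied from the regression equation above.
INTERCEPT = 5.61
COEFFICIENTS = {
    "Brand B": 1.67,
    "Brand C": 3.5,
    "Blue": 1.33,
    "$100": -2.17,
    "$150": -4.17,
}

def predict_rating(levels):
    """Predict the preference rating for a set of chosen attribute levels.

    Levels not in COEFFICIENTS (the dropped baselines Brand A, Red, $50)
    have a dummy value of 0 everywhere and so contribute nothing.
    """
    return INTERCEPT + sum(COEFFICIENTS.get(level, 0.0) for level in levels)

print(round(predict_rating({"Brand C", "Blue", "$50"}), 2))   # 10.44
print(round(predict_rating({"Brand A", "Red", "$150"}), 2))   # 1.44
```

Note that the first prediction slightly exceeds the top of the 1-to-10 scale. A linear regression does not constrain its predictions to the range of the rating scale.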

The relatively high R Square, 0.87, indicates that the regression equation is a good predictor of Preference Rating. Approximately 87 percent of the variance of the Preference Rating is explained by the input variables.

The low Significance of F (which is a p Value) indicates that the overall regression equation is statistically significant.

The low p Values for the Intercept and for each of the coefficients indicate that each one is statistically significant.
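The coefficients in the regression equation above can also be turned into a rough measure of how much each attribute category matters to the test respondent. The sketch below uses a common convention in conjoint analysis: the importance of an attribute is the range of its coefficients divided by the sum of the ranges of all attributes. This convention is an assumption added here, not a calculation performed in this article, and the names are illustrative.

```python
# Coefficients from the regression equation; dropped baseline levels are 0.
PART_WORTHS = {
    "Brand": {"A": 0.0, "B": 1.67, "C": 3.5},
    "Color": {"Red": 0.0, "Blue": 1.33},
    "Price": {"$50": 0.0, "$100": -2.17, "$150": -4.17},
}

def attribute_importance(part_worths):
    """Return each attribute's share of the total coefficient range."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attribute: rng / total for attribute, rng in ranges.items()}

for attribute, share in attribute_importance(PART_WORTHS).items():
    print(f"{attribute}: {share:.1%}")
```

Under this convention, price accounts for roughly 46 percent of the total range, brand roughly 39 percent, and color roughly 15 percent for this respondent.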
Confirming the Validity of the Dummy Variable Regression Analysis Step
Plugging the values of the input independent variables for each data record into the regression equation creates the following comparison between the actual Preference Ratings given by the test subject and the Predicted Preference Ratings. The dummy variable regression analysis is seen to be relatively accurate. The removal of one dummy variable for each attribute choice category did not adversely affect the accuracy of the analysis.
The effect of removing a single dummy variable for each attribute choice category was simply to assign the value of 0 to the coefficient of that dummy variable in the overall regression equation. The other coefficients have values relative to that value of 0.
The regression equation is shown by the Excel regression output to be the following:
Preference Rating = 5.61 + 1.67*(Brand B) + 3.5*(Brand C) + 1.33*(Blue) – 2.17*($100) – 4.17*($150)
If the dummy variables that were removed from the analysis were added back to the regression equation, the resulting equation would be the following:
Preference Rating = 5.61 + 0*(Brand A) + 1.67*(Brand B) + 3.5*(Brand C) + 0*(Red) + 1.33*(Blue) + 0*($50) – 2.17*($100) – 4.17*($150)
Both of the above regression equations would produce the same calculation of predicted Preference Rating.
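The equivalence of the two equations can be confirmed over all 18 combinations with a short Python check. This is a sketch only; the coefficients come from the two equations above and the names are illustrative.

```python
from itertools import product

INTERCEPT = 5.61
# Equation with the baseline dummy variables removed.
REDUCED = {"Brand B": 1.67, "Brand C": 3.5, "Blue": 1.33, "$100": -2.17, "$150": -4.17}
# Equation with the removed dummy variables added back at a coefficient of 0.
FULL = {"Brand A": 0.0, "Red": 0.0, "$50": 0.0, **REDUCED}

def rating(levels, coefficients):
    """Intercept plus the coefficient of each level present on the card."""
    return INTERCEPT + sum(coefficients.get(level, 0.0) for level in levels)

profiles = list(product(["Brand A", "Brand B", "Brand C"],
                        ["Red", "Blue"],
                        ["$50", "$100", "$150"]))

# Largest disagreement between the two equations across every card.
max_difference = max(abs(rating(p, REDUCED) - rating(p, FULL)) for p in profiles)
print(len(profiles), max_difference)
```

The difference is zero for every card because the added terms are multiplied by a coefficient of 0.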
The following image calculates the difference between the test respondent’s actual preference ratings for each combination and the preference ratings predicted by the regression equation.