Tuesday, May 27, 2014

Basics of Multiple Regression in Excel 2010 and Excel 2013

This is one of the following seven articles on Multiple Linear Regression in Excel

Basics of Multiple Regression in Excel 2010 and Excel 2013

Complete Multiple Linear Regression Example in 6 Steps in Excel 2010 and Excel 2013

Multiple Linear Regression’s Required Residual Assumptions

Normality Testing of Residuals in Excel 2010 and Excel 2013

Evaluating the Excel Output of Multiple Regression

Estimating the Prediction Interval of Multiple Regression in Excel

Regression - How To Do Conjoint Analysis Using Dummy Variable Regression in Excel

 

Overview of Multiple Variable Linear Regression in Excel

Linear regression is a statistical technique used to model the relationship between one or more independent, explanatory variables and a single dependent variable. The linear regression type is classified as Simple Linear Regression if there is only a single explanatory variable. The regression type is classified as Multiple Linear Regression if there is more than one explanatory variable.

 

The Regression Equation

The end result of linear regression is a linear equation that models actual data as closely as possible. This equation is called the Regression Equation. The more linear the relationship is between each of the explanatory variables and the single dependent variable, the more closely the Regression Equation will model the actual data.

In the Regression Equation, the variable Y is usually designated as the single dependent variable. The independent explanatory variables are usually labeled X1, X2, …, Xk.

The Regression Equation for multiple regression appears as follows:

Y = b0 + b1X1 + b2X2 + … + bkXk

The Regression Equation for simple regression appears as follows:

Y = b0 + b1X

b0 is the Y-intercept of the Regression Equation.

b1, b2, …, bk are the coefficients of the independent variables.

The most important part of regression analysis is the calculation of b0, b1, b2, …, bk in order to be able to construct the Regression Equation

Y = b0 + b1X for simple regression

or

Y = b0 + b1X1 + b2X2 + … + bkXk for multiple regression.
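In Excel 2010 and Excel 2013, these coefficients do not have to be calculated by hand. As a brief illustration (the cell addresses below are placeholders only), assume the Y values are in cells A2:A11 and a single X variable is in cells B2:B11. The simple regression coefficients can then be calculated with built-in worksheet functions as follows:

=INTERCEPT(A2:A11,B2:B11) returns b0

=SLOPE(A2:A11,B2:B11) returns b1

If the data instead contained three X variables in cells B2:D11, the formula =LINEST(A2:A11,B2:D11), entered as an array formula (Ctrl+Shift+Enter in Excel 2010 and Excel 2013) across a single row of four cells, would return the coefficients in reverse order: b3, b2, b1, and then the Y-intercept b0.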

 

Purposes of Linear Regression

Linear regression, whether simple or multiple, generally has two main uses. They are as follows:

1) To quantify the linear relationship between the dependent variable and the independent variable(s) by calculating a regression equation.

2) To quantify how much of the movement or variation of the dependent variable is explained by the independent variable(s).

 

The Inputs For Linear Regression

The input data for linear regression analysis consists of a number of data records each having a single Y (dependent variable) value and one or more X (explanatory independent variable) values. Simple regression has only a single X variable for each Y value. Multiple regression has more than one X (independent) variable for each Y (dependent) variable.

Each data record occupies its own unique row in the regression input. Each data record contains the specific values of the input (independent) X variables that are associated with a specific value of the dependent Y variable shown in that data record.

The input data for multiple regression analysis appear as separate data records on each row as follows:

Y     X1    X2    …     Xk
4     6     10    …     15
5     7     11    …     16
6     8     12    …     17
7     9     13    …     18
8     10    14    …     19

Multiple linear regression has more than one X (independent) variable. These independent variables (X’s) are known as the explanatory, predictor, or regressor variables. The single dependent variable (Y) is known as the target or outcome variable.

Multiple linear regression requires that both the dependent variable and the independent variables be continuous. If ordinal data such as a Likert scale is used as a dependent or independent variable, it must be treated as a continuous variable that has equal distance between values. Ordinal data is normally defined as data whose order matters but whose differences between values do not.
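As an illustration only, if the data records shown above were typed into a worksheet with the Y values in cells A2:A6 and three X variables in cells B2:D6 (the number of X columns and the cell addresses are merely assumptions for this sketch), Excel's regression tool would use A2:A6 as the Input Y Range and B2:D6 as the Input X Range. The same layout can also be handed directly to the LINEST worksheet function:

=LINEST(A2:A6,B2:D6,TRUE,TRUE)

Entered as an array formula across five rows and four columns, LINEST returns the coefficients in its first row and additional statistics such as the standard errors, R Square, and the F statistic in the rows beneath. The Y range and the X range must contain the same number of rows because each row is one data record.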

 

Null and Alternative Hypotheses

The Null Hypothesis of linear regression states that the coefficient(s) of the independent variable(s) in the regression equation equal(s) zero. The Alternative Hypothesis for linear regression therefore states that at least one of these coefficients does not equal zero.

For multiple linear regression this Null Hypothesis is expressed as follows:

H0: b1 = b2 = … = bk = 0

For simple linear regression this Null Hypothesis is expressed as follows:

H0: b1 = 0

b1 is the slope of the regression line for simple regression.

The Alternative Hypothesis, H1, for linear regression states that at least one of these coefficients does not equal zero.

The Y Intercept b0 is not included in the Null Hypothesis.
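In Excel's regression output, this Null Hypothesis is evaluated by an overall F test (reported as Significance F) and by a t test with a p-value for each individual coefficient. As a rough sketch — F_statistic, k, and n below are placeholders for the calculated F statistic, the number of independent variables, and the number of data records — the overall p-value can also be computed directly in a worksheet cell:

=F.DIST.RT(F_statistic, k, n-k-1)

The Null Hypothesis is rejected when this p-value is smaller than the chosen level of significance, such as 0.05.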

 

X and Y Variables Must Have a Linear Relationship

Linear regression is a technique that provides accurate information only if a linear relationship exists between the dependent variable and each of the independent variables. Independent variables that do not have a linear relationship with the dependent variable should not be included as inputs. An X-Y scatter plot of each independent variable against the dependent variable provides a good indication of whether the relationship is linear.
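A quick numeric supplement to the scatter plot — assuming, purely for illustration, that the dependent variable is in cells A2:A11 and one independent variable is in cells B2:B11 — is the correlation between the two columns:

=CORREL(A2:A11,B2:B11)

=RSQ(A2:A11,B2:B11)

A correlation near +1 or -1 indicates a strong linear relationship, while a correlation near 0 indicates a weak one. RSQ returns the square of that correlation for a single X variable.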

When data are nonlinear, there are often two solutions available that allow regression analysis to be performed. They are the following:

1) Transform the nonlinear data to linear using a logarithmic transformation. This will not be discussed in this section.

2) Perform nonlinear regression on the data. One way to do that is to apply curve-fitting software that will calculate the mathematical equation that most closely models the data. Another section in this book will focus on using the Excel Solver to fit a curve to nonlinear data. The least-squares method is the simplest way to do this and will be employed in this section.

 

Do Not Extrapolate Regression Beyond Existing Data

The major purpose of linear regression is to create a Regression Equation that accurately predicts a Y value based on a new set of independent, explanatory X values. The new set of X values should not contain any X values that are outside of the range of the X values used to create the original regression equation. The following simple example illustrates why a Regression Equation should not be extrapolated beyond the original X values.
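Once the regression has been run, a predicted Y value for a new set of X values can be obtained with the TREND worksheet function. As a sketch only — the cell addresses are assumptions — with the original Y values in A2:A11, the original X values in B2:D11, and the new X values in B15:D15, the prediction would be:

=TREND(A2:A11,B2:D11,B15:D15)

The new X values in B15:D15 should lie inside the range of the original X values in B2:D11, for the reason illustrated by the following example.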

 

Example of Why Regression Should Not Be Extrapolated

Imagine that the height of a boy was measured every month from when the boy was one year old until the boy was eighteen years old. The independent, explanatory X variable would be the month number (12 months to 216 months) and the dependent Y variable would be the height measured in inches. Typically, most boys stop growing in height when they reach their upper teens.

If the Regression Equation was created from the above data and then extrapolated to predict the boy’s height when he reached 50 years of age, the Regression Equation might predict that the boy would be fifteen feet tall.

 

Linear Regression Should Not Be Done By Hand

Excel provides an excellent data analysis regression tool that can perform simple or multiple regression with equal ease. Doing the calculations by hand would be very tedious and would provide many opportunities to make a mistake. Excel produces a very detailed output when the regression tool is run. I have recreated all of the simple regression calculations that Excel performs in this chapter. It will probably be clear from viewing this that it is wise to let Excel do the regression calculations. A number of statistics textbooks probably place too much emphasis on teaching the ability to perform the regression calculations by hand. In the real world regression analysis would never be done manually.
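For reference, the regression tool referred to above is found in Excel 2010 and Excel 2013 at Data > Data Analysis > Regression. If Data Analysis does not appear on the Data tab, the Analysis ToolPak add-in must first be enabled as follows:

1) Click File > Options > Add-Ins.

2) In the Manage box, select Excel Add-ins and click Go.

3) Check Analysis ToolPak and click OK.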

The best way to understand multiple-variable linear regression is to perform an example in the following blog article.

 


 
