## Tuesday, June 3, 2014

### Standard Normal Distribution in Excel 2010 and Excel 2013

This is one of the following eight articles on the normal distribution in Excel

Overview of the Normal Distribution

Normal Distribution’s PDF (Probability Density Function) in Excel 2010 and Excel 2013

Normal Distribution’s CDF (Cumulative Distribution Function) in Excel 2010 and Excel 2013

Solving Normal Distribution Problems in Excel 2010 and Excel 2013

Overview of the Standard Normal Distribution in Excel 2010 and Excel 2013

An Important Difference Between the t and Normal Distribution Graphs

The Empirical Rule and Chebyshev’s Theorem in Excel – Calculating How Much Data Is a Certain Distance From the Mean

Demonstrating the Central Limit Theorem In Excel 2010 and Excel 2013 In An Easy-To-Understand Way

# Overview of the Standard Normal Distribution

The normal distribution is actually a family of distributions. Each unique normal distribution can be fully described by its two parameters, which are the following:

1) Its population mean, µ, which is a location parameter.

2) Its population standard deviation, σ, which is a scale parameter.

The most basic normal distribution is called the Standard Normal Distribution. Its population mean, µ, equals 0 and its population standard deviation, σ, equals 1.

The PDF curve of the Standard Normal Distribution appears as follows in this Excel-generated graph: (Click On Image To See a Larger Version)

The CDF curve of the Standard Normal Distribution appears as follows in this Excel-generated graph: (Click On Image To See a Larger Version)

When the population mean, µ, equals 0 and the population standard deviation, σ, equals 1, the x values equal the number of standard deviations that each x is from the mean. In this situation, each x value equals its z score. A data point’s z score equals the number of population standard deviations that the point is from the population mean. The formula to calculate an x value’s z score is the following: (Click On Image To See a Larger Version)

Every unique normal distribution becomes to the Standard Normal Distribution when the x values are converted to z scores. The Standard Normal Distribution (with zero population mean and unit population standard deviation) is sometimes referred to as the standard Gaussian distribution or the unit normal distribution is denoted by the Greek letter “phi” as follows:

φ(x) (small “phi”) denotes the PDF of the standard normal distribution at x, which also equals z score(x) because σ = 1.

Φ(x) (capital “phi”) denotes the CDF of the standard normal distribution at x, which also equals z score(x) because σ = 1.

When µ = 0 and σ = 1, an X value is equal to its z score. If µ = 0 and σ = 1, then the Excel formulas NORM.DIST(X,µ,σ, TRUE or FALSE) can be replaced by the simpler Excel formula NORM.S.DIST(z, TRUE or FALSE)

## Calculating the PDF of a Standardized Normal Distribution in Excel

X = 2

µ = 0

σ = 1

f(X = 2, µ = 0, σ = 1) = NORM.DIST(2,0,1,FALSE) = 0.05399

f(X = 2, µ = 0, σ = 1) = NORM.S.DIST(z score(x),FALSE) (Click On Image To See a Larger Version)

z score(x) = (2 – 0)/1 = 2

φ(2) = f(X = 2, µ = 0, σ = 1)

f(X = 2, µ = 0, σ = 1) = NORM.S.DIST(2,FALSE) = 0.05399

φ(2) = 0.05399

This is shown in the following Excel-generated graph: (Click On Image To See a Larger Version)

## Calculating the CDF of a Standardized Normal Distribution in Excel

X = 2

µ = 0

σ = 1

F(X = 2, µ = 0, σ = 1) = NORM.DIST(2,0,1,TRUE) = 0.9773

F(X = 2, µ = 0, σ = 1) = NORM.S.DIST(z score(x),TRUE)

z score(x) = (2 – 0)/1 = 2

Φ(2) = F(X = 2, µ = 0, σ = 1)

F(X = 2, µ = 0, σ = 1) = NORM.S.DIST(2,TRUE) = 0.9773

Φ(2) = 0.9773

This is shown in the following Excel-generated graph: (Click On Image To See a Larger Version)

Here are some properties of the CDF of the standard normal distribution

Φ(-∞) = 0 = 0%

Φ(0) = 0.5 = 50%

Φ(∞) = 1 = 100%

Φ(X) = 1 - Φ(-X) and therefore Φ(X) + Φ(-X) = 100%

## The t-Distribution’s Convergence to the Standard Normal Distribution

The t-Distribution has the following two important similarities to the standard normal distribution:

1) Both the t-Distribution and the standard normal distribution are centered about means of 0. One difference between the t-Distribution and the normal distribution is that the normal distribution can assume any value as its mean. The t-Distribution is always symmetrical about a mean of 0, as is the standard normal distribution.

2) The horizontal axis of the t-Distribution is measured in units of t value. The t-value of a point is the number of sample standard errors that the point is from the mean. The horizontal axis of the standard normal distribution is measured in units of z Value, i.e., the number of population standard deviations that the point is from the mean. This is the result of the standard normal distribution’s population standard deviation being set to the unit value of 1. The formulas for z score(x) and t value(x) are shown as follows:

The z score of a randomly-sampled point, x, from a normally-distributed population is calculated as follows: (Click On Image To See a Larger Version)

The z score of point x taken from a large sample (n > 30) from a population of unknown distribution is calculated as follows: (Click On Image To See a Larger Version)

The t value of a point x taken from a small sample (n < 30) of a normally-distributed population or a large sample from a population of unknown distribution is calculated as follows: (Click On Image To See a Larger Version)

When sample size is large, the means of large, similar-sized random samples are normal-distributed regardless of the distribution of the underlying population as per the Central Limit Theorem.

The Standard Error (SE) that is calculated for the t value using the sample standard deviation, s, is an estimate of actual SE that would be calculated with the population standard deviation, σ.

The z Value (z score) is the unit of measure of the horizontal axis of the standard normal distribution and the t value is the unit of measure of the horizontal axis of the t-Distribution.

z score(x) is the number of population standard deviations that a point x is from the population mean. t value(x) is the number of standard errors that point x is from the sample mean.

It is important to note that z scores are created using population parameters µ (population mean) and σ (population standard deviation). t values are created using the sample statistics x_bar (sample mean), s (sample standard deviation), and n (sample size).

The underlying reason for this is that the normal distribution is used to analyze normally-distributed data only when population parameters µ and σ are known. The t-Distribution is used to analyze normally-distributed data when only sample statistics x_bar, s, and n are known. It is much more common in the real world that only sample statistics are known so the t-Distribution is often the tool of choice for analyzing normally-distributed data.

The t-Distribution has only a single parameter: v, the degrees of freedom, which is equal to v = n – 1. The t-Distribution’s shape changes as sample size, n, changes. Very low values of n (very small sample sizes) produce a t-Distribution PDF graph with wider tails and a lower peak. The follow Excel-generated graph shows a t-Distribution’s PDF curve with a sample size of n = 3: (Click On Image To See a Larger Version)

As sample increases, the t-Distribution’s shape changes as its peak rises and less weight remains in the outer tails. The t-Distribution converges to exactly resemble the standard normal distribution when the sample size is large enough. The follow Excel-generated graph shows a t-Distribution’s PDF curve with a sample size n approaches infinity: (Click On Image To See a Larger Version)

The PDF curve of the standard normal distribution shows an exact match: (Click On Image To See a Larger Version)

It should be noted that a number of texts incorrectly state the t-Distribution converges to the normal distribution as sample size increases. The normal distribution represents a family of distribution curves each having a unique combination of µ and σ. The specific normal curve needs to be specified. The correct statement would be that the t-Distribution converges to the standard normal distribution as sample size increases.

Excel Master Series Blog Directory

Excel Master Series Blog Directory