Monday, June 2, 2014

t-Distribution’s PDF in Excel 2010 and Excel 2013

This is one of the following three articles about the t distribution in Excel

Overview of the t Distribution

t Distribution’s PDF (Probability Density Function) in Excel 2010 and Excel 2013

t Distribution’s CDF (Cumulative Distribution Function) in Excel 2010 and Excel 2013

 

t-Distribution’s PDF

As with all distributions, the t-Distribution has a PDF (Probability Density Function) and a CDF (Cumulative Distribution Function).

The t-Distribution’s PDF (Probability Density Function) equals the probability that sample point X taken from a normally distributed population is a certain distance from the sample mean for the given sample size.

More specifically, the t-Distribution’s PDF calculates the probability that the standardized distance between sample point X and the sample mean is equal to the t Value of t for sample of size n when n = df + 1 = v + 1.

t equals the t Value, which is the number of standard errors that a sampled point X is from the sample mean.

The t-Distribution’s PDF is expressed as f(t,v).

f(t,v) = the probability that a sample point is a distance of t standard errors from the sample mean if the sample size n = v + 1 and the population is normally-distributed. It is given by the following formula:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

In Excel 2010 and beyond, the t-Distribution’s PDF can be calculated directly be the following Excel formula:

f(t,v) = T.DIST(t, v, FALSE)

Prior to Excel 2010, Excel had no formula that correctly calculated the PDF of the t-Distribution. The t-Distribution’s PDF had to be calculated directly from the actual formula. The formula is somewhat onerous by itself but is much more manageable if broken down into its component parts as follows:

The full formula for the t-Distribution’s PDF is the following:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

This is made much more manageable if broken down into the following parts:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

Γ(X) is the gamma distribution function and has to be calculated in Excel is a somewhat indirect fashion as follows:

Γ(X) = EXP(GAMMALN(X))

t equals the number of standard errors that sample point X is from the sample mean.

 

t-Distribution PDF Example With

Sample Data

The following is a random sample of 20 data points drawn from a population. The population standard deviation is not known. The population’s distribution is not known. It will not be assumed to be normally distributed.

This exercise will calculate the probability that a single sample point from this sample has the value of X = 48.

If the population or the sample can be shown to be normally distributed, the t-Distribution can be used to this probability, which would be the t-Distribution’s PDF at sample X = 48. The t-Distribution would be used because the sample size is small (n < 30) and the population’s standard deviation is not known.

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

The t-Distribution should only be used if the sample or the population is normally distributed. The distribution of the population is not known so the sample’s distribution needs to be evaluated for normality. A histogram of the sample data indicates a reasonable resemblance to normality. The t-Distribution can therefore be used to analyze the sample.

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function (Click On Image To See a Larger Version)

This histogram was created in Excel inputting the following information into Excel’s Histogram dialogue box:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function (Click On Image To See a Larger Version)

The next step is to run Excel Descriptive Statistics on the sample data. The output of this Excel data analysis tools is as follows:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

The t-Distribution is a family of distribution curves that are all symmetrical about a mean of zero. Each unique t-Distribution curve is completely described and differentiated from all other t-Distribution curves by only a single parameter: its degrees of freedom. Degrees of freedom equal sample size minus one as follows:

n = 20

df = v = n – 1 = 19

Note also that the t-Distribution’s PDF formula calculates a point’s PDF at the t Value of the sample point, not at the sample point’s actual value, X. The inputs to the t-Distribution’s PDF formula are the degrees of freedom and the point’s t Value, t.

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

The t Value of a point is the distance that the point is from the sample mean expressed as the number of standard errors. The length of one standard error is calculated from the sample as follows:

SE = Standard Error = Standard Deviation / SQRT(n)

SE = 7.434 / SQRT(20) = 1.662

Note that SE is also automatically calculated in the Excel Descriptive Statistics.

The t Value of point X = 48 can now be calculated since the sample mean (x_bar) and SE are known. This is done as follows:

X = 48

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t = (48 – 50) / 1.662

t = -1.20337

The two inputs required to calculate the t-Distribution’s PDF are the point’s t Value (t = -1.20337) and the degrees of freedom (df = v = 19) from the sample.

Excel 2010 provides the following formula that enables the calculation of the t-Distribution’s PDF in one quick step. That calculation is as follows:

T.DIST(t, df, FALSE) =

T.DIST(-1.20337, 19, FALSE) = 0.18888

This results states that there is an 18.888 percent chance the one of the data observations from the sample group of 20 samples taken from a normally distributed population will have a t Value of -1.20337, i.e., will be 1.20337 standard errors to the left of the sample mean.

Given that the sample mean is 50, the length of one standard error = 1.662, and the d point is 1.20337 standard errors to the left of the mean, the point being evaluated is equal to the following:

X = x_bar + t Value*SE

X = 50 + (-1.20337)*(1.662) = 48

The t-Distribution’s PDF formula calculated that the probability that a single point of the 20 sampled points has a value of X = 48 (or equivalently has a t Value of t = -1.20337) is 18.888 percent.

Prior to Excel 2010, there was no Excel formula that calculated the correct PDF value for the t-Distribution. This calculation has to be performed manually in any version of Excel prior to 2010. This is done as follows (it’s really not as bad as it looks):

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function]
(Click On Image To See a Larger Version)

a = 1 / SQRT(19 * 3.14159)

a = 0.12943

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

b = EXP(GAMMALN((19+1)/2)) = 362,880

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

c = EXP(GAMMALN(19/2)) = 119,292

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

d = 1 + (-1.20337)2/19 = 1.0762

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

e = -( (19+1) / 2) = -10

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function
(Click On Image To See a Larger Version)

f(t=-1.20337,v=19) = f(-1.20337,19) = 0.12943 * (362,880/119,292) * 1.0762-10

f(t=-1.20337,v=19) = 0.18888

This is the same result that is arrived at in the following single calculation from Excel 2010:

T.DIST(t, df, FALSE) =

T.DIST(-1.20337, 19, FALSE) = 0.18888

An Excel-generated graphical representation of this example is shown in the following diagram. The t Value of t = -1.20337 represents the point that is 1.20337 standard errors from the t-Distribution’s mean of zero.

Note that the t Value of t = -1.20337 has a PDF value equal to f(t=-1.2037,v=19) = 0.18888 on the Y axis. This indicates that there is an 18.888 percent probability that one of the 20 samples has a t Value of 1.20337. This is shown in the following Excel-generated graph:

t-distribution, t distribution, normal distribution, excel, excel 2010, excel 2013, statistics,pdf,probability density function (Click On Image To See a Larger Version)

 

Excel Master Series Blog Directory

 

Click Here To See a List Of All

Statistical Topics And Articles In

This Blog

 

You Will Become an Excel Statistical Master!

No comments:

Post a Comment