Statistics for Beginners in Excel – Two Sample Hypothesis Testing to Compare Variances

Hits: 13

(Basic Statistics for Citizen Data Scientist)

Two Sample Hypothesis Testing to Compare Variances

Theorem 1 of F Distribution can be used to test whether the variances of two populations are equal, using the Excel functions and tools which follows. In order to deal exclusively with the right tail of the distribution, when taking ratios of sample variances from the theorem we should put the larger variance in the numerator of

image870

In order to use this test, the following must hold:

  • Both populations are normally distributed
  • Both samples are drawn independently from each other.
  • Within each sample, the observations are sampled randomly and independently of each other.

Excel Functions: The following Excel function can be used to carry out this test:

FTEST(R1, R2) = two-tailed F-test comparing the variances of the samples in ranges R1 and R2 = the two-tailed probability that the variance of the data in ranges R1 and R2 are not significantly different.

Thus FTEST(R1, R2) = 2 ∙ FDIST(x, df1, df2) where df1 = the number of elements in R1 – 1, df= the number of elements in R2 – 1 and x = var1 / var2 where var1 is the variance of the data in range R1 and var2 = the variance of the data in range R2. FTEST is a two-tail test, while FDIST and FINV are one-tailed.

Also FTEST(R1, R2) = FDIST(x, df1, df2) + FDIST(1/x, df2, df1), i.e. the sum of the right tail starting from x and the left tail starting from 1/x. This is true since FDIST(1/x, df2, df1) = 1 − FDIST(x, df1, df2).

This function ignores all empty and non-numeric cells.

Excel 2010/2013 also provide a new function F.TEST which is equivalent to FTEST (see Built-in Statistical Functions).

In addition Excel provides an F-Test Two-Sample for Variances data analysis tool which automates the process of comparing two variances.

Example 1: A company is comparing methods for producing pipes and wants to choose the method with the least variability. It has taken a sample of the lengths of the pipes using both methods as shown on the left side of Figure 1.

 

Compare variances ExcelFigure 1 – Excel’s two sample F-test to compare variances

 

We test the following null hypothesis:

H0σ1 – σ2 = 0 (equivalently: σ1 = σ2; i.e. both methods have the same variability)

and use the statisticimage5029with 11, 14 degrees of freedom, as described on the right side of Figure 1. Since this is a two-tail test, we note that

p-value = 2 * FDIST(F, df1, df2) = 2 * FDIST(1.85, 11, 14) = 0.279 > 0.05 = α

F-crit = FINV(α/2, df1, df2) = FINV(.025, 11, 14) = 3.09 > 1.85 = F

Either of the above tests shows there is no significant difference in the variance between the two methods with 95% confidence. Note that we needed to double the value for FDIST or halve α since this is a two-tail test.

Alternatively we can use FTEST which is a two-tail test:

FTEST(A4:A18, B4:B18) = .279 > 0.05 = α

We can also use the F-Test Two-Sample for Variances data analysis tool:

 

Variance comparison in Excel

Figure 2 – Comparing variances using Excel’s data analysis tool

 

This tool only performs a one-tail test, and so the p-value (0.1393) needs to be doubled to get 0.279, which is the same value we calculated in Figure 1. The critical value for F is calculated based on α = .05.

 

 

Statistics for Beginners in Excel – Two Sample Hypothesis Testing to Compare Variances

 

Sign up to get end-to-end “Learn By Coding” example.


 

 

Introduction to Applied Machine Learning & Data Science for Beginners, Business Analysts, Students, Researchers and Freelancers with Python & R Codes @ Western Australian Center for Applied Machine Learning & Data Science (WACAMLDS) !!!

Latest end-to-end Learn by Coding Projects (Jupyter Notebooks) in Python and R:

Applied Statistics with R for Beginners and Business Professionals

Data Science and Machine Learning Projects in Python: Tabular Data Analytics

Data Science and Machine Learning Projects in R: Tabular Data Analytics

Python Machine Learning & Data Science Recipes: Learn by Coding

R Machine Learning & Data Science Recipes: Learn by Coding

 

There are 2000+ End-to-End Python & R Notebooks are available to build Professional Portfolio as a Data Scientist and/or Machine Learning Specialist. All Notebooks are only $79.95. We would like to request you to have a look at the website for FREE the end-to-end notebooks, and then decide whether you would like to purchase or not.

Please do not waste your valuable time by watching videos, rather use end-to-end (Python and R) recipes from Professional Data Scientists to practice coding, and land the most demandable jobs in the fields of Predictive analytics & AI (Machine Learning and Data Science).

The objective is to guide the developers & analysts to “Learn how to Code” for Applied AI using end-to-end coding solutions, and unlock the world of opportunities!

 

Disclaimer: The information and code presented within this recipe/tutorial is only for educational and coaching purposes for beginners and developers. Anyone can practice and apply the recipe/tutorial presented here, but the reader is taking full responsibility for his/her actions. The author (content curator) of this recipe (code / program) has made every effort to ensure the accuracy of the information was correct at time of publication. The author (content curator) does not assume and hereby disclaims any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from accident, negligence, or any other cause. The information presented here could also be found in public knowledge domains.