Day: February 1, 2021

Statistics for Beginners in Excel – Fisher’s Exact Test

(Basic Statistics for Citizen Data Scientist) Fisher’s Exact Test. When the conditions for Pearson’s chi-square test are not met, especially when one or more of the cells have expected frequency $\mathrm{exp}_i < 5$, an alternative approach with 2 × 2 contingency tables is to use Fisher’s exact test. Since this method is more computationally intense, it is best used for smaller …
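
As a rough illustration of the idea in this excerpt, the sketch below computes a two-sided Fisher exact p-value for a 2 × 2 table in plain Python. The function name `fisher_exact_two_sided` and the sample table are made up for illustration. The "sum every table no more likely than the observed one" rule is one common two-sided convention and is assumed here, not taken from the post.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    # Add up the tables that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical table: rows are groups, columns are outcomes.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

Because every feasible table is enumerated, the cost grows with the table totals, which is consistent with the excerpt's remark that the method suits smaller samples.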

Statistics for Beginners in Excel – Independence Testing

(Basic Statistics for Citizen Data Scientist) Independence Testing. The method described in Goodness of Fit can also be used to determine whether two sets of data are independent of each other. Such data are organized in what are called contingency tables, as described in Example 1. In these cases df = (row count – 1) × (column count – 1). Excel …
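
The degrees-of-freedom rule above can be sketched together with the usual Pearson statistic for a contingency table. The function name `chi_square_independence` and the counts are hypothetical. Expected counts are taken as row total × column total / grand total, the standard choice under independence.

```python
def chi_square_independence(observed):
    """Return (Pearson chi-square statistic, df) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected

    # df = (row count - 1) * (column count - 1), as in the excerpt.
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table of counts.
stat, df = chi_square_independence([[10, 20], [20, 10]])
print(round(stat, 4), df)  # 6.6667 1
```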

Statistics for Beginners in Excel – Goodness of Fit

(Basic Statistics for Citizen Data Scientist) Goodness of Fit. Basic Concepts. Observation: Suppose the random variable x has binomial distribution B(n, p) and define z as $z = \frac{x - np}{\sqrt{np(1-p)}}$. By Corollary 1 of Relationship between Binomial and Normal Distributions, provided n is large enough, generally if np ≥ 5 and n(1−p) ≥ 5, then z is approximately normally distributed with mean 0 and standard deviation 1. Thus by Corollary 1 of Chi-square Distribution, $z^2$ ~ …
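
A small numerical check may help here. Assuming z is the usual standardized binomial count, (x − np)/√(np(1−p)), its square coincides with the two-category Pearson statistic Σ(obs − exp)²/exp. The helper names and the numbers below are illustrative only.

```python
from math import sqrt

def binomial_z(x, n, p):
    """Standardized binomial count: (x - np) / sqrt(np(1 - p))."""
    return (x - n * p) / sqrt(n * p * (1 - p))

def pearson_statistic(observed, expected):
    """Sum of (obs - exp)^2 / exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical experiment: 60 successes in 100 trials with p = 0.5.
n, p, x = 100, 0.5, 60
z = binomial_z(x, n, p)
chi2 = pearson_statistic([x, n - x], [n * p, n * (1 - p)])
print(z, z ** 2, chi2)  # 2.0 4.0 4.0
```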

Statistics for Beginners in Excel – Power of One Sample Variance Testing

(Basic Statistics for Citizen Data Scientist) Power of One Sample Variance Testing. Let $\sigma_0^2$ represent the hypothetical variance and $s^2$ the observed variance. Let $x_{+\text{crit}}$ be the right critical value (based on the null hypothesis with significance level α/2) and $x_{-\text{crit}}$ be the left critical value (two-tailed test), i.e. $x_{-\text{crit}}$ = CHIINV(1−α/2, n−1) and $x_{+\text{crit}}$ = CHIINV(α/2, n−1). Let $\delta = \sigma_0^2 / s^2$. Then the beta …
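
For readers working outside Excel, the two critical values can be approximated in plain Python. This is only a sketch: `chi2_cdf` uses a textbook series for the incomplete gamma function, and `chiinv` is a hypothetical stand-in that mimics the right-tail convention of Excel's CHIINV by bisection. Neither is Excel's actual algorithm, and the series is only suitable for moderate arguments. The values of α and n are made up.

```python
from math import exp, lgamma, log

def chi2_cdf(x, k):
    """P(X <= x) for X ~ chi-square(k), via the power series for the
    regularized lower incomplete gamma function P(k/2, x/2)."""
    if x <= 0:
        return 0.0
    a, y = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-16:
        n += 1
        term *= y / (a + n)
        total += term
    return total * exp(a * log(y) - y - lgamma(a))

def chiinv(prob, k):
    """Right-tail inverse (CHIINV-style): the x with P(X > x) = prob,
    for 0 < prob < 1."""
    target = 1.0 - prob
    lo, hi = 0.0, max(float(k), 1.0)
    while chi2_cdf(hi, k) < target:   # widen the bracket if needed
        hi *= 2.0
    for _ in range(200):              # plain bisection
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical two-tailed test with alpha = 0.05 and sample size n = 3.
alpha, n = 0.05, 3
x_crit_left = chiinv(1 - alpha / 2, n - 1)
x_crit_right = chiinv(alpha / 2, n - 1)
print(x_crit_left, x_crit_right)
```

With 2 degrees of freedom the chi-square CDF has the closed form 1 − exp(−x/2), so these two values should agree with −2·ln(0.975) and −2·ln(0.025), which gives a quick sanity check on the sketch.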

Statistics for Beginners in Excel – One Sample Hypothesis Testing of the Variance

(Basic Statistics for Citizen Data Scientist) One Sample Hypothesis Testing of the Variance. Based on Theorem 2 of Chi-square Distribution and its corollaries, we can use the chi-square distribution to test the variance of a distribution. Example 1: A company produces metal pipes of a standard length. Twenty years ago it tested its production quality and found that …
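
A minimal sketch of the test statistic usually used for this purpose: for a normally distributed population, (n − 1)s²/σ₀² follows a chi-square distribution with n − 1 degrees of freedom under the null hypothesis that the variance is σ₀². The function name and the measurements below are invented for illustration and are not the data from Example 1.

```python
from statistics import variance

def variance_test_statistic(sample, sigma0_sq):
    """Return ((n - 1) * s^2 / sigma0^2, df) for a one-sample variance test."""
    n = len(sample)
    s_sq = variance(sample)            # sample variance (n - 1 denominator)
    return (n - 1) * s_sq / sigma0_sq, n - 1

# Hypothetical pipe lengths and a hypothesized variance of 0.25.
stat, df = variance_test_statistic([9.0, 10.0, 11.0, 10.0, 10.0], 0.25)
print(stat, df)  # 8.0 4
```

The statistic would then be compared with chi-square critical values for df degrees of freedom, for example those returned by Excel's CHIINV.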

Statistics for Beginners in Excel – Chi-square Distribution

(Basic Statistics for Citizen Data Scientist) Chi-square Distribution. Definition 1: The chi-square distribution with k degrees of freedom, abbreviated $\chi^2(k)$, has probability density function $f(x) = \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}$ for x > 0. k does not have to be an integer and can be any positive real number. Click here for more technical details about the chi-square distribution, including proofs of some of the propositions described below. Except for the proof …
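
The standard chi-square density, x^(k/2−1)·e^(−x/2) / (2^(k/2)·Γ(k/2)) for x > 0, is easy to evaluate directly, including for non-integer k. The function name `chi2_pdf` is illustrative, and the boundary x = 0 is not treated specially in this sketch.

```python
from math import exp, gamma

def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom (k any positive real).
    Returns 0.0 for x <= 0; the boundary x = 0 is not handled specially."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))

# For k = 2 the density reduces to exp(-x / 2) / 2.
print(chi2_pdf(3.0, 2), exp(-1.5) / 2)
# A non-integer number of degrees of freedom works as well.
print(chi2_pdf(1.0, 2.5))
```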

Statistics for Beginners in Excel – Equivalence Testing (TOST)

(Basic Statistics for Citizen Data Scientist) Equivalence Testing (TOST). The objective of a two-sample equivalence test is to determine whether the means of two populations are equivalent based on two independent samples from these populations; here “equivalent” means that the two means differ by no more than a small pre-defined amount. This margin of equivalence is determined by …

Statistics for Beginners in Excel – Basic Probability Concepts

(Basic Statistics for Citizen Data Scientist) Coefficient of Variation Testing. One Sample Testing. In Measures of Variability, we describe the unitless measure of dispersion called the coefficient of variation. It turns out that s/x̄ is a biased estimator for the population coefficient of variation σ/μ. A nearly unbiased estimator is $\left(1 + \frac{1}{4n}\right)\frac{s}{\bar{x}}$, where n is the sample size. When the coefficient of variation …
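
The sketch below compares the raw estimate s/x̄ with a commonly used small-sample correction, (1 + 1/(4n))·s/x̄. That this correction is the "nearly unbiased estimator" the excerpt has in mind is an assumption, and the function name and data are made up.

```python
from statistics import mean, stdev

def cv_estimates(sample):
    """Return (raw CV s / x-bar, small-sample corrected CV)."""
    n = len(sample)
    raw = stdev(sample) / mean(sample)
    # Assumed correction factor (1 + 1/(4n)); see the note above.
    corrected = (1 + 1 / (4 * n)) * raw
    return raw, corrected

# Hypothetical measurements.
raw_cv, corrected_cv = cv_estimates([8, 10, 12])
print(round(raw_cv, 4), round(corrected_cv, 4))  # 0.2 0.2167
```

The correction factor approaches 1 as n grows, so the two estimates differ noticeably only for small samples.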

Statistics for Beginners in Excel – Paired Sample t Test

(Basic Statistics for Citizen Data Scientist) Paired Sample t Test. In paired sample hypothesis testing, a sample from the population is chosen and two measurements for each element in the sample are taken. Each set of measurements is considered a sample. Unlike the hypothesis testing studied so far, the two samples are not independent of one another. …
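
Because the two samples are paired, the usual approach is to reduce them to a single sample of differences and run a one-sample t test on it. The sketch below computes that t statistic under the null hypothesis that the mean difference is 0. The function name and the before/after values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second):
    """Return (t, df) for the paired differences second - first,
    under the null hypothesis that the mean difference is 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical before/after measurements on the same four subjects.
t_stat, df = paired_t_statistic([10, 12, 14, 16], [11, 14, 15, 18])
print(round(t_stat, 4), df)  # 5.1962 3
```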

Statistics for Beginners in Excel – Two Sample t Test: unequal variances

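(Basic Statistics for Citizen Data Scientist) Two Sample t Test: unequal variances. Theorem 1: Let x̄ and ȳ be the sample means and $s_x$ and $s_y$ be the sample standard deviations of two sets of data of size $n_x$ and $n_y$ respectively. If x and y are normal, or $n_x$ and $n_y$ are sufficiently large for the Central Limit Theorem to hold, then the random variable $t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{s_x^2/n_x + s_y^2/n_y}}$ (with $\mu_x$ and $\mu_y$ the population means) has distribution T(m), where $m = \frac{\left(s_x^2/n_x + s_y^2/n_y\right)^2}{\frac{(s_x^2/n_x)^2}{n_x - 1} + \frac{(s_y^2/n_y)^2}{n_y - 1}}$. Observation: The nearest integer to m can be …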
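
A minimal sketch of the unequal-variance (Welch) calculation, assuming the standard forms of the statistic, t = (x̄ − ȳ)/√(s_x²/n_x + s_y²/n_y) under the null hypothesis of equal population means, and of the Welch–Satterthwaite degrees of freedom m. The function name and the data are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Return (t, m): the Welch t statistic under H0 of equal means and
    the Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)          # s_x^2 / n_x
    vy = variance(y) / len(y)          # s_y^2 / n_y
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    m = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, m

# Hypothetical samples with equal sizes and equal variances,
# for which m reduces to 2 * (n - 1).
t_stat, m = welch_t([1, 2, 3], [2, 3, 4])
print(round(t_stat, 4), round(m, 4))  # -1.2247 4.0
```

In general m is not an integer, which is why the excerpt goes on to discuss using the nearest integer.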