Month: January 2021

Statistics for Beginners in Excel – Hypothesis Testing for Binomial Distribution

(Basic Statistics for Citizen Data Scientist) Hypothesis Testing for Binomial Distribution Example 1: Suppose you have a die and suspect that it is biased towards the number three, so you run an experiment in which you throw the die 10 times and find that the number three comes up 4 times. Determine whether the die is biased. …
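One common way to approach Example 1 is a one-tailed binomial test: under the null hypothesis that the die is fair, each throw shows a three with probability 1/6, and the p-value is the probability of getting 4 or more threes in 10 throws. The sketch below uses Python rather than Excel; the helper name `binom_upper_tail` and the 0.05 significance level are assumptions made for illustration, since the excerpt is cut off before it states its method.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Null hypothesis: the die is fair, so P(three) = 1/6 on every throw.
p_value = binom_upper_tail(4, 10, 1 / 6)
alpha = 0.05  # assumed significance level, not given in the excerpt

print(f"p-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis: evidence the die favours three.")
else:
    print("Cannot reject the null hypothesis that the die is fair.")
```

In Excel the same tail probability can be computed as `1 - BINOM.DIST(3, 10, 1/6, TRUE)`.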

Statistics for Beginners in Excel – Binomial Distribution

(Basic Statistics for Citizen Data Scientist) Binomial Distribution Definition 1: Suppose an experiment has the following characteristics: the experiment consists of n independent trials, each with two mutually exclusive outcomes (success and failure); for each trial, the probability of success is p (and so the probability of failure is 1 – p). Each such trial is called a Bernoulli trial. Let x be the discrete …
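As a rough illustration of Definition 1, the sketch below simulates n Bernoulli trials and evaluates the standard binomial probability C(n, k) p^k (1 – p)^(n – k) of exactly k successes. The excerpt is truncated before it states this formula, so the formula, the function names, and the values n = 10 and p = 0.3 are assumptions supplied here.

```python
import random
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def count_successes(n, p, rng):
    """Run n Bernoulli trials with success probability p; return the success count."""
    return sum(rng.random() < p for _ in range(n))

n, p = 10, 0.3            # hypothetical experiment
rng = random.Random(0)    # seeded so the run is repeatable

pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
x = count_successes(n, p, rng)

print("one simulated value of x:", x)
print("P(x = 3) =", round(binom_pmf(3, n, p), 4))
print("probabilities sum to", round(sum(pmf), 6))
```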

Statistics for Beginners in Excel – Tolerance Interval using Real Statistics

(Basic Statistics for Citizen Data Scientist) Tolerance Interval As described in Confidence Intervals, a confidence interval provides a way of estimating a population parameter by a corresponding sample statistic to a given level of confidence. We show how to estimate the population mean (the parameter) by the sample mean (the statistic). In particular, if the experiment is …

Statistics for Beginners in Excel – Identifying Outliers and Missing Data using Real Statistics

(Basic Statistics for Citizen Data Scientist) Identifying Outliers and Missing Data The Real Statistics Resource Pack provides an option for identifying potential outliers in a sample. Assuming the data are approximately normally distributed, we know that NORM.S.DIST(-2.5,TRUE) = 0.621% of the data should have a z-score less than -2.5. Similarly …
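The 0.621% figure is the standard normal CDF evaluated at -2.5, and the same cutoff can be used to flag potential outliers by z-score. The following is a minimal sketch of that idea, not the Real Statistics implementation; the data values and the two-sided |z| > 2.5 rule are assumptions chosen for illustration.

```python
from statistics import NormalDist, mean, stdev

# Probability that a standard normal value falls below -2.5
tail = NormalDist().cdf(-2.5)
print(f"P(z < -2.5) = {tail:.5f}")

def flag_outliers(data, cutoff=2.5):
    """Return the values whose absolute z-score exceeds the cutoff."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    return [x for x in data if abs((x - m) / s) > cutoff]

sample = [10, 12, 11, 13, 12, 11, 10, 12, 11, 40]  # hypothetical data
outliers = flag_outliers(sample)
print("potential outliers:", outliers)
```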

Statistics for Beginners in Excel – Power and Sample Size using Real Statistics

(Basic Statistics for Citizen Data Scientist) Power and Sample Size using Real Statistics Real Statistics Functions: The Real Statistics Resource Pack supplies the following functions for calculating the power and sample size requirements for one-sample and two-sample hypothesis testing of the mean using the normal distribution. NORM1_POWER(d, n, tails, α) = the power of a …
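To show what a power calculation of this kind involves, here is a sketch of the textbook power formula for a one-sample z-test. It borrows the argument order (d, n, tails, α) from the excerpt but is not the Real Statistics code; it assumes d is the standardized effect size (difference in means divided by the population standard deviation), and the function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(d, n, tails=2, alpha=0.05):
    """Power of a one-sample z-test for standardized effect size d and sample size n."""
    std_normal = NormalDist()
    delta = abs(d) * sqrt(n)                        # shift of the test statistic under H1
    z_crit = std_normal.inv_cdf(1 - alpha / tails)  # critical value
    power = 1 - std_normal.cdf(z_crit - delta)      # rejection in the direction of the effect
    if tails == 2:
        power += std_normal.cdf(-z_crit - delta)    # rejection in the opposite tail
    return power

print(round(z_test_power(0.5, 32), 3))           # two-tailed
print(round(z_test_power(0.5, 32, tails=1), 3))  # one-tailed
```

When d = 0 the function returns α, as expected: with no true effect, the chance of rejecting the null hypothesis equals the significance level.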

Statistics for Beginners in Excel – Sampling using Real Statistics

(Basic Statistics for Citizen Data Scientist) Sampling Excel provides a Sampling data analysis tool that can be used to create samples. The tool works by defining the population as an array in an Excel worksheet and then using the following input parameters to determine how you would like to carry out the sampling. Input Range – Specify …

Statistics for Beginners in Excel – Simulation using Real Statistics

(Basic Statistics for Citizen Data Scientist) Simulation It is often useful to create a model using simulation. Usually, this takes the form of generating a series of random observations (often based on a specific statistical distribution) and then studying the resulting observations using techniques described throughout the rest of this website. This approach is commonly called Monte …
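A minimal sketch of this kind of simulation: draw random observations from a chosen distribution, then summarize them. The normal distribution with mean 50 and standard deviation 10, the sample size, and the seed are all assumptions picked for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # seeded so the simulation is repeatable

# Generate 10,000 observations from a normal distribution (mean 50, sd 10)
observations = [rng.gauss(50, 10) for _ in range(10_000)]

sample_mean = mean(observations)
sample_sd = stdev(observations)

print(f"simulated mean = {sample_mean:.2f}")
print(f"simulated sd   = {sample_sd:.2f}")
```

With this many observations, the simulated mean and standard deviation should land close to the values used to generate the data.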

Statistics for Beginners in Excel – Comparing two means when variances are known

(Basic Statistics for Citizen Data Scientist) Comparing two means when variances are known Theorem 1: Let x̄ and ȳ be the means of two samples of size nx and ny respectively. If x and y are normal or nx and ny are sufficiently large for the Central Limit Theorem to hold, then x̄ – ȳ has a normal distribution with mean μx – μy and standard deviation √(σx²/nx + σy²/ny). Proof: Since the samples are random, x̄ and ȳ are normally and independently distributed. By the Central Limit …
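Theorem 1 leads directly to a two-sample z-test: divide the observed difference x̄ – ȳ (less the hypothesized difference) by the standard error √(σx²/nx + σy²/ny) and compare the result with the standard normal distribution. The sketch below assumes the population standard deviations are known; the function name and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_x, mean_y, sigma_x, sigma_y, n_x, n_y, diff0=0.0):
    """Return (z, two-tailed p-value) for H0: mu_x - mu_y = diff0, variances known."""
    se = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)  # sd of x-bar minus y-bar
    z = (mean_x - mean_y - diff0) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical samples: means 10 and 8, known sigmas 2 and 3, sizes 50 and 60
z, p = two_sample_z(10.0, 8.0, 2.0, 3.0, 50, 60)
print(f"z = {z:.3f}, p = {p:.6f}")
```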

Statistics for Beginners in Excel – Hypothesis Testing using the Central Limit Theorem

(Basic Statistics for Citizen Data Scientist) Hypothesis Testing using the Central Limit Theorem Using the Central Limit Theorem, we can extend the approach employed in Single Sample Hypothesis Testing for normally distributed populations to those that are not normally distributed. Suppose we take a sample of size n, where n is sufficiently large, and pose a null hypothesis that the …
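As a sketch of the idea, the code below runs a large-sample z-test on data drawn from a clearly non-normal (exponential) population. The excerpt is cut off before it states the null hypothesis, so this assumes the hypothesis concerns the population mean and that the sample standard deviation stands in for the unknown population value; the function name, seed, and data are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def clt_z_test(sample, mu0):
    """Return (z, two-tailed p-value) for H0: population mean = mu0, large n."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)  # estimated standard error of the mean
    z = (mean(sample) - mu0) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

rng = random.Random(1)
# 200 observations from an exponential population with true mean 2
sample = [rng.expovariate(0.5) for _ in range(200)]

z_true, p_true = clt_z_test(sample, mu0=2.0)    # null hypothesis matches the true mean
z_false, p_false = clt_z_test(sample, mu0=5.0)  # null hypothesis far from the true mean
print(f"mu0 = 2: z = {z_true:.2f}, p = {p_true:.4f}")
print(f"mu0 = 5: z = {z_false:.2f}, p = {p_false:.4g}")
```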