Tag Archives: descriptive statistics

Statistics for Beginners in Excel – Required Sample Size for the Binomial Testing

(Basic Statistics for Citizen Data Scientist) Required Sample Size for the Binomial Testing We now show how to determine the sample size required to achieve a specified power objective. Example 1: A company has made a major improvement in their manufacturing process and wants to test whether this improvement will result in 80% of the components …

Statistics for Beginners in Excel – Statistical Power for the Binomial Distribution

(Basic Statistics for Citizen Data Scientist) Statistical Power for the Binomial Distribution Power of one-tailed test Example 1: What is the power of the test in Example 3 of Hypothesis Testing for the Binomial Distribution? For this example we found 13 successes in a sample of size 24 and used a one-tailed test with α = .05 …

Statistics for Beginners in Excel – Poisson Distribution

(Basic Statistics for Citizen Data Scientist) Poisson Distribution Basic Concepts Definition 1: The Poisson distribution has a probability distribution function (pdf) given by $f(x) = \frac{\mu^x e^{-\mu}}{x!}$ for x = 0, 1, 2, … The parameter μ is often replaced by λ. A chart of the pdf of the Poisson distribution for λ = 3 is shown in Figure 1. Figure 1 – Poisson Distribution. Observation: Some key statistical properties of the Poisson …
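As a rough companion to the Poisson excerpt above, the following Python sketch tabulates the pdf for λ = 3, the case charted in Figure 1. It assumes the standard form f(x) = λ^x e^(−λ) / x!; the function name `poisson_pmf` and the cutoff of 12 are illustrative choices, not taken from the post. In Excel the same values should be available from `POISSON.DIST(x, 3, FALSE)`.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Return P(X = x) for a Poisson variable with mean lam (hypothetical helper)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


lam = 3.0  # the value used for the chart in the excerpt

# Tabulate the pdf for x = 0..12; the cutoff is arbitrary, the tail beyond it is tiny.
pmf_values = [poisson_pmf(x, lam) for x in range(13)]
for x, p in enumerate(pmf_values):
    print(f"x={x:2d}  P(X=x)={p:.4f}")
```

The printed probabilities rise to a peak at x = 2 and x = 3 (which are equal when λ is an integer) and then fall away, which is the shape a chart for λ = 3 would show.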

Statistics for Beginners in Excel – Negative Binomial and Geometric Distributions

(Basic Statistics for Citizen Data Scientist) Negative Binomial and Geometric Distributions Negative Binomial Distribution Definition 1: Under the same assumptions as for the binomial distribution, let x be a discrete random variable. The probability density function (pdf) for the negative binomial distribution is the probability of getting x failures before k successes where p = the probability of success on any single trial. Thus …
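The definition in the excerpt above (x failures before k successes, success probability p) can be sketched in Python as follows. This assumes the usual closed form f(x) = C(x + k − 1, x) p^k (1 − p)^x, and treats the geometric distribution as the k = 1 special case; both function names are hypothetical. Excel's `NEGBINOM.DIST(x, k, p, FALSE)` should return the same quantity.

```python
from math import comb


def neg_binomial_pmf(x: int, k: int, p: float) -> float:
    """P(exactly x failures occur before the k-th success), assuming independent trials."""
    return comb(x + k - 1, x) * p ** k * (1 - p) ** x


def geometric_pmf(x: int, p: float) -> float:
    """Geometric case: x failures before the first success (k = 1)."""
    return neg_binomial_pmf(x, 1, p)


# Illustrative values only: a fair coin, waiting for 3 heads.
for failures in range(5):
    print(failures, round(neg_binomial_pmf(failures, 3, 0.5), 4))
```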

Statistics for Beginners in Excel – Two-sample hypothesis testing

(Basic Statistics for Citizen Data Scientist) Two-sample Proportion Testing Theorem 1: Let x1 and x2 be random variables with proportional distributions with means π1 and π2 respectively. Let p1 be the proportion of successes in n1 trials of the first distribution and let p2 be the proportion of successes in n2 trials of the second distribution. When the numbers of trials n1 and n2 are sufficiently large, usually when ni πi ≥ 5 and ni (1 – πi) ≥ 5 for i = 1, 2, the difference between …

Statistics for Beginners in Excel – One-sample hypothesis testing

(Basic Statistics for Citizen Data Scientist) One-sample Proportion Testing From the theorem, we know that when sufficiently large samples of size n are taken, the distribution of sample proportions is approximately normal, distributed around the true population proportion mean π, with standard deviation (i.e. the standard error) $\sqrt{\pi(1-\pi)/n}$. We can use this fact to do hypothesis testing as was …
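To make the normal approximation in the excerpt above concrete, here is a minimal Python sketch of a one-sample proportion z-test. It assumes the standard error under the null hypothesis is sqrt(π0(1 − π0)/n) and reports an upper-tail p-value; the function name `proportion_z_test` and the figures in the example (60 successes in 100 trials against π0 = 0.5) are hypothetical and do not come from the post.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes: int, n: int, pi0: float):
    """Return (standard error, z statistic, upper-tail p-value) under H0: pi = pi0."""
    p_hat = successes / n
    se = sqrt(pi0 * (1 - pi0) / n)       # standard error of the sample proportion under H0
    z = (p_hat - pi0) / se               # standardized distance from the hypothesized value
    p_upper = 1 - NormalDist().cdf(z)    # P(Z >= z) for a standard normal Z
    return se, z, p_upper


# Made-up data for illustration: 60 successes in 100 trials, H0: pi = 0.5.
se, z, p_upper = proportion_z_test(60, 100, 0.5)
print(f"se={se:.4f}  z={z:.3f}  one-tailed p={p_upper:.4f}")
```

The approximation is only reasonable when n is large enough, for example when both nπ0 and n(1 − π0) are at least 5.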

Statistics for Beginners in Excel – Hypothesis Testing for Binomial Distribution

(Basic Statistics for Citizen Data Scientist) Hypothesis Testing for Binomial Distribution Example 1: Suppose you have a die and suspect that it is biased towards the number three, and so run an experiment in which you throw the die 10 times and count that the number three comes up 4 times. Determine whether the die is biased. …
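One way to work the die example above is to compute the chance that a fair die shows a three at least 4 times in 10 throws. The sketch below assumes a one-tailed test with null hypothesis p = 1/6; the helper name `binomial_upper_tail` is illustrative, and the significance level to compare against is left to the reader because the excerpt does not state one.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 10 throws, 4 threes observed, fair-die probability 1/6 under the null hypothesis.
p_value = binomial_upper_tail(4, 10, 1 / 6)
print(f"P(X >= 4) = {p_value:.4f}")
```

The resulting p-value is roughly 0.07, so whether the die is judged biased depends on the significance level chosen. In Excel the same figure should come from `1 - BINOM.DIST(3, 10, 1/6, TRUE)`.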

Statistics for Beginners in Excel – Binomial Distribution

(Basic Statistics for Citizen Data Scientist) Binomial Distribution Definition 1: Suppose an experiment has the following characteristics: the experiment consists of n independent trials, each with two mutually exclusive outcomes (success and failure); for each trial the probability of success is p (and so the probability of failure is 1 – p). Each such trial is called a Bernoulli trial. Let x be the discrete …

Statistics for Beginners in Excel – Tolerance Interval using Real Statistics

(Basic Statistics for Citizen Data Scientist) Tolerance Interval As described in Confidence Intervals, a confidence interval provides a way of estimating a population parameter by a corresponding sample statistic to a given level of confidence. We show how to estimate the population mean (the parameter) by the sample mean (the statistic). In particular, if the experiment is …

Statistics for Beginners in Excel – Identifying Outliers and Missing Data using Real Statistics

(Basic Statistics for Citizen Data Scientist) Identifying Outliers and Missing Data The Real Statistics Resource Pack provides an option for identifying potential outliers in a sample. Assuming the sample is normally distributed (based on the Central Limit Theorem), we know that NORM.S.DIST(-2.5,TRUE) = 0.621% of the data should have a z-score less than -2.5. Similarly …
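The tail probability quoted in the excerpt above can be checked with Python's standard library, and the same z-score idea can be turned into a simple outlier flag. This is only a sketch under stated assumptions: the threshold of 2.5 is taken from the excerpt's z-score, the helper `flag_outliers` and the sample data are made up for illustration, and none of this is the Real Statistics Resource Pack's own procedure.

```python
from statistics import NormalDist, mean, stdev

# Equivalent of NORM.S.DIST(-2.5, TRUE): share of a normal population with z < -2.5.
lower_tail = NormalDist().cdf(-2.5)
print(f"P(Z < -2.5) = {lower_tail:.5f}")  # about 0.621%


def flag_outliers(data, threshold=2.5):
    """Return values whose absolute z-score (using the sample stdev) exceeds threshold."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs((x - m) / s) > threshold]


# Hypothetical sample with one suspicious value.
sample = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 30]
print(flag_outliers(sample))
```

Note that in very small samples no value can reach a z-score of 2.5, since an extreme value also inflates the sample standard deviation, so this kind of flag is more useful with a dozen or more observations.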