Month: January 2021

Statistics for Beginners in Excel – ROC and Classification Table Data Analysis Tool

Hits: 4 (Basic Statistics for Citizen Data Scientist) ROC and Classification Table Data Analysis Tool Real Statistics Data Analysis Tools: The Real Statistics Resource Pack supplies the ROC Curve and Classification Table data analysis tool which provides an easier way to construct the ROC curve and classification table. We show how this is done for Example 1 of Classification …

Statistics for Beginners in Excel – AUC Confidence Interval

Hits: 4 (Basic Statistics for Citizen Data Scientist) AUC Confidence Interval For large samples, AUC (area under the curve for a ROC curve) is approximately normally distributed, and so a 1-α confidence interval for AUC may be calculated as described in Confidence Interval for Sampling Distributions. The confidence interval is equal to AUC ± se · z_crit where z_crit is the two-tailed …
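As a rough illustration of the interval AUC ± se · z_crit quoted above, here is a minimal Python sketch. The function name `auc_confidence_interval` and the sample inputs are hypothetical, and the standard error is assumed to be already known, since the excerpt does not say how it is computed.

```python
from statistics import NormalDist


def auc_confidence_interval(auc, se, alpha=0.05):
    """Return the (lower, upper) bounds of a 1 - alpha interval for AUC.

    Assumes AUC is approximately normally distributed (large sample)
    and that the standard error `se` has been computed elsewhere.
    """
    # Two-tailed critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = se * z_crit
    return auc - margin, auc + margin


# Hypothetical inputs, for illustration only.
lower, upper = auc_confidence_interval(auc=0.80, se=0.05, alpha=0.05)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The sketch applies the formula as written and does not clip the bounds to the range 0 to 1.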

Statistics for Beginners in Excel – ROC Curve

Hits: 9 (Basic Statistics for Citizen Data Scientist) ROC Curve The ROC Curve is a plot of the True Positive Rate (TPR) against the False Positive Rate (FPR) as the cutoff value varies. Example 1: Create the ROC curve for Example 1 of Classification Table. We begin by creating the ROC table as shown on the …
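To make the FPR and TPR values in the excerpt above concrete, the following Python sketch computes one (FPR, TPR) point per cutoff. It is not the Real Statistics procedure: the function name `roc_points`, the sample data, and the rule "predict positive when the score is at least the cutoff" are assumptions for illustration, and the data is assumed to contain at least one positive and one negative case.

```python
def roc_points(actual, scores, cutoffs):
    """Return a list of (FPR, TPR) pairs, one per cutoff.

    actual  : sequence of 0/1 labels (1 = positive)
    scores  : predicted scores, same length as `actual`
    cutoffs : cutoff values; a case is predicted positive when
              score >= cutoff (an assumed convention)
    """
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    points = []
    for cutoff in cutoffs:
        tp = sum(1 for a, s in zip(actual, scores) if a and s >= cutoff)
        fp = sum(1 for a, s in zip(actual, scores) if not a and s >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data, for illustration only.
actual = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.2]
print(roc_points(actual, scores, cutoffs=[0.0, 0.5, 1.0]))
```

Plotting these pairs with FPR on the horizontal axis and TPR on the vertical axis gives the ROC curve.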

Statistics for Beginners with Excel – Classification Table

Hits: 14 (Basic Statistics for Citizen Data Scientist) Classification Table The Classification Table compares the predicted number of successes to the number of successes actually observed, and similarly the predicted number of failures to the number actually observed. We have four possible outcomes: True Positives (TP) = the number of cases which were correctly classified to be …
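The four outcomes mentioned above can be counted with a few lines of Python. This is a minimal sketch with a hypothetical function name (`classification_table`) and made-up data; it uses the standard definitions of TP, FP, FN and TN, since the excerpt is cut off before listing them all.

```python
def classification_table(actual, predicted):
    """Count the four outcomes for 0/1 labels (1 = success)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1   # predicted success, observed success
        elif p and not a:
            counts["FP"] += 1   # predicted success, observed failure
        elif a:
            counts["FN"] += 1   # predicted failure, observed success
        else:
            counts["TN"] += 1   # predicted failure, observed failure
    return counts


# Hypothetical data, for illustration only.
actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1]
print(classification_table(actual, predicted))
```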

Statistics for Beginners with Excel – Dot Plots

Hits: 7 (Basic Statistics for Citizen Data Scientist) Dot Plots A Dot Plot is another way to view data graphically. A dot plot is somewhat similar to a box plot, except that instead of summarizing the data in each group (the brands in Example 1 of Box Plots), the actual data values are plotted. Real Statistics Data Analysis …

Statistics for Beginners with Excel – Creating Box Plots

Hits: 4 (Basic Statistics for Citizen Data Scientist) Creating Box Plots in Excel Another way to characterize a distribution or a sample is via a box plot (aka a box and whiskers plot). Specifically, a box plot provides a pictorial representation of the following statistics: maximum, 75th percentile, median (50th percentile), mean, 25th percentile and minimum. Box plots are especially useful when …

Statistics for Beginners with Excel – Histograms

Hits: 17 (Basic Statistics for Citizen Data Scientist) Histograms A histogram is a graphical representation of the output of the FREQUENCY function (as described in Frequency Tables). Example 1: Create a histogram for the data and bin selection for Example 1 from Frequency Tables. We start by replicating the data and bin section for Example 1 in …
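For readers working outside Excel, the counts that feed a histogram can be approximated in Python. The sketch below mimics the binning behaviour of Excel's FREQUENCY function as commonly documented: each value is counted in the first bin whose upper limit is greater than or equal to it, and an extra final count holds values above the last bin. The function name `frequency` and the data are hypothetical, and the bin limits are assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Return len(bins) + 1 counts, in the style of Excel's FREQUENCY.

    counts[i] holds values x with bins[i-1] < x <= bins[i];
    the last entry holds values greater than bins[-1].
    `bins` must be sorted in ascending order.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        # Index of the first bin limit that is >= x.
        counts[bisect_left(bins, x)] += 1
    return counts


# Hypothetical data and bins, for illustration only.
data = [1, 5, 10, 11, 20, 25]
bins = [5, 10, 20]
print(frequency(data, bins))
```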

Statistics for Beginners with Excel – Frequency Tables

Hits: 6 (Basic Statistics for Citizen Data Scientist) Frequency Tables Often data is presented in the form of a frequency table. For example, the data in range A4:A11 of Figure 1 can be expressed by the frequency table in range C4:D7. Figure 1 – Frequency Table The table in Figure 1 shows that the data element …
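As a small illustration of turning raw data into a frequency table, the Python sketch below counts how often each distinct value occurs. The function name `frequency_table` and the sample values are hypothetical and are not the data from Figure 1.

```python
from collections import Counter


def frequency_table(data):
    """Return (value, count) pairs sorted by value."""
    return sorted(Counter(data).items())


# Hypothetical data, for illustration only.
for value, count in frequency_table([2, 4, 2, 5, 4, 2]):
    print(value, count)
```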

Statistics for Beginners with Excel – Descriptive Statistics Tools

Hits: 4 (Basic Statistics for Citizen Data Scientist) Descriptive Statistics Tools Excel provides a data analysis tool called Descriptive Statistics which produces a summary of the key statistics for a data set. Example 1: Provide a table of the most common descriptive statistics for the scores in column A of Figure 1. Figure 1 – Output from Descriptive …

Statistics for Beginners – Ranking Functions in Excel

Hits: 7 (Basic Statistics for Citizen Data Scientist) Ranking Functions in Excel Excel Functions: Figure 1 summarizes the various ranking functions available in all versions of Excel for a data set R1. We describe each of these functions in more detail on the rest of this webpage, plus we describe additional ranking functions that are …

Statistics for Beginners – Symmetry, Skewness and Kurtosis

Hits: 12 (Basic Statistics for Citizen Data Scientist) Symmetry, Skewness and Kurtosis We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. The data set can represent either the population being studied or a sample drawn from the population. Looking at S as representing a distribution, the skewness of S is a measure of …

Year 1 – Math Worksheet 001 – Decomposition of Numbers within 10

Hits: 13 Year 1 – Mathematics Worksheet 001 – Decomposition of Numbers within 10 …

Statistics for Beginners – Measures of Variability

Hits: 9 (Basic Statistics for Citizen Data Scientist) Measures of Variability We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. The data set can represent either the population being studied or a sample drawn from the population. The mean is the statistic used most often to characterize the center …

Statistics for Beginners – Measures of Central Tendency

Hits: 7 (Basic Statistics for Citizen Data Scientist) Measures of Central Tendency We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. The data set can represent either the population being studied or a sample drawn from the population. We seek a single measure (i.e. a statistic) which …

Statistics for Beginners – Continuous Probability Distributions

Hits: 5 (Basic Statistics for Citizen Data Scientist) Continuous Probability Distributions We now extend the definition of probability distribution from discrete to continuous random variables. Such variables take on an infinite range of values even in a finite interval (weight of rice, room temperature, etc.). While for a discrete random variable x, the probability that x assumes a value …