Month: January 2021

Statistics for Beginners in Excel – ROC Curve

(Basic Statistics for Citizen Data Scientist) ROC Curve The ROC Curve is a plot of the True Positive Rate (TPR) against the False Positive Rate (FPR), with one point for each cutoff value. Example 1: Create the ROC curve for Example 1 of Classification Table. We begin by creating the ROC table as shown on the left side …
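
To make the idea of one (FPR, TPR) point per cutoff concrete, here is a minimal Python sketch. It is not the Excel procedure from the post: the function name `roc_points` and the scores and labels are invented for illustration, and a case is assumed to be predicted positive when its score is at or above the cutoff.

```python
def roc_points(scores, labels):
    """Return one (FPR, TPR) pair per distinct cutoff, starting at (0, 0).

    Assumption: a case is predicted positive when score >= cutoff.
    labels uses 1 for an observed success and 0 for an observed failure.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cutoff above every score: nothing predicted positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data, not taken from the post's Example 1.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these pairs with FPR on the horizontal axis and TPR on the vertical axis gives the ROC curve for this made-up data.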

Statistics for Beginners with Excel – Classification Table

(Basic Statistics for Citizen Data Scientist) Classification Table The Classification Table compares the predicted number of successes to the number of successes actually observed, and similarly the predicted number of failures to the number actually observed. We have four possible outcomes: True Positives (TP) = the number of cases which were correctly classified as positive, i.e. …
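
As a rough illustration of the four outcomes, the following Python sketch counts them from paired observed and predicted values. The excerpt above is cut off after TP, so the other three names (FP, FN, TN) follow the standard definitions rather than the post's wording; the function name and the data are hypothetical.

```python
def classification_table(actual, predicted):
    """Count the four outcomes; 1 = success (positive), 0 = failure (negative)."""
    pairs = list(zip(actual, predicted))
    return {
        "TP": sum(1 for a, p in pairs if a == 1 and p == 1),  # correctly predicted positive
        "FP": sum(1 for a, p in pairs if a == 0 and p == 1),  # predicted positive, observed negative
        "FN": sum(1 for a, p in pairs if a == 1 and p == 0),  # predicted negative, observed positive
        "TN": sum(1 for a, p in pairs if a == 0 and p == 0),  # correctly predicted negative
    }


# Hypothetical observations and predictions, not the post's Example 1.
actual = [1, 1, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
table = classification_table(actual, predicted)
print(table)
```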

Statistics for Beginners with Excel – Dot Plots

(Basic Statistics for Citizen Data Scientist) Dot Plots A Dot Plot is another way to view data graphically. A dot plot is somewhat similar to a box plot, except that instead of summarizing the data in each group (the brands in Example 1 of Box Plots), the actual data values are plotted. Real Statistics Data Analysis Tool: To …

Statistics for Beginners with Excel – Creating Box Plots

(Basic Statistics for Citizen Data Scientist) Creating Box Plots in Excel Another way to characterize a distribution or a sample is via a box plot (aka a box and whiskers plot). Specifically, a box plot provides a pictorial representation of the following statistics: maximum, 75th percentile, median (50th percentile), mean, 25th percentile and minimum. Box plots are especially useful when comparing samples …
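
The statistics that a box plot displays can be computed directly, as in this small Python sketch. It assumes the "inclusive" percentile convention, which may differ from the convention used in the post; the function name and sample data are made up for illustration.

```python
import statistics


def box_plot_summary(data):
    """Return the six statistics a box plot represents.

    Assumption: quartiles use the inclusive method, which interpolates
    between the sample minimum and maximum.
    """
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "max": max(data),
        "q3": q3,          # 75th percentile
        "median": median,  # 50th percentile
        "mean": statistics.mean(data),
        "q1": q1,          # 25th percentile
        "min": min(data),
    }


# Hypothetical sample.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```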

Statistics for Beginners with Excel – Histograms

(Basic Statistics for Citizen Data Scientist) Histograms A histogram is a graphical representation of the output of the FREQUENCY function (as described in Frequency Tables). Example 1: Create a histogram for the data and bin selection for Example 1 from Frequency Tables. We start by replicating the data and bin section for Example 1 in Figure 1. …
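
For readers who want to see what the FREQUENCY output looks like outside Excel, here is a small Python sketch that mimics its binning rule: each bin counts the values greater than the previous bin limit and less than or equal to its own limit, and one extra count holds everything above the last limit. The function name and data are hypothetical, and the bin limits are assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin in the style of Excel's FREQUENCY.

    counts[i] holds values with bins[i-1] < x <= bins[i];
    the final entry holds values greater than bins[-1].
    Assumption: bins is sorted in ascending order.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        counts[bisect_left(bins, x)] += 1
    return counts


# Hypothetical data and bin limits, not the post's Figure 1.
counts = frequency([1, 3, 4, 5, 7, 9, 11], bins=[4, 8])
print(counts)
```

A histogram is then a column chart of these counts, one column per bin.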

Statistics for Beginners with Excel – Frequency Tables

(Basic Statistics for Citizen Data Scientist) Frequency Tables Often data is presented in the form of a frequency table. For example, the data in range A4:A11 of Figure 1 can be expressed by the frequency table in range C4:D7. Figure 1 – Frequency Table The table in Figure 1 shows that the data element 2 occurs …
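
A frequency table simply pairs each distinct data value with the number of times it occurs. The Python sketch below shows one way to build such a table; the data is invented for illustration and is not the data in Figure 1.

```python
from collections import Counter


def frequency_table(data):
    """Return (value, count) pairs sorted by value."""
    return sorted(Counter(data).items())


# Hypothetical data set of eight elements.
freq = frequency_table([2, 4, 3, 2, 5, 4, 2, 3])
for value, count in freq:
    print(value, count)
```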

Statistics for Beginners with Excel – Descriptive Statistics Tools

(Basic Statistics for Citizen Data Scientist) Descriptive Statistics Tools Excel provides a data analysis tool called Descriptive Statistics which produces a summary of the key statistics for a data set. Example 1: Provide a table of the most common descriptive statistics for the scores in column A of Figure 1. Figure 1 – Output from Descriptive Statistics data …

Statistics for Beginners – Ranking Functions in Excel

(Basic Statistics for Citizen Data Scientist) Ranking Functions in Excel Excel Functions: Figure 1 summarizes the various ranking functions available in all versions of Excel for a data set R1. We describe each of these functions in more detail in the rest of this webpage, plus we describe additional ranking functions that are only available …

Statistics for Beginners – Symmetry, Skewness and Kurtosis

(Basic Statistics for Citizen Data Scientist) Symmetry, Skewness and Kurtosis We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. The data set can represent either the population being studied or a sample drawn from the population. Looking at S as representing a distribution, the skewness of S is a measure of symmetry while kurtosis is …