Statistics for Beginners in Excel – ROC and Classification Table Data Analysis Tool

(Basic Statistics for Citizen Data Scientist)

ROC and Classification Table Data Analysis Tool

Real Statistics Data Analysis Tools: The Real Statistics Resource Pack supplies the ROC Curve and Classification Table data analysis tool, which provides an easier way to construct the ROC curve and classification table. We show how this is done for Example 1 of Classification Table and ROC Curve.

The tool accepts two input formats: one with two columns (e.g. B24:C34 of Figure 1 of Classification Table) and another with three columns, which we illustrate in Figure 1.

 

Figure 1 – Data input for ROC (three-column format)

 

The format is similar to that in Figure 1 of ROC Curve except that only the upper bounds of the intervals are shown in column A. These correspond to the bins in the histograms, except that we now have two frequency columns (B and C) instead of just one.
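To make the three-column layout concrete, here is a minimal Python sketch of how binned frequencies can be turned into cumulative proportions, one per upper bound. The numbers are hypothetical (they are not the values in Figure 1), and the names `upper_bounds`, `freq_a`, `freq_b` and `cumulative_rates` are chosen only for this illustration.

```python
from itertools import accumulate

# Hypothetical three-column input (NOT the data from Figure 1):
# column A = upper bound of each bin, columns B and C = frequencies.
upper_bounds = [2, 4, 6, 8, 10, 12]
freq_a = [10, 8, 6, 4, 2, 0]   # first frequency column (assumed: one outcome)
freq_b = [0, 2, 4, 6, 8, 10]   # second frequency column (assumed: other outcome)


def cumulative_rates(freq):
    """Proportion of observations at or below each bin's upper bound."""
    total = sum(freq)
    return [running / total for running in accumulate(freq)]


rate_a = cumulative_rates(freq_a)
rate_b = cumulative_rates(freq_b)

for bound, ra, rb in zip(upper_bounds, rate_a, rate_b):
    print(f"<= {bound:>2}: {ra:.3f}  {rb:.3f}")
```

Each pair of cumulative proportions gives one candidate point on a ROC-style curve; which column plays the role of the true positive rate depends on how the two outcomes are defined.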

To perform the analysis, press Ctrl-m and double-click on the ROC Curve and Classification Table data analysis tool. Fill in the dialog box that appears as shown in Figure 2.

 

Figure 2 – ROC Curve and Classification Table dialog box

 

Note that we choose a cutoff at the 5th row of the data by specifying the upper limit of the failure range, namely 10. Also note that in Example 1 of ROC Curve we estimated the area under the ROC curve (AUC) via rectangles. This time we estimate the AUC using trapezoids instead.
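The difference between the two AUC estimates can be sketched in a few lines of Python. The ROC points below are hypothetical, and the rectangle version assumes each rectangle takes its height from the left endpoint of the interval; the source does not say which endpoint the rectangle method uses.

```python
# Hypothetical ROC points, sorted by increasing false positive rate.
fpr = [0.0, 0.1, 0.3, 0.6, 1.0]
tpr = [0.0, 0.5, 0.8, 0.9, 1.0]


def auc_rectangles(fpr, tpr):
    """Sum of rectangles whose height is the TPR at the left endpoint (assumption)."""
    return sum((fpr[i + 1] - fpr[i]) * tpr[i] for i in range(len(fpr) - 1))


def auc_trapezoids(fpr, tpr):
    """Sum of trapezoids whose height is the average TPR of both endpoints."""
    return sum(
        (fpr[i + 1] - fpr[i]) * (tpr[i] + tpr[i + 1]) / 2
        for i in range(len(fpr) - 1)
    )


rect = auc_rectangles(fpr, tpr)
trap = auc_trapezoids(fpr, tpr)
print(f"rectangles: {rect:.3f}, trapezoids: {trap:.3f}")
```

For a curve that rises from left to right, the left-endpoint rectangles sit under the curve, so the trapezoid estimate is at least as large as the rectangle estimate.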

After clicking on the OK button on the dialog box, the output shown in Figure 3 is produced.

 

Figure 3 – Output from ROC Curve and Classification Table data analysis tool

 

The output also includes the ROC curve shown in Figure 1 of ROC Curve.

The classification table is identical to that shown in Figure 1 of Classification Table. The ROC table is the same as that shown in Figure 1 of ROC Curve, except for the AUC values, which differ slightly because the area under the ROC curve is now estimated via trapezoids instead of rectangles.

Observation: If we use the data from Figure 1 of Classification Table as input, then we would insert B24:C34 (two-column format) into the Input Range field of the dialog box in Figure 2 and set the Cutoff to 5 (the 5th row). The output would be the same as that described above.
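As a rough illustration of what a row-based cutoff does, the following Python sketch splits two hypothetical frequency columns at a 1-based cutoff row and tallies a classification table. It assumes that rows up to and including the cutoff are predicted as failures and the remaining rows as successes; the counts and the names `failures`, `successes` and `classification_table` are made up for this example and are not the values in B24:C34.

```python
# Hypothetical two-column frequencies (NOT the data from B24:C34).
failures = [10, 8, 6, 4, 2, 1, 1]    # observed failures per row
successes = [0, 1, 2, 3, 4, 8, 12]   # observed successes per row
cutoff_row = 5                        # 1-based row number, as in the dialog box


def classification_table(failures, successes, cutoff_row):
    """Tally outcomes, assuming rows 1..cutoff_row are predicted as failures."""
    fail_as_fail = sum(failures[:cutoff_row])    # failures predicted correctly
    fail_as_succ = sum(failures[cutoff_row:])    # failures predicted as successes
    succ_as_fail = sum(successes[:cutoff_row])   # successes predicted as failures
    succ_as_succ = sum(successes[cutoff_row:])   # successes predicted correctly
    total = fail_as_fail + fail_as_succ + succ_as_fail + succ_as_succ
    return {
        "fail_as_fail": fail_as_fail,
        "fail_as_succ": fail_as_succ,
        "succ_as_fail": succ_as_fail,
        "succ_as_succ": succ_as_succ,
        "accuracy": (fail_as_fail + succ_as_succ) / total,
    }


table = classification_table(failures, successes, cutoff_row)
for name, value in table.items():
    print(name, value)
```

Changing `cutoff_row` moves observations between the predicted-failure and predicted-success groups, which is what traces out the different points of the ROC curve.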

Observation: The output shown in Figure 3 also includes the 95% confidence interval for the AUC (range M12:M15). Note that the 95% corresponds to an alpha value of .05 in cell M12 since 95% = 1 – .05. You can change the alpha value in M12 of the output and the corresponding confidence interval will change automatically (e.g. inserting .01 in M12 will generate the 99% confidence interval in M14 and M15).
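The relationship between alpha and the width of the interval can be sketched in Python. The source does not state which standard-error formula the tool uses, so this example assumes the Hanley-McNeil approximation with a normal critical value; the AUC value and sample sizes are hypothetical, and `auc_confidence_interval` is a name invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def auc_confidence_interval(auc, n_pos, n_neg, alpha=0.05):
    """Normal-approximation (1 - alpha) interval for an AUC estimate.

    Uses the Hanley-McNeil standard error (an assumption; the tool's own
    formula is not given in the text).
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc**2)
        + (n_neg - 1) * (q2 - auc**2)
    ) / (n_pos * n_neg)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half_width = z * sqrt(variance)
    return auc - half_width, auc + half_width


# Hypothetical inputs: AUC of 0.8 with 30 observations in each group.
lower95, upper95 = auc_confidence_interval(0.8, 30, 30, alpha=0.05)
lower99, upper99 = auc_confidence_interval(0.8, 30, 30, alpha=0.01)
print(f"95% CI: ({lower95:.3f}, {upper95:.3f})")
print(f"99% CI: ({lower99:.3f}, {upper99:.3f})")
```

Lowering alpha from .05 to .01 raises the critical value, so the 99% interval is wider than the 95% interval around the same AUC estimate.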

 

 
