Tag Archives: descriptive statistics

IRIS Flower Classification using SKLEARN RandomForest Classifier with Monte Carlo Cross Validation

In this Applied Machine Learning & Data Science recipe (Jupyter Notebook), the reader will find a practical Python example: classifying IRIS flowers with a scikit-learn RandomForest classifier, evaluated with Monte Carlo cross-validation. …
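As a rough illustration of the approach named in the title, here is a minimal sketch and not the notebook's actual code. It assumes that "Monte Carlo cross-validation" means repeated random train/test splits, which scikit-learn provides as `ShuffleSplit`. The number of splits, the 70/30 split ratio, the forest size, and the random seeds are illustrative choices, not values taken from the recipe.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Load the four flower measurements (X) and the species labels (y).
X, y = load_iris(return_X_y=True)

# Monte Carlo CV: draw an independent random 70/30 train/test split
# on every iteration (assumed settings, chosen for illustration).
mc_cv = ShuffleSplit(n_splits=20, test_size=0.3, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)

# Fit and score the model once per random split.
scores = cross_val_score(model, X, y, cv=mc_cv, scoring="accuracy")

print(f"splits: {len(scores)}")
print(f"mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

Unlike k-fold cross-validation, the test sets drawn here may overlap between iterations, and the number of iterations can be chosen independently of the test-set size.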

IRIS Flower Classification using SKLEARN DecisionTree Classifier with Monte Carlo Cross Validation

The IRIS flower dataset is a popular example in machine learning. It covers three species of iris: setosa, versicolor, and virginica. In this blog, we will discuss how to classify the IRIS …

IRIS Flower Classification using SKLEARN DecisionTree Classifier with Grid Search Cross Validation

The IRIS flower dataset is a popular example in machine learning. It covers three species of iris: setosa, versicolor, and virginica. In this blog, we will discuss how to classify the …
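For orientation, grid search cross-validation with a decision tree could look like the sketch below. It is an assumption-laden example and not the post's own code: the parameter grid, the 5-fold stratified splitter, and the random seeds are all hypothetical choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Load the four flower measurements (X) and the species labels (y).
X, y = load_iris(return_X_y=True)

# Hypothetical search space: every combination below is evaluated.
param_grid = {
    "max_depth": [2, 3, 4, 5, None],
    "min_samples_split": [2, 5, 10],
    "criterion": ["gini", "entropy"],
}

# Stratified folds keep the three species balanced in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid=param_grid,
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

With the default `refit=True`, `search.best_estimator_` holds a tree refitted on the full dataset using the best parameter combination.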

End-to-End Machine Learning: statistical significance of difference between model predictions

In this Applied Machine Learning & Data Science recipe (Jupyter Notebook), the reader will find a practical example in R: testing the statistical significance of the difference between model predictions. …

End-to-End Machine Learning: model selection in R using summary results

When training multiple machine learning models, it’s important to select the best one to use on new, unseen data. One way to do this is with a tabular method called “summary results”. Summary results are a collection of statistics that summarize the performance of …

Non-Linear Regression in R – gradient boosted machine in R

Non-linear regression is a type of statistical analysis used to model relationships between variables that are not linear, that is, relationships where the change in one variable is not directly proportional to the change in another. …

Data Cleaning in R – remove duplicate values in R

Data cleaning is an important step in the data analysis process, and a common task is identifying and removing duplicate values. Duplicate values can occur for a variety of reasons, such as data entry errors or the same data being collected multiple times. These duplicate …

Visualize Multivariate Data – Scatter Matrix plot in R

A scatter matrix plot, also known as a pair plot, is a useful tool for visualizing the relationships between multiple variables in a dataset. It creates a matrix of scatter plots, with each variable plotted against every other variable. This makes it possible to quickly identify patterns …

Visualize Univariate Data – BOX plot in R

In R, a box plot is a useful tool for visualizing univariate data, that is, data with only one variable. A box plot uses a box to represent the distribution of the data and helps identify potential outliers. To create a box plot …

Summarise Data in R – How to know datatypes in R

In R, it is important to know the data types of the variables in a dataset, as different data types require different kinds of analysis and processing. The most common data types in R are numeric, character, and factor. To check the data types of …