Tag Archives: tabular data analytics

Beginner’s Project on Binary Classification in Python – Sonar Dataset

Binary classification is a type of machine learning problem where the goal is to classify instances into one of two classes. The Sonar dataset is a popular dataset for binary classification problems. It is used to distinguish between metal cylinders and rocks from a sonar …

Beginner’s Project on Multi-Class Classification in Python

Multi-class classification is a type of machine learning problem where the goal is to classify instances into one of three or more classes. This is different from binary classification, where the goal is to classify instances into one of two classes. There are several algorithms that can be used for …

How to create a pipeline that standardizes the data and create model

Creating a pipeline that standardizes the data and creates a model is a common task in machine learning. A pipeline is a sequence of steps that are executed in order to accomplish a certain task. In this case, the pipeline will be used …
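A minimal sketch of such a pipeline, assuming scikit-learn is installed. The Iris dataset, the `LogisticRegression` model, and the step names `"scaler"` and `"model"` are illustrative choices and are not taken from the post itself.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (an assumption for illustration): 150 samples, 4 features.
X, y = load_iris(return_X_y=True)

# Step 1 standardizes the features, step 2 fits the model.
# Inside cross-validation the scaler is fitted on each training fold only,
# so no statistics leak from the held-out fold.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(pipeline, X, y, cv=5)
mean_accuracy = scores.mean()
print(f"Mean cross-validated accuracy: {mean_accuracy:.3f}")
```

Wrapping both steps in one `Pipeline` object means the same preprocessing is applied automatically whenever the model is fitted or used for prediction.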

Compare Machine Learning Algorithms with IRIS Dataset

Comparing machine learning algorithms with the IRIS dataset in Python is a common task in machine learning, as it allows you to evaluate the performance of different algorithms on a known dataset and choose the best one for a specific problem. The IRIS dataset is a popular dataset for …
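One way such a comparison could look, as a sketch assuming scikit-learn. The three candidate models, the 10-fold split, and the `random_state` values are assumptions made for illustration, not the post's own selection.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate algorithms (an illustrative selection).
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighbors": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=7),
}

# Use the same folds for every model so the scores are comparable.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}

# Print the models from best to worst mean accuracy.
for name, score in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```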

Applied Data Science Coding in Python: Feature Extraction with Univariate Statistics

“Feature Extraction with Univariate Statistics” is a technique used in Python to select the most informative features from a dataset. It is based on statistical tests for a relationship between each feature and the target variable. The feature selection is …
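A short sketch of univariate feature selection, assuming scikit-learn's `SelectKBest` with the ANOVA F-test (`f_classif`). The Iris dataset and the choice of `k=2` are assumptions for illustration; the post may use a different test or dataset.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

iris = load_iris()
X, y = iris.data, iris.target

# Score each feature independently against the target and keep the best two.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Report the per-feature score and whether the feature was kept.
for name, score, kept in zip(iris.feature_names, selector.scores_, selector.get_support()):
    print(f"{name}: F = {score:.1f}, kept = {kept}")

selected_indices = list(selector.get_support(indices=True))
```

Because each feature is tested on its own, this approach is fast but ignores interactions between features.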

Applied Data Science Coding in Python: How to normalise data

Normalizing data is a way of scaling the data so that it falls within a specific range. The most commonly used range is between 0 and 1. Normalizing data is useful in situations where the scale of the data can affect the performance of a …
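As a sketch of rescaling into the range 0 to 1, the example below assumes scikit-learn's `MinMaxScaler`; the small array is made-up illustration data. Note that scikit-learn's separately named `Normalizer` does something different (it rescales each row to unit norm).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two features on very different scales (hypothetical values).
data = np.array([
    [1.0, 200.0],
    [2.0, 400.0],
    [3.0, 600.0],
])

# Each column is mapped to [0, 1] via (x - min) / (max - min).
scaler = MinMaxScaler(feature_range=(0, 1))
normalized = scaler.fit_transform(data)
print(normalized)
```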

Applied Data Science Coding in Python: How to standardize Data

Standardizing data in Python using scikit-learn is a way to bring all of the features in your dataset to the same scale. This is useful because some machine learning models can be sensitive to the scale of the input features. The process of standardizing involves …
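A minimal sketch using scikit-learn's `StandardScaler`, which rescales each feature to zero mean and unit variance. The input array is made-up illustration data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (hypothetical values).
data = np.array([
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 500.0],
    [4.0, 700.0],
])

# Each column becomes (x - column mean) / column standard deviation.
scaler = StandardScaler()
standardized = scaler.fit_transform(data)

print("means:", standardized.mean(axis=0))
print("standard deviations:", standardized.std(axis=0))
```

In practice the scaler would typically be fitted on the training data only and then reused to transform the test data via `scaler.transform`.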

Applied Data Science Coding in Python: How to do Binarization

Binarization is the process of converting a continuous or numeric variable into a binary variable. The binary variable can take on only two values, for example, 0 and 1, true and false, or yes and no. This process is often used in machine learning and …
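A small sketch of thresholding with scikit-learn's `Binarizer`. The threshold of 1.0 and the input values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical continuous values.
data = np.array([
    [0.2, 1.5],
    [3.0, 0.0],
])

# Values strictly greater than the threshold become 1, all others become 0.
binarizer = Binarizer(threshold=1.0)
binary = binarizer.transform(data)
print(binary)
```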

Applied Data Science Coding in Python: How to generate density plots

Density plots, also known as probability density plots, are used to visualize the probability density function of a continuous random variable. They give an idea of the distribution of the data and help to identify patterns, such as skewness or outliers. In Python, there …
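One possible way to draw a density plot, sketched with a Gaussian kernel density estimate from SciPy and matplotlib. The random sample, the grid size, and the output filename `density_plot.png` are all assumptions for illustration; the post may use a different library.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical sample: 500 draws from a standard normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=500)

# Estimate the density and evaluate it on an evenly spaced grid.
kde = gaussian_kde(sample)
grid = np.linspace(sample.min() - 1, sample.max() + 1, 200)
density = kde(grid)

fig, ax = plt.subplots()
ax.plot(grid, density)
ax.set_xlabel("value")
ax.set_ylabel("estimated density")
fig.savefig("density_plot.png")
plt.close(fig)
```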

Applied Data Science Coding in Python: How to visualise data with Boxplot

A boxplot, also known as a box-and-whisker plot, is a powerful tool for visualizing the distribution of a dataset. It is particularly useful for identifying outliers and understanding the spread and skewness of the data. In Python, the matplotlib library provides several functions …
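A brief sketch of a boxplot with matplotlib's `Axes.boxplot`. The three random groups, their labels, and the output filename `boxplot.png` are hypothetical and used only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Three hypothetical groups with different centres and spreads.
rng = np.random.default_rng(1)
groups = [
    rng.normal(loc=0.0, scale=1.0, size=100),
    rng.normal(loc=2.0, scale=0.5, size=100),
    rng.normal(loc=-1.0, scale=2.0, size=100),
]

fig, ax = plt.subplots()
# One box per group; points beyond the whiskers are drawn as outliers.
box = ax.boxplot(groups)
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["group A", "group B", "group C"])
ax.set_ylabel("value")
fig.savefig("boxplot.png")
plt.close(fig)
```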