Month: January 2019

How to check model’s accuracy using Cross Validation in Python

When building a machine learning model, it’s important to evaluate its accuracy to make sure it performs well. One technique for doing this is cross-validation, which tests the model’s accuracy by dividing the data into several parts, training …
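As a minimal sketch of the idea in this excerpt (the dataset, the `LogisticRegression` model, and the choice of 5 folds are assumptions for illustration, not taken from the post), scikit-learn's `cross_val_score` splits the data into parts, trains on all but one part, and scores on the part held out:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumed example data and model; any estimator with fit/predict would do.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 splits the data into 5 folds; each fold is held out once for scoring.
scores = cross_val_score(model, X, y, cv=5)

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

The mean of the per-fold scores is usually reported together with their spread, since a single split can be unrepresentative.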

How to create TRAIN and TEST dataset using sklearn and Python

When working with machine learning, it’s important to split the data into training and testing sets so that you can evaluate the performance of your model on data it has not seen. This is known as a train-test split, and it’s a common practice in machine learning. In Python, the library scikit-learn …
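A short sketch of a train-test split with scikit-learn's `train_test_split` might look like the following. The iris dataset, the 80/20 ratio, and the `random_state` value are assumptions chosen for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumed example data: 150 samples with 4 features each.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```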

How to do variance thresholding in Python for feature selection

When working with large datasets, it often helps to select the features that contribute most to a model’s predictions. One technique for doing this is called variance thresholding. In Python, variance thresholding can be performed using the library scikit-learn. The first …
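The sketch below shows one way to apply scikit-learn's `VarianceThreshold`. The toy array and the threshold of `0.0` (which drops only constant columns) are assumptions for illustration:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy data (assumed): the first column is constant, so its variance is zero.
X = np.array([
    [0, 2.0, 1],
    [0, 1.0, 3],
    [0, 3.0, 5],
    [0, 4.0, 7],
])

# Keep only features whose variance is strictly greater than the threshold.
selector = VarianceThreshold(threshold=0.0)
X_reduced = selector.fit_transform(X)

print(selector.get_support())  # boolean mask of the features that were kept
print(X_reduced.shape)
```

Because variance depends on scale, features measured in different units are usually brought to a comparable scale before a non-zero threshold is chosen.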

How to do recursive features elimination in Python using DecisionTreeRegressor

Recursive feature elimination (RFE) is a technique used in machine learning to determine the most important features in a dataset. This is done by iteratively removing the least important feature until a certain number of features is reached. In Python, one can use the library …
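To illustrate the iterative removal described above, here is a minimal sketch using scikit-learn's `RFE` with a `DecisionTreeRegressor`. The synthetic dataset and the target of 3 features are assumptions, not values from the post:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumed): 8 features, of which 3 actually drive the target.
X, y = make_regression(
    n_samples=200, n_features=8, n_informative=3, random_state=0
)

# step=1 removes one feature per iteration, ranked by the tree's importances.
rfe = RFE(
    estimator=DecisionTreeRegressor(random_state=0),
    n_features_to_select=3,
    step=1,
)
rfe.fit(X, y)

print("Selected mask:", rfe.support_)
print("Ranking (1 = kept):", rfe.ranking_)
X_selected = rfe.transform(X)
```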

How to drop out highly correlated features in Python

In machine learning, correlated features can cause problems because they provide redundant information to the model. Having too many correlated features can also increase the risk of overfitting. One way to deal with correlated features is to drop some of them. This process is called …
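One common way to drop correlated features, sketched here with pandas, is to scan the upper triangle of the absolute correlation matrix and drop any column that correlates strongly with an earlier one. The toy columns `a`, `b`, `c` and the cutoff of 0.9 are assumptions for illustration, and this may not be the exact approach the post takes:

```python
import numpy as np
import pandas as pd

# Toy data (assumed): "b" is almost a linear copy of "a"; "c" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.01, size=200),
    "c": rng.normal(size=200),
})

# Absolute pairwise correlations, keeping only the upper triangle so that
# each pair is examined once and the diagonal is ignored.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop every column that correlates above the cutoff with an earlier column.
threshold = 0.9  # assumed cutoff
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
reduced = df.drop(columns=to_drop)

print("Dropped:", to_drop)
```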

How to select features using chi-squared in Python

The Chi-Squared test is a statistical test that can be used to select features for a machine learning model. It tests the independence of two categorical variables by comparing their observed frequencies to the frequencies expected if they were independent. In Python, the Chi-Squared …
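As a hedged sketch, scikit-learn's `SelectKBest` can be combined with the `chi2` score function to keep the highest-scoring features. The iris dataset and `k=2` are assumptions for illustration. Note that `chi2` requires non-negative feature values and is intended for counts or frequencies, so the continuous iris measurements serve here only as a convenient non-negative example:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

# Assumed example data; all feature values are non-negative, as chi2 requires.
X, y = load_iris(return_X_y=True)

# Score each feature against the class labels and keep the 2 best.
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)

print("Chi-squared scores:", selector.scores_)
print("Kept feature indices:", selector.get_support(indices=True))
```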

How to reduce dimensionality using PCA in Python

Principal Component Analysis (PCA) is a technique for dimensionality reduction that is commonly used in machine learning and data analysis. It works by identifying the directions (principal components) along which the data has the largest variance and projecting the data onto these directions. By doing so, it …
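The projection described above can be sketched with scikit-learn's `PCA`. The iris dataset, the standardization step, and the choice of 2 components are assumptions for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumed example data: 150 samples with 4 features.
X, _ = load_iris(return_X_y=True)

# Standardize first so that no feature dominates purely because of its units.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the 2 directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports the share of total variance each component retains, which helps when deciding how many components to keep.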

How to determine Spearman’s correlation in Python

Spearman’s correlation, also known as rank correlation, is a statistical method that is used to measure the strength and direction of a monotonic relationship between two variables. It ranges from -1 to 1, where -1 indicates a perfect negative correlation, 0 indicates no correlation, and 1 indicates a perfect positive …
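A minimal sketch with `scipy.stats.spearmanr` shows both ends of that range. The sample values are made up for illustration: `y_up` grows with `x` non-linearly but monotonically, and `y_down` is its reverse:

```python
from scipy.stats import spearmanr

# Made-up data: y_up increases whenever x increases, though not linearly.
x = [1, 2, 3, 4, 5]
y_up = [1, 4, 9, 16, 25]
y_down = [25, 16, 9, 4, 1]

# Spearman's coefficient compares ranks, so any strictly monotonic
# relationship gives a coefficient of exactly +1 or -1.
rho_up, p_up = spearmanr(x, y_up)
rho_down, p_down = spearmanr(x, y_down)

print("Increasing relationship:", rho_up)
print("Decreasing relationship:", rho_down)
```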