Applied Data Science Coding in Python: How to normalise data
Normalizing data is a way of scaling the data so that it falls within a specific range. The most commonly used range is between 0 and 1. Normalizing data is useful in situations where the scale of the data can affect the performance of a machine learning algorithm. The process of normalizing involves transforming the data so that the minimum value is 0 and the maximum value is 1. This can be done by subtracting the minimum value from each data point and then dividing by the range (maximum value – minimum value).
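The subtraction-and-division step described above can be written directly with NumPy. This is only a minimal sketch: the function name min_max_normalize and the sample values are made up for illustration, and the handling of a zero range (all values equal) is an assumption, since the text does not say what should happen in that case.

```python
import numpy as np

def min_max_normalize(values):
    """Rescale a 1-D array so its minimum maps to 0 and its maximum to 1."""
    values = np.asarray(values, dtype=float)
    value_range = values.max() - values.min()
    if value_range == 0:
        # Assumption: if every value is identical, return zeros
        # instead of dividing by zero.
        return np.zeros_like(values)
    # (x - min) / (max - min)
    return (values - values.min()) / value_range

# Hypothetical sample data.
data = [10, 20, 30, 50]
normalized = min_max_normalize(data)
print(normalized)  # values 0.0, 0.25, 0.5 and 1.0
```

Here the minimum (10) becomes 0, the maximum (50) becomes 1, and every other value lands proportionally in between.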
Normalizing can be done using different techniques such as Min-Max normalization, L1 normalization and L2 normalization. Min-Max normalization is the most common of these. To normalize data in Python using scikit-learn, you can use the MinMaxScaler class from the sklearn.preprocessing module. Its fit_transform() method takes the input data as an argument and returns the normalized data.
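As a rough illustration of the scikit-learn approach, the sketch below applies MinMaxScaler to a small made-up array, and also shows L2 normalization with the normalize() function from the same module. The array X is hypothetical toy data. Note that MinMaxScaler rescales each column (feature), whereas normalize() rescales each row (sample) to unit norm.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, normalize

# Hypothetical toy data: two features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

# Min-Max normalization: each column is rescaled to the range [0, 1]
# (the default feature_range of MinMaxScaler).
scaler = MinMaxScaler()
X_minmax = scaler.fit_transform(X)
print(X_minmax)

# L2 normalization: each row is divided by its Euclidean length,
# so every row ends up with an L2 norm of 1.
X_l2 = normalize(X, norm="l2")
print(np.linalg.norm(X_l2, axis=1))
```

After fit_transform(), the fitted scaler keeps the per-column minimum and maximum it learned, so the same scaling can later be applied to new data with scaler.transform().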
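It’s important to note that missing values should be handled before normalization. When the dataset is split into train and test sets, fit the scaler on the training set only and then apply it to the test set, so that no information from the test set leaks into the scaling.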
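A minimal sketch of one way to order these steps, assuming the scaler should learn its minimum and maximum from the training portion only so that the test portion stays unseen. The data, the mean-imputation strategy and the 25% test size are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: one feature with a missing value, plus binary labels.
X = np.array([[1.0], [2.0], [np.nan], [4.0], [5.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Hold out a test set first.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Handle missing values, learning the fill value from the training set only.
imputer = SimpleImputer(strategy="mean")
X_train = imputer.fit_transform(X_train)
X_test = imputer.transform(X_test)

# Fit the scaler on the training set, then reuse it on the test set.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.min(), X_train_scaled.max())
```

With this ordering the scaled training data spans exactly 0 to 1, while scaled test values can fall slightly outside that range if the test set contains values beyond the training minimum or maximum.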
In this Applied Machine Learning & Data Science Recipe, the reader will learn: How to normalise data.