How to Normalise a Column in a Pandas DataFrame in Python


Normalizing a column in a Pandas DataFrame in Python can be done by scaling the values in the column to be between 0 and 1. This is also known as min-max scaling.

First, import the Pandas library and create a DataFrame. For example, you can fill a DataFrame with random integers using the NumPy library.


import pandas as pd

import numpy as np

df = pd.DataFrame({'data': np.random.randint(0, 100, size=100)})

To normalize a column, you can use the MinMaxScaler class from the scikit-learn library, which by default scales the values of a column to the range 0 to 1. Import MinMaxScaler from sklearn.preprocessing first, then create a scaler instance.


from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

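Then select the column you want to normalize, reshape it into the two-dimensional array that scikit-learn expects, and fit the scaler on it. Fitting records the column's minimum and maximum, and transform() cannot be called before it.


x = df['data'].values.reshape(-1, 1)

scaler.fit(x)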


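Finally, apply the transformation to the column using the scaler's transform() method. The result is a two-dimensional array, so flatten it before assigning it back to the column.


df['data'] = scaler.transform(x).ravel()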
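The scikit-learn steps above can also be combined into one short script. The following is a minimal sketch, assuming scikit-learn is installed. The seed, the new column name data_scaled, and the use of fit_transform() (which fits and transforms in a single call) are choices made for this illustration, not part of the recipe above.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame({'data': rng.integers(0, 100, size=100)})

scaler = MinMaxScaler()

# Double brackets select a one-column DataFrame, which is already 2-D,
# so no reshape is needed before passing it to scikit-learn.
scaled = scaler.fit_transform(df[['data']])

# fit_transform returns an array of shape (n, 1); flatten it into a column.
# 'data_scaled' is a hypothetical name that keeps the original column intact.
df['data_scaled'] = scaled.ravel()

print(df['data_scaled'].min(), df['data_scaled'].max())
```

Writing the result to a separate column keeps the original values available for comparison; assign back to df['data'] instead if you want to overwrite them.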


Alternatively, you can skip scikit-learn and apply the min-max formula directly with pandas arithmetic, using the column's min() and max() methods.


df['data'] = (df['data'] - df['data'].min()) / (df['data'].max() - df['data'].min())
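To see what this formula does, here is a small worked sketch on a hand-written column. The column name score and its values are made up for illustration, chosen so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Hypothetical four-row column (illustrative values only).
df = pd.DataFrame({'score': [10, 20, 30, 50]})

col_min = df['score'].min()   # 10
col_max = df['score'].max()   # 50

# (x - min) / (max - min): the smallest value maps to 0, the largest to 1.
df['score_scaled'] = (df['score'] - col_min) / (col_max - col_min)

print(df['score_scaled'].tolist())  # [0.0, 0.25, 0.5, 1.0]
```

Note that if every value in the column is identical, max - min is zero and the division produces NaN, so a constant column needs separate handling.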


By doing this, you have normalized the column “data” in the DataFrame to the range 0 to 1. You can repeat the same process for other columns in your DataFrame as well. Normalizing columns is an important step in data preprocessing, especially when you’re working with machine learning algorithms. Many algorithms, such as those trained with gradient descent or based on distances between points, are sensitive to the scale of their inputs, so bringing features onto a common range can help them converge faster and perform better.
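Repeating the process for other columns does not require a loop, since pandas can scale several columns in one vectorised step. The sketch below assumes a DataFrame with hypothetical numeric columns height and weight plus a text column name; all names and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'height': [150, 160, 170, 180],
    'weight': [50, 60, 80, 90],
    'name': ['a', 'b', 'c', 'd'],   # non-numeric column, left untouched
})

cols = ['height', 'weight']  # the columns to normalize

# min() and max() on a DataFrame return one value per column, and pandas
# aligns them by column label, so each column is scaled by its own range.
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

print(df)
```

Listing the columns explicitly keeps non-numeric columns such as name out of the calculation.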

