How to normalise COLUMN in Pandas DataFrame in Python

Normalizing a column in a Pandas DataFrame in Python can be done by scaling the values in the column to be between 0 and 1. This is also known as min-max scaling.
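As a quick illustration of the arithmetic, min-max scaling maps each value x to (x - min) / (max - min). The sketch below applies that formula to a small list of made-up numbers; the values are illustrative only and are not part of the example DataFrame used later.

```python
# Hypothetical sample values, chosen only to make the arithmetic easy to follow
values = [10, 20, 30, 50]

lo, hi = min(values), max(values)

# Min-max scaling: the smallest value maps to 0, the largest to 1
scaled = [(v - lo) / (hi - lo) for v in values]

print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```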

First, you need to import the Pandas library and create a DataFrame. For example, you can create a DataFrame with random numbers using the numpy library.


import pandas as pd

import numpy as np

df = pd.DataFrame({'data': np.random.randint(0, 100, size=100)})

To normalize a column, you can use the MinMaxScaler class from the scikit-learn (sklearn) library. By default it scales the values of each column to be between 0 and 1. Import MinMaxScaler from sklearn.preprocessing first, then create an instance of it.


from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

Then select the column you want to normalize, reshape it into a 2D array (scikit-learn expects input of shape (n_samples, n_features)), and fit the scaler on it.


x = df['data'].values.reshape(-1, 1)

scaler.fit(x)


Finally, apply the transformation to the column with the scaler's transform() method and assign the result back to the DataFrame.


df['data'] = scaler.transform(x)
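The fit and transform steps can also be combined. The following is a minimal sketch, assuming scikit-learn is installed; it uses fit_transform() on a one-column DataFrame selected with double brackets, which avoids the manual reshape. The seed and the use of a seeded generator are choices made here for reproducibility, not part of the original example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Seeded generator so the sketch is reproducible (the seed is arbitrary)
rng = np.random.default_rng(0)
df = pd.DataFrame({'data': rng.integers(0, 100, size=100)})

scaler = MinMaxScaler()

# Double brackets select a one-column DataFrame, which is already 2D
scaled = scaler.fit_transform(df[['data']])

# fit_transform returns an array of shape (n, 1); flatten it before assigning
df['data'] = scaled.ravel()

print(df['data'].min(), df['data'].max())
```

Keeping the fitted scaler around is useful if the same scaling has to be applied to new data later, since transform() reuses the minimum and maximum learned during fitting.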


Alternatively, you can skip scikit-learn and apply the min-max formula directly with pandas: subtract the column's minimum from every value and divide by the column's range (maximum minus minimum).


df['data'] = (df['data'] - df['data'].min()) / (df['data'].max() - df['data'].min())
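One caveat with the plain formula is that a column whose values are all identical has a range of zero, so the division produces NaN. The sketch below wraps the formula in a small helper that guards against this case. The name min_max_normalize is hypothetical (it is not a pandas function), and returning zeros for a constant column is an assumption; choose whatever fallback suits your data.

```python
import pandas as pd

def min_max_normalize(col: pd.Series) -> pd.Series:
    """Scale a numeric Series to [0, 1]; hypothetical helper, not a pandas API."""
    col_min, col_max = col.min(), col.max()
    span = col_max - col_min
    if span == 0:
        # Constant column: the formula would divide by zero, so return zeros
        # (this fallback is an assumption, not a standard convention)
        return pd.Series(0.0, index=col.index, name=col.name)
    return (col - col_min) / span

# Made-up data: one ordinary column and one constant column
df = pd.DataFrame({'data': [5, 10, 15, 25], 'flat': [7, 7, 7, 7]})
df['data'] = min_max_normalize(df['data'])
df['flat'] = min_max_normalize(df['flat'])

print(df)
```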


Either way, the values in the “data” column now lie between 0 and 1. You can repeat the same process for other columns in your DataFrame as well. Normalizing columns is a common step in data preprocessing for machine learning: putting features on a comparable scale keeps large-valued columns from dominating distance-based methods and often helps gradient-based training converge faster.
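To repeat the process for several columns at once, the same formula can be applied to a selection of columns, because pandas computes min() and max() per column. This is a minimal sketch with a made-up DataFrame; the column names height and weight are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame with two numeric columns on different scales
df = pd.DataFrame({'height': [150, 160, 170, 190],
                   'weight': [50, 60, 80, 90]})

cols = ['height', 'weight']

# min() and max() are computed per column, and the arithmetic aligns on column names
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

print(df)
```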

