How to Normalise a Column in a Pandas DataFrame in Python
Normalizing a column in a Pandas DataFrame in Python can be done by scaling the values in the column to be between 0 and 1. This is also known as min-max scaling.
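As a quick illustration of the idea, min-max scaling maps each value x to (x - min) / (max - min). The sketch below applies that formula by hand to a small list; the numbers are made up purely for illustration.

```python
# Min-max scaling by hand on a small list of illustrative values.
values = [10, 20, 30, 50]

lo, hi = min(values), max(values)

# Each value is shifted by the minimum and divided by the range.
scaled = [(v - lo) / (hi - lo) for v in values]

print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value always maps to 0 and the largest to 1, with everything else falling proportionally in between.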
First, you need to import the Pandas library and create a DataFrame. For example, you can create a DataFrame with random numbers using the numpy library.
import pandas as pd
import numpy as np
df = pd.DataFrame({'data': np.random.randint(0, 100, size=100)})
To normalize a column, you can use the MinMaxScaler class from the scikit-learn (sklearn) library. By default, it scales each feature to the range 0 to 1. Import MinMaxScaler from sklearn.preprocessing first, then create a scaler instance.
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
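Then select the column you want to normalize and fit the scaler on it. The scaler expects two-dimensional input, so the column's values are reshaped into a single-column array first.
x = df['data'].values.reshape(-1, 1)
scaler.fit(x)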
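Finally, you can apply the transformation to the column using the scaler's transform() method. The result is a two-dimensional array, so it is flattened before being assigned back to the column.
df['data'] = scaler.transform(x).ravel()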
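The fit and transform steps can also be combined. The following is a minimal sketch, assuming scikit-learn is installed; the random seed and the new column name data_scaled are illustrative choices, not part of the original recipe.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Example data; the seed is an arbitrary choice to make the run repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame({'data': rng.integers(0, 100, size=100)})

scaler = MinMaxScaler()

# Double brackets select a one-column DataFrame, which keeps the 2-D shape
# the scaler expects. fit_transform() fits and transforms in one call.
df['data_scaled'] = scaler.fit_transform(df[['data']]).ravel()

# The scaled column should now span roughly 0 to 1.
print(df['data_scaled'].min(), df['data_scaled'].max())
```

Writing the result to a separate column keeps the original values available for comparison; overwriting df['data'] works just as well if you do not need them.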
Alternatively, you can skip sklearn and compute the min-max formula directly with the pandas min() and max() methods. Note that pandas itself has no MinMaxScaler function.
df['data'] = (df['data'] - df['data'].min()) / (df['data'].max() - df['data'].min())
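One caveat with the plain formula is that a column whose values are all identical has a range of zero, so the division is undefined. The sketch below wraps the formula in a small helper; min_max_normalize is a hypothetical name, and returning zeros for a constant column is an assumed convention rather than anything pandas prescribes.

```python
import pandas as pd

def min_max_normalize(col: pd.Series) -> pd.Series:
    """Scale a numeric Series to [0, 1] (hypothetical helper for illustration)."""
    lo, hi = col.min(), col.max()
    if hi == lo:
        # Constant column: the formula would divide by zero, so return zeros.
        # This is an assumed convention; choose whatever suits your data.
        return pd.Series(0.0, index=col.index, name=col.name)
    return (col - lo) / (hi - lo)

df = pd.DataFrame({'data': [5, 10, 15, 25]})
df['data'] = min_max_normalize(df['data'])

print(df['data'].tolist())  # [0.0, 0.25, 0.5, 1.0]
```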
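By doing this, you have now normalized the column “data” in the DataFrame to the range 0 to 1. You can repeat the same process for other columns in your DataFrame as well. Normalizing columns is an important step in data preprocessing, especially when you’re working with machine learning algorithms. Many algorithms, such as those trained with gradient descent or those based on distances between points, are sensitive to the scale of their input features, so putting columns on a common scale can help them perform better and converge faster.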
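Repeating the process for several columns does not require a loop, because pandas applies min() and max() column by column. A minimal sketch, using made-up column names and values:

```python
import pandas as pd

df = pd.DataFrame({
    'height': [150, 160, 170, 190],
    'weight': [50, 60, 80, 100],
    'name': ['a', 'b', 'c', 'd'],  # non-numeric column, left untouched
})

# Only the listed numeric columns are normalized.
cols = ['height', 'weight']

# min() and max() return one value per column, and the arithmetic
# is aligned on column labels, so each column uses its own range.
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

print(df)
```

Each column is scaled using its own minimum and maximum, so the columns end up on the same 0 to 1 scale regardless of their original units.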