How to do string munging in a Pandas DataFrame in Python

String munging is the part of data cleaning (also called data wrangling) that modifies and standardizes string values, here the string columns of a Pandas DataFrame in Python.

First, you need to import the Pandas library and create a DataFrame. For example, you can create a DataFrame with some sample data that contains string values.

import pandas as pd

# Sample data: every column holds string values
data = {'product': ['Apple', 'Banana', 'Cherry', 'Date', 'Eggplant'],
        'product_code': ['A1001', 'B1002', 'C1003', 'D1004', 'E1005'],
        'description': ['Red Delicious', 'Cavendish', 'Bing', 'Medjool', 'Black Beauty']}

df = pd.DataFrame(data)

Next, you can use various string munging techniques to modify and manipulate the string values in the DataFrame.

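For example, you can use the str.upper() or str.lower() method to convert every string in a column to uppercase or lowercase. Both are reached through the column's .str accessor and return a new Series, which is assigned back to the column here:

df['product'] = df['product'].str.upper()
df['description'] = df['description'].str.lower()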
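As a small self-contained check of the case conversions above, the sketch below rebuilds a two-row version of the sample data. The names sample, upper_products and lower_descriptions are illustrative assumptions, not part of the original recipe. It shows that the .str methods return new Series and leave the source column untouched until you assign the result back.

```python
import pandas as pd

# Hypothetical two-row subset of the tutorial's sample data
sample = pd.DataFrame({'product': ['Apple', 'Banana'],
                       'description': ['Red Delicious', 'Cavendish']})

# Each call returns a new Series; nothing is modified in place
upper_products = sample['product'].str.upper()
lower_descriptions = sample['description'].str.lower()

print(upper_products.tolist())      # ['APPLE', 'BANANA']
print(lower_descriptions.tolist())  # ['red delicious', 'cavendish']
print(sample['product'].tolist())   # ['Apple', 'Banana'] (original unchanged)
```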

You can also use the str.len() method to get the number of characters in each string:

df['description_length'] = df['description'].str.len()
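Real data often contains missing values, which the sample data above does not. The following sketch, using a hypothetical descriptions Series with one missing entry, suggests how str.len() behaves in that case: the missing entry yields a missing length instead of raising an error.

```python
import pandas as pd

# Hypothetical Series with one missing value (not in the original sample data)
descriptions = pd.Series(['red delicious', 'cavendish', 'bing', None])

lengths = descriptions.str.len()

# Spaces count as characters: 'red delicious' has 13
print(lengths.iloc[0])
# The missing entry stays missing rather than becoming 0
print(pd.isna(lengths.iloc[3]))  # True
```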

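You can also use the str.replace() method to replace specific characters or substrings within a string. Every occurrence is replaced, so 'A1001' becomes 'A2002'. Passing regex=False makes it explicit that '1' is treated as literal text rather than a regular expression:

df['product_code'] = df['product_code'].str.replace('1', '2', regex=False)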
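To make the behavior of str.replace() concrete, here is a minimal sketch on a hypothetical two-element codes Series. It contrasts replacing every occurrence, replacing only the first occurrence with the n parameter, and an explicit regular-expression replacement with regex=True. The variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical subset of the product codes
codes = pd.Series(['A1001', 'B1002'])

# Literal replacement of every '1'
all_replaced = codes.str.replace('1', '2', regex=False)
print(all_replaced.tolist())   # ['A2002', 'B2002']

# Literal replacement of only the first '1' in each string
first_only = codes.str.replace('1', '2', n=1, regex=False)
print(first_only.tolist())     # ['A2001', 'B2002']

# Regular-expression replacement: strip the leading letter
digits_only = codes.str.replace(r'^[A-Z]', '', regex=True)
print(digits_only.tolist())    # ['1001', '1002']
```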

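You can also use the str.startswith() or str.endswith() method to check whether a string starts or ends with a specific character or substring. Both checks are case-sensitive. Because the product column was converted to uppercase earlier, the suffix to test for is 'T' rather than 't':

df['starts_with_A'] = df['product'].str.startswith('A')
df['ends_with_T'] = df['product'].str.endswith('T')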
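Because these checks are case-sensitive, the result depends on any case conversion applied beforehand. The sketch below uses a hypothetical products Series that has already been uppercased, as the product column was earlier, and shows two ways to get a match: test for the uppercase suffix, or normalize the case first.

```python
import pandas as pd

# Hypothetical uppercased product names
products = pd.Series(['Apple', 'Eggplant']).str.upper()

# A lowercase suffix never matches uppercased strings
lower_suffix = products.str.endswith('t')
print(lower_suffix.tolist())   # [False, False]

# Option 1: test for the uppercase suffix
upper_suffix = products.str.endswith('T')
print(upper_suffix.tolist())   # [False, True]

# Option 2: normalize the case first, then test
normalized = products.str.lower().str.endswith('t')
print(normalized.tolist())     # [False, True]
```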

By using these string munging techniques, you can easily modify and manipulate string values within a Pandas DataFrame in Python. This can be useful for data cleaning and preprocessing, as it allows you to standardize the data and make it easier to work with.
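One possible way to package these steps for reuse is a small function that returns a cleaned copy instead of overwriting columns one at a time. This is only a sketch: the function name munge_products and the choice of DataFrame.assign are assumptions, not something the recipe prescribes.

```python
import pandas as pd

def munge_products(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy; the input frame is left untouched."""
    upper_product = frame['product'].str.upper()
    return frame.assign(
        product=upper_product,
        description=frame['description'].str.lower(),
        description_length=frame['description'].str.len(),
        starts_with_A=upper_product.str.startswith('A'),
    )

# Hypothetical two-row subset of the sample data
raw = pd.DataFrame({'product': ['Apple', 'Banana'],
                    'description': ['Red Delicious', 'Cavendish']})
clean = munge_products(raw)
print(clean)
```

Keeping the raw frame unchanged makes it easier to rerun or adjust the cleaning steps without rebuilding the data.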

 
