Pandas Example – Write a Pandas program to remove the HTML tags within the specified column of a given DataFrame

(Python Example for Beginners)

Write a Pandas program to remove the HTML tags within the specified column of a given DataFrame.

Sample Solution:

Python Code:

import pandas as pd
import re

df = pd.DataFrame({
    'company_code': ['Abcd','EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002','16/02/1999','05/09/1998','12/02/2022','15/09/1997'],
    'address': ['9910 Surrey <b>Avenue</b>','92 N. Bishop Avenue','9910 <br>Golden Star Avenue', '102 Dunbar <i></i>St.', '17 West Livingston Court']
})

print("Original DataFrame:")
print(df)

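def remove_tags(string):
    # Non-greedy pattern: delete each '<' through the nearest following '>'
    return re.sub(r'<.*?>', '', string)

# Add the cleaned text as a new column; the original 'address' column is kept
df['with_out_tags'] = df['address'].apply(remove_tags)

print("\nSentences without tags':")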
print(df)

Sample Output:

Original DataFrame:
  company_code             ...                                   address
0         Abcd             ...                 9910 Surrey <b>Avenue</b>
1         EFGF             ...                       92 N. Bishop Avenue
2      zefsalf             ...               9910 <br>Golden Star Avenue
3      sdfslew             ...                     102 Dunbar <i></i>St.
4      zekfsdf             ...                  17 West Livingston Court

[5 rows x 3 columns]

Sentences without tags':
  company_code            ...                        with_out_tags
0         Abcd            ...                   9910 Surrey Avenue
1         EFGF            ...                  92 N. Bishop Avenue
2      zefsalf            ...              9910 Golden Star Avenue
3      sdfslew            ...                       102 Dunbar St.
4      zekfsdf            ...             17 West Livingston Court

[5 rows x 4 columns]
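
The same cleanup can also be written without apply, using the vectorized string accessor. The sketch below is an alternative, not part of the original solution. It passes regex=True explicitly, since newer pandas versions treat the pattern as a literal string by default, and the column name address_clean and the None entry are made up for illustration. Missing values pass through as missing, whereas calling re.sub on them through apply would raise a TypeError.

```python
import pandas as pd

# Small illustrative frame; the None entry stands in for a missing address.
df = pd.DataFrame({
    'address': ['9910 Surrey <b>Avenue</b>',
                '9910 <br>Golden Star Avenue',
                '102 Dunbar <i></i>St.',
                None]
})

# Vectorized equivalent of re.sub('<.*?>', '', s) applied to every row.
# regex=True is passed explicitly so the pattern is not taken literally.
df['address_clean'] = df['address'].str.replace(r'<.*?>', '', regex=True)

print(df['address_clean'].tolist())
```

Like the original, this pattern only strips simple tags; it is not a full HTML parser and can misfire on text that contains a literal '<' followed later by a '>'.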

 



Two Machine Learning Fields

There are two sides to machine learning:

  • Practical Machine Learning: This is about querying databases, cleaning data, writing scripts to transform data, gluing algorithms and libraries together, and writing custom code to squeeze reliable answers from data to satisfy difficult and ill-defined questions. It’s the mess of reality.
  • Theoretical Machine Learning: This is about math and abstraction and idealized scenarios and limits and beauty and informing what is possible. It is a whole lot neater and cleaner and removed from the mess of reality.