Pandas Example – Write a Pandas program to split a given dataset into two labeled ranges using group by on a specified column

(Python Example for Beginners)

 

Write a Pandas program to split a given dataset into two labeled ranges, using group by on a specified column.

Split the group on ‘salesman_id’ into two ranges:
1) (5001..5006)
2) (5007..5012)

 

Test Data:

    salesman_id  sale_jan
0          5001    150.50
1          5002    270.65
2          5003     65.26
3          5004    110.50
4          5005    948.50
5          5006   2400.60
6          5007   1760.00
7          5008   2983.43
8          5009    480.40
9          5010   1250.45
10         5011     75.29
11         5012   1045.60   

 

Sample Solution:

Python Code :


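import pandas as pd
import numpy as np

# Show every row and column when printing
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

df = pd.DataFrame({
    'salesman_id': [5001, 5002, 5003, 5004, 5005, 5006, 5007, 5008, 5009, 5010, 5011, 5012],
    'sale_jan': [150.5, 270.65, 65.26, 110.5, 948.5, 2400.6, 1760, 2983.43, 480.4, 1250.45, 75.29, 1045.6]})

print("Original Orders DataFrame:")
print(df)

# pd.cut assigns each salesman_id to a right-inclusive bin:
# (0, 5006] -> 'S1' and (5006, inf] -> 'S2'; sale_jan is then summed per label.
# observed=False keeps the current behaviour explicit for categorical groupers.
result = df.groupby(pd.cut(df['salesman_id'],
                  bins=[0, 5006, np.inf],
                  labels=['S1', 'S2']), observed=False)['sale_jan'].sum().reset_index()

print("\nGroupBy with condition of  two labels and ranges:")
print(result)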
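
The solution above uses open-ended bin edges (0 and infinity), so any id up to 5006 lands in S1 and any larger id lands in S2. As a variation, the sketch below uses bin edges that match the two stated ranges exactly. The edges 5000, 5006 and 5012 are an assumption chosen for illustration: because pd.cut intervals are right-inclusive by default, they give (5000, 5006] and (5006, 5012], and any id outside those ranges would get no label and drop out of the sums.

```python
import pandas as pd

df = pd.DataFrame({
    'salesman_id': list(range(5001, 5013)),
    'sale_jan': [150.5, 270.65, 65.26, 110.5, 948.5, 2400.6,
                 1760, 2983.43, 480.4, 1250.45, 75.29, 1045.6],
})

# Right-inclusive bins: (5000, 5006] -> 'S1', (5006, 5012] -> 'S2'
labels = pd.cut(df['salesman_id'], bins=[5000, 5006, 5012], labels=['S1', 'S2'])

# Group by the label of each row and sum January sales per label
result = (
    df.groupby(labels, observed=False)['sale_jan']
      .sum()
      .reset_index()
)
print(result)
```

For the test data both versions give the same totals, since every id falls inside one of the two ranges.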

Sample Output:

Original Orders DataFrame:
    salesman_id  sale_jan
0          5001    150.50
1          5002    270.65
2          5003     65.26
3          5004    110.50
4          5005    948.50
5          5006   2400.60
6          5007   1760.00
7          5008   2983.43
8          5009    480.40
9          5010   1250.45
10         5011     75.29
11         5012   1045.60

GroupBy with condition of  two labels and ranges:
  salesman_id  sale_jan
0          S1   3946.01
1          S2   7595.17
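
The two totals can be cross-checked without group by. The following is a minimal sketch that assumes the same test data and uses boolean masks built with Series.between, which includes both endpoints by default. The names s1_total and s2_total are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'salesman_id': list(range(5001, 5013)),
    'sale_jan': [150.5, 270.65, 65.26, 110.5, 948.5, 2400.6,
                 1760, 2983.43, 480.4, 1250.45, 75.29, 1045.6],
})

# Select the rows of each range with an inclusive mask, then sum sale_jan
s1_total = df.loc[df['salesman_id'].between(5001, 5006), 'sale_jan'].sum()
s2_total = df.loc[df['salesman_id'].between(5007, 5012), 'sale_jan'].sum()

print("S1:", round(s1_total, 2))
print("S2:", round(s2_total, 2))
```

Rounded to two decimals, these values should agree with the S1 and S2 rows of the sample output.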

 


