(Python Example for Beginners)
Write a Pandas program to split a given dataset, group by one column and apply an aggregate function to few columns and another aggregate function to the rest of the columns of the dataframe.
Test Data:
salesman_id sale_jan sale_feb sale_mar sale_apr sale_may sale_jun 0 5002 150.50 250.50 150.50 150.50 130.50 150.50 1 5005 270.65 170.65 270.65 270.65 270.65 270.65 2 5001 65.26 15.26 65.26 95.26 65.26 45.26 3 5003 110.50 110.50 110.50 210.50 310.50 110.50 4 5002 948.50 598.50 948.50 948.50 948.50 948.50 5 5001 2400.60 1400.60 2400.60 2400.60 2400.60 3400.60 6 5001 1760.00 2760.00 5760.00 760.00 760.00 5760.00 7 5006 2983.43 1983.43 1983.43 1983.43 1983.43 983.43 8 5003 480.40 2480.40 2480.40 2480.40 2480.40 2480.40 9 5002 1250.45 250.45 250.45 250.45 250.45 250.45 10 5007 75.29 75.29 75.29 75.29 75.29 75.29 11 5001 1045.60 3045.60 3045.60 3045.60 3045.60 3045.60
sale_jul sale_aug sale_sep sale_oct sale_nov sale_dec 0 950.50 150.50 150.50 150.50 150.50 150.50 1 270.65 70.65 270.65 270.65 270.65 70.65 2 65.26 65.26 65.26 65.26 95.26 65.26 3 210.50 110.50 110.50 110.50 110.50 110.50 4 948.50 948.50 948.50 948.50 948.50 948.50 5 2400.60 400.60 200.60 2400.60 2400.60 2400.60 6 5760.00 5760.00 5760.00 5760.00 5760.00 5760.00 7 983.43 1983.43 1983.43 1983.43 1983.43 1983.43 8 2480.40 2480.40 2480.40 2480.40 2480.40 2480.40 9 250.45 250.45 250.45 250.45 250.45 250.45 10 75.29 75.29 75.29 75.29 75.29 75.29 11 3045.60 3045.60 3045.60 3045.60 3045.60 3045.60
Sample Solution:
Python Code :
Sample Output:
Original Orders DataFrame: salesman_id sale_jan sale_feb sale_mar sale_apr sale_may sale_jun 0 5002 150.50 250.50 150.50 150.50 130.50 150.50 1 5005 270.65 170.65 270.65 270.65 270.65 270.65 2 5001 65.26 15.26 65.26 95.26 65.26 45.26 3 5003 110.50 110.50 110.50 210.50 310.50 110.50 4 5002 948.50 598.50 948.50 948.50 948.50 948.50 5 5001 2400.60 1400.60 2400.60 2400.60 2400.60 3400.60 6 5001 1760.00 2760.00 5760.00 760.00 760.00 5760.00 7 5006 2983.43 1983.43 1983.43 1983.43 1983.43 983.43 8 5003 480.40 2480.40 2480.40 2480.40 2480.40 2480.40 9 5002 1250.45 250.45 250.45 250.45 250.45 250.45 10 5007 75.29 75.29 75.29 75.29 75.29 75.29 11 5001 1045.60 3045.60 3045.60 3045.60 3045.60 3045.60 sale_jul sale_aug sale_sep sale_oct sale_nov sale_dec 0 950.50 150.50 150.50 150.50 150.50 150.50 1 270.65 70.65 270.65 270.65 270.65 70.65 2 65.26 65.26 65.26 65.26 95.26 65.26 3 210.50 110.50 110.50 110.50 110.50 110.50 4 948.50 948.50 948.50 948.50 948.50 948.50 5 2400.60 400.60 200.60 2400.60 2400.60 2400.60 6 5760.00 5760.00 5760.00 5760.00 5760.00 5760.00 7 983.43 1983.43 1983.43 1983.43 1983.43 1983.43 8 2480.40 2480.40 2480.40 2480.40 2480.40 2480.40 9 250.45 250.45 250.45 250.45 250.45 250.45 10 75.29 75.29 75.29 75.29 75.29 75.29 11 3045.60 3045.60 3045.60 3045.60 3045.60 3045.60 Result after group on salesman_id and apply different aggregate functions: sale_jan sale_feb sale_mar sale_apr sale_may sale_jun salesman_id 5001 5271.46 7221.46 11271.46 1575.365000 1567.865 3062.865000 5002 2349.45 1099.45 1349.45 449.816667 443.150 449.816667 5003 590.90 2590.90 2590.90 1345.450000 1395.450 1295.450000 5005 270.65 170.65 270.65 270.650000 270.650 270.650000 5006 2983.43 1983.43 1983.43 1983.430000 1983.430 983.430000 5007 75.29 75.29 75.29 75.290000 75.290 75.290000 sale_jul sale_aug sale_sep sale_oct sale_nov salesman_id 5001 2817.865000 2317.865000 2267.865000 2817.865000 2825.365000 5002 716.483333 449.816667 449.816667 449.816667 449.816667 5003 1345.450000 1295.450000 1295.450000 1295.450000 1295.450000 5005 270.650000 70.650000 270.650000 270.650000 270.650000 5006 983.430000 1983.430000 1983.430000 1983.430000 1983.430000 5007 75.290000 75.290000 75.290000 75.290000 75.290000 sale_dec salesman_id 5001 2817.865000 5002 449.816667 5003 1295.450000 5005 70.650000 5006 1983.430000 5007 75.290000
Python Example for Beginners
Two Machine Learning Fields
There are two sides to machine learning:
- Practical Machine Learning:This is about querying databases, cleaning data, writing scripts to transform data and gluing algorithm and libraries together and writing custom code to squeeze reliable answers from data to satisfy difficult and ill defined questions. It’s the mess of reality.
- Theoretical Machine Learning: This is about math and abstraction and idealized scenarios and limits and beauty and informing what is possible. It is a whole lot neater and cleaner and removed from the mess of reality.
Disclaimer: The information and code presented within this recipe/tutorial is only for educational and coaching purposes for beginners and developers. Anyone can practice and apply the recipe/tutorial presented here, but the reader is taking full responsibility for his/her actions. The author (content curator) of this recipe (code / program) has made every effort to ensure the accuracy of the information was correct at time of publication. The author (content curator) does not assume and hereby disclaims any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from accident, negligence, or any other cause. The information presented here could also be found in public knowledge domains.