Group Data By Time Preliminaries Import required packages import pandas as pd import datetime import numpy as np Next, let’s create some sample data that we can group by time as an example. In this example I am creating a dataframe with two columns and 365 rows. One column is a date, the …
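A minimal sketch of the idea in the excerpt above, assuming a 365-row dataframe with one date column and one numeric column. The column names `date` and `value` and the random values are placeholders of mine, not taken from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 365 daily rows, one date column and one numeric column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=365, freq="D"),
    "value": rng.integers(low=0, high=100, size=365),
})

# Group the rows by calendar month of the date column and sum the values.
monthly = df.groupby(df["date"].dt.to_period("M"))["value"].sum()
print(monthly.head())
```

Grouping on `dt.to_period("M")` keeps the date as an ordinary column; setting it as the index and calling `resample` would be another reasonable route.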
Month: May 2021
Group A Time Series With pandas Import required modules import pandas as pd import numpy as np Create a dataframe df = pd.DataFrame() df['german_army'] = np.random.randint(low=20000, high=30000, size=100) df['allied_army'] = np.random.randint(low=20000, high=40000, size=100) df.index = pd.date_range('1/1/2014', periods=100, freq='H') df.head() german_army allied_army 2014-01-01 00:00:00 21413 37604 2014-01-01 01:00:00 25913 21144 2014-01-01 02:00:00 22418 34201 2014-01-01 03:00:00 …
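The excerpt stops before the grouping step, so here is one way it could go, as a sketch rather than the post's own code: resample the hourly index into daily buckets. I pass a `Timedelta` as the frequency only to avoid depending on the hourly alias spelling, which differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the excerpt: 100 hourly observations.
rng = np.random.default_rng(0)
index = pd.date_range("2014-01-01", periods=100, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({
    "german_army": rng.integers(20000, 30000, size=100),
    "allied_army": rng.integers(20000, 40000, size=100),
}, index=index)

# Group the hourly rows into calendar days.
daily_mean = df.resample("D").mean()   # average of each column per day
daily_count = df.resample("D").size()  # number of hourly rows per day
print(daily_mean)
```

With 100 hours starting at midnight, the first four days hold 24 rows each and the fifth holds the remaining 4.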
Geolocate A City Or Country This tutorial creates a function that attempts to take a city and country and return its latitude and longitude. But when the city is unavailable (which is often the case), it returns the latitude and longitude of the center of the country. Preliminaries from geopy.geocoders import Nominatim geolocator = …
Geolocate A City And Country This tutorial creates a function that attempts to take a city and country and return its latitude and longitude. But when the city is unavailable (which is often the case), it returns the latitude and longitude of the center of the country. Preliminaries from geopy.geocoders import Nominatim geolocator = …
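The fallback described above (try the city, then fall back to the country) can be sketched without any network access. Everything here is illustrative: `locate`, `fake_geocode`, and the coordinates are my own stand-ins, and the function only assumes a geocoder callable that returns an object with `latitude` and `longitude` attributes, or `None` when nothing is found.

```python
from collections import namedtuple

# Stand-in for a geocoder result; only the two attributes used below are assumed.
Location = namedtuple("Location", ["latitude", "longitude"])

def locate(city, country, geocode):
    """Return (lat, lon) for the city, else for the country, else None."""
    for query in (f"{city}, {country}", country):
        try:
            result = geocode(query)
        except Exception:
            # Treat a failed lookup like a miss and try the next query.
            result = None
        if result is not None:
            return (result.latitude, result.longitude)
    return None

# Hypothetical offline lookup table with made-up, rounded coordinates.
FAKE_INDEX = {
    "Berlin, Germany": Location(52.52, 13.405),
    "Germany": Location(51.0, 10.0),
}

def fake_geocode(query):
    return FAKE_INDEX.get(query)

city_hit = locate("Berlin", "Germany", fake_geocode)
country_fallback = locate("Atlantis", "Germany", fake_geocode)
no_match = locate("Atlantis", "Nowhere", fake_geocode)
print(city_hit, country_fallback, no_match)
```

In real use the `geocode` argument would presumably be the `geocode` method of the `Nominatim` geolocator from the preliminaries; injecting it as a parameter just keeps the fallback logic testable offline.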
Geocoding And Reverse Geocoding Geocoding (converting a physical address or location into latitude/longitude) and reverse geocoding (converting a lat/long to a physical address or location) are common tasks when working with geo-data. Python offers a number of packages to make the task incredibly easy. In the tutorial below, I use pygeocoder, a wrapper for Google’s …
Find Unique Values In Pandas Dataframes import pandas as pd import numpy as np raw_data = {'regiment': ['51st', '29th', '2nd', '19th', '12th', '101st', '90th', '30th', '193th', '1st', '94th', '91th'], 'trucks': ['MAZ-7310', np.nan, 'MAZ-7310', 'MAZ-7310', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'ZIS-150', 'Tatra 810', 'ZIS-150', 'ZIS-150'], 'tanks': ['Merkava Mark 4', 'Merkava Mark 4', 'Merkava …
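For reference, a short sketch of the usual pandas calls for this task, using a cut-down version of the `trucks` column from the excerpt (only the first four rows, so the counts differ from the full data).

```python
import numpy as np
import pandas as pd

# A four-row subset of the excerpt's data, including the missing value.
df = pd.DataFrame({
    "regiment": ["51st", "29th", "2nd", "19th"],
    "trucks": ["MAZ-7310", np.nan, "MAZ-7310", "Tatra 810"],
})

unique_trucks = df["trucks"].unique()   # distinct values, NaN included
n_unique = df["trucks"].nunique()       # distinct values, NaN excluded by default
counts = df["trucks"].value_counts()    # occurrences of each non-missing value
print(unique_trucks, n_unique)
print(counts)
```

Note the asymmetry: `unique()` reports the missing value while `nunique()` skips it unless `dropna=False` is passed.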
Find Largest Value In A Dataframe Column Import modules %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np Create dataframe raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'], 'age': [42, 52, 36, 24, 73], 'preTestScore': [4, 24, 31, 2, 3], …
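The excerpt ends before the lookup itself, so this is a sketch of one common approach rather than the post's code: `max()` for the value and `idxmax()` for the row that holds it, using the columns shown above.

```python
import pandas as pd

# Columns taken from the excerpt.
df = pd.DataFrame({
    "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
    "preTestScore": [4, 24, 31, 2, 3],
})

largest = df["preTestScore"].max()             # the largest value itself
largest_row = df["preTestScore"].idxmax()      # index label of the row holding it
who = df.loc[largest_row, "first_name"]        # look up another column on that row
print(largest, who)  # 31 Tina
```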
Filter pandas Dataframes Import modules import pandas as pd Create Dataframe data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [4, 24, 31, 2, 3], 'coverage': [25, 94, 57, 62, 70]} df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma']) df coverage name reports year Cochice 25 …
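A sketch of boolean-mask filtering on the dataframe from the excerpt. The particular conditions are examples I picked, not necessarily the ones the post goes on to use.

```python
import pandas as pd

data = {
    "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "year": [2012, 2012, 2013, 2014, 2014],
    "reports": [4, 24, 31, 2, 3],
    "coverage": [25, 94, 57, 62, 70],
}
df = pd.DataFrame(data, index=["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"])

# Single condition: rows observed after 2012.
after_2012 = df[df["year"] > 2012]

# Combined conditions: wrap each one in parentheses and join with & (and) or | (or).
high_cov_few_reports = df[(df["coverage"] > 50) & (df["reports"] < 4)]
print(after_2012)
print(high_cov_few_reports)
```

The parentheses matter because `&` binds more tightly than the comparison operators in Python.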
Expand Cells Containing Lists Into Their Own Variables In Pandas Import pandas import pandas as pd Create a dataset raw_data = {'score': [1,2,3], 'tags': [['apple','pear','guava'],['truck','car','plane'],['cat','dog','mouse']]} df = pd.DataFrame(raw_data, columns = ['score', 'tags']) View the dataset df score tags 0 1 [apple, pear, guava] 1 2 [truck, car, plane] 2 3 [cat, dog, mouse] Expand …
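The expansion step is cut off in the excerpt, so the following is one possible way to do it, not necessarily the author's. It builds a new dataframe from the lists and joins it back; the `tag_` column prefix is my own naming.

```python
import pandas as pd

raw_data = {
    "score": [1, 2, 3],
    "tags": [["apple", "pear", "guava"], ["truck", "car", "plane"], ["cat", "dog", "mouse"]],
}
df = pd.DataFrame(raw_data, columns=["score", "tags"])

# Wide form: one new column per list position (tag_0, tag_1, tag_2).
tags = pd.DataFrame(df["tags"].tolist(), index=df.index).add_prefix("tag_")
expanded = pd.concat([df.drop(columns="tags"), tags], axis=1)

# Long form: one row per list element, with the score repeated.
long_form = df.explode("tags")
print(expanded)
print(long_form)
```

The wide form suits lists of equal length; `explode` copes better when the lists vary in length.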
Dropping Rows And Columns In pandas Dataframe Import modules import pandas as pd Create a dataframe data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [4, 24, 31, 2, 3]} df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma']) df name reports year Cochice Jason 4 2012 …
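A brief sketch of `DataFrame.drop` on the dataframe from the excerpt. Which rows and columns to drop is my choice for illustration.

```python
import pandas as pd

data = {
    "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "year": [2012, 2012, 2013, 2014, 2014],
    "reports": [4, 24, 31, 2, 3],
}
df = pd.DataFrame(data, index=["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"])

# Drop rows by index label.
without_rows = df.drop(index=["Cochice", "Pima"])

# Drop a column by name.
without_reports = df.drop(columns="reports")

# Drop rows by condition: keep only the rows that do not match.
not_2012 = df[df["year"] != 2012]
print(without_rows)
print(without_reports)
```

`drop` returns a new dataframe and leaves `df` unchanged, so assign the result if you want to keep it.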
Descriptive Statistics For pandas Dataframe Import modules import pandas as pd Create dataframe data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'age': [42, 52, 36, 24, 73], 'preTestScore': [4, 24, 31, 2, 3], 'postTestScore': [25, 94, 57, 62, 70]} df = pd.DataFrame(data, columns = ['name', 'age', 'preTestScore', 'postTestScore']) df name age preTestScore postTestScore 0 Jason …
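A sketch of the basic summary calls on the dataframe from the excerpt. The selection of statistics is mine; the post may cover others.

```python
import pandas as pd

data = {
    "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
    "preTestScore": [4, 24, 31, 2, 3],
    "postTestScore": [25, 94, 57, 62, 70],
}
df = pd.DataFrame(data, columns=["name", "age", "preTestScore", "postTestScore"])

age_sum = df["age"].sum()        # total of the column
age_mean = df["age"].mean()      # arithmetic mean
age_median = df["age"].median()  # middle value
summary = df.describe()          # count, mean, std, min, quartiles, max per numeric column
print(age_sum, age_mean, age_median)
print(summary)
```

By default `describe()` summarizes only the numeric columns, so `name` is left out of `summary`.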
Delete Duplicates In pandas Import modules import pandas as pd Create dataframe with duplicates raw_data = {'first_name': ['Jason', 'Jason', 'Jason','Tina', 'Jake', 'Amy'], 'last_name': ['Miller', 'Miller', 'Miller','Ali', 'Milner', 'Cooze'], 'age': [42, 42, 1111111, 36, 24, 73], 'preTestScore': [4, 4, 4, 31, 2, 3], 'postTestScore': [25, 25, 25, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = ['first_name', …
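A sketch of how the duplicates in this data could be found and removed. The data is from the excerpt; the choice of `subset` and `keep` values is mine.

```python
import pandas as pd

raw_data = {
    "first_name": ["Jason", "Jason", "Jason", "Tina", "Jake", "Amy"],
    "last_name": ["Miller", "Miller", "Miller", "Ali", "Milner", "Cooze"],
    "age": [42, 42, 1111111, 36, 24, 73],
    "preTestScore": [4, 4, 4, 31, 2, 3],
}
df = pd.DataFrame(raw_data)

# Flag rows that repeat an earlier row in every column.
is_dup = df.duplicated()

# Drop exact duplicates: only the second Jason row goes, the third differs on age.
deduped = df.drop_duplicates()

# Deduplicate on one column only, keeping the last occurrence of each name.
by_name = df.drop_duplicates(subset=["first_name"], keep="last")
print(deduped)
print(by_name)
```

Deduplicating on `first_name` with `keep="last"` retains the Jason row with the implausible age, which shows why the choice of `subset` and `keep` deserves a look at the data first.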