Tag Archives: python for data analyst

How to List Files in a Directory using Python

Using os.walk(): The os module contains a long list of functions that deal with the filesystem and the operating system. One of them is walk(), which generates the filenames in a directory tree by walking the tree either top-down or bottom-up (top-down is the default). os.walk() returns …
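The excerpt above stops before any code is shown, so here is a minimal sketch of a top-down walk. The helper names `list_files` and `build_demo_tree`, and the file names `a.txt` and `sub/b.txt`, are made up for illustration and are not taken from the original post.

```python
import os
import tempfile


def list_files(root):
    """Return the paths of all regular files under root, sorted."""
    paths = []
    # os.walk() yields one (dirpath, dirnames, filenames) tuple per directory,
    # visiting parents before children by default (topdown=True).
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def build_demo_tree(root):
    """Create a tiny, hypothetical directory tree for the demonstration."""
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("demo\n")


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    for path in list_files(demo_root):
        # Print paths relative to the root so the output is stable.
        print(os.path.relpath(path, demo_root))
```

Passing `topdown=False` to `os.walk()` would visit subdirectories before their parents, which is the bottom-up order the excerpt mentions.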

How to do Parallel Processing in Python

Introduction: When you start a program on your machine, it runs in its own “bubble”, which is completely separate from other programs that are active at the same time. This “bubble” is called a process, and it comprises everything that is needed to manage this program call. For …
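To make the idea of a process concrete, the sketch below starts a second Python interpreter and compares its process ID with the current one. It only illustrates that each program call gets its own process; it is not the parallel-processing approach of the original post, which is cut off here. The function name `child_pid` is a hypothetical helper.

```python
import os
import subprocess
import sys


def child_pid():
    """Start a separate Python interpreter and return its process ID."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,  # collect the child's stdout instead of printing it
        text=True,            # decode the output as str
        check=True,           # raise if the child exits with an error
    )
    return int(result.stdout.strip())


parent = os.getpid()
child = child_pid()
print("parent process:", parent)
print("child process: ", child)
print("separate processes:", parent != child)
```

The two IDs differ because the operating system assigns each running program its own process, with its own memory and resources.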

How to Read a File Line-by-Line in Python

Introduction: Over the course of my working life I have had the opportunity to use many programming concepts and technologies to do countless things. Some of these things involve relatively low-value fruits of my labor, such as automating the error-prone or …

What are Command Line Arguments in Python

Overview: With Python being such a popular programming language, and one that supports most operating systems, it has become widely used to create command line tools for many purposes. These tools can range from simple CLI apps to those that are more complex, like AWS’ awscli tool. …
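As a rough illustration of a simple CLI app of the kind described above, here is a sketch built on the standard library's argparse module. The program name `greet`, the `name` argument, and the `--shout` flag are invented for this example and do not come from the original post.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 'greet' command."""
    parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--shout", action="store_true",
                        help="upper-case the greeting")
    return parser


def greeting(argv):
    """Parse a list of argument strings and return the greeting text."""
    args = build_parser().parse_args(argv)
    text = f"Hello, {args.name}!"
    return text.upper() if args.shout else text


# An explicit list stands in for sys.argv[1:] so the sketch runs anywhere.
print(greeting(["World"]))
print(greeting(["World", "--shout"]))
```

In a real script you would typically call `build_parser().parse_args()` with no argument, which makes argparse read `sys.argv[1:]`.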

Data Viz in Python – Pie Chart In MatPlotLib

Preliminaries %matplotlib inline import pandas as pd import matplotlib.pyplot as plt Create dataframe raw_data = {'officer_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'jan_arrests': [4, 24, 31, 2, 3], 'feb_arrests': [25, 94, 57, 62, 70], 'march_arrests': [5, 43, 23, 23, 51]} df = pd.DataFrame(raw_data, columns = ['officer_name', 'jan_arrests', 'feb_arrests', 'march_arrests']) df …

Data Viz in Python – Color Palettes in Seaborn

Preliminaries import pandas as pd %matplotlib inline import matplotlib.pyplot as plt import seaborn as sns data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768', '2014-05-02 18:47:05.230071', '2014-05-02 18:47:05.230071', '2014-05-02 18:47:05.280592', '2014-05-03 18:47:05.332662', '2014-05-03 18:47:05.385109', '2014-05-04 18:47:05.436523', '2014-05-04 18:47:05.486877'], 'deaths_regiment_1': [34, 43, 14, 15, 15, 14, 31, 25, 62, 41], …

Data Wrangling in Python – Pandas Time Series Basics

Import modules from datetime import datetime import pandas as pd %matplotlib inline import matplotlib.pyplot as pyplot Create a dataframe data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768', '2014-05-02 18:47:05.230071', '2014-05-02 18:47:05.230071', '2014-05-02 18:47:05.280592', '2014-05-03 18:47:05.332662', '2014-05-03 18:47:05.385109', '2014-05-04 18:47:05.436523', '2014-05-04 18:47:05.486877'], 'battle_deaths': [34, 25, 26, 15, 15, 14, …

Data Wrangling in Python – Pandas Data Structures

Import modules import pandas as pd Series 101 Series are one-dimensional arrays (like R’s vectors) Create a series of the number of floodingReports floodingReports = pd.Series([5, 6, 2, 9, 12]) floodingReports 0 5 1 6 2 2 3 9 4 12 dtype: int64 Note that the first column of numbers …

Data Wrangling in Python – How to Use Seaborn To Visualize A pandas Dataframe

Preliminaries import pandas as pd %matplotlib inline import random import matplotlib.pyplot as plt import seaborn as sns df = pd.DataFrame() df['x'] = random.sample(range(1, 100), 25) df['y'] = random.sample(range(1, 100), 25) df.head() x y 0 18 25 1 42 67 2 52 77 3 4 34 4 …

Data Wrangling in Python – How to Use List Comprehensions With pandas

Preliminaries /* Import modules */ import pandas as pd /* Set IPython's max row display */ pd.set_option('display.max_row', 1000) /* Set IPython's max column display to 50 */ pd.set_option('display.max_columns', 50) Create an example dataframe data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': …

Data Wrangling in Python – How to Sort Rows In pandas Dataframes

Import modules import pandas as pd Create dataframe data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [1, 2, 1, 2, 3], 'coverage': [2, 2, 3, 3, 3]} df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma']) df coverage name …

Learn Python By Example – Example Dataframes In pandas

Import modules import pandas as pd Create dataframe raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'], 'age': [42, 52, 36, 24, 73], 'preTestScore': [4, 24, 31, 2, 3], 'postTestScore': [25, 94, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', …

Data Wrangling in Python – How to Select pandas DataFrame Rows Based On Conditions

Preliminaries /* Import modules */ import pandas as pd import numpy as np /* Create a dataframe */ raw_data = {'first_name': ['Jason', 'Molly', np.nan, np.nan, np.nan], 'nationality': ['USA', 'USA', 'France', 'UK', 'UK'], 'age': [42, 52, 36, 24, 70]} df = pd.DataFrame(raw_data, columns = ['first_name', 'nationality', 'age']) …

Data Wrangling in Python – How to Select Rows With A Certain Value

import pandas as pd /* Create an example dataframe */ data = {'name': ['Jason', 'Molly'], 'country': [['Syria', 'Lebanon'],['Spain', 'Morocco']]} df = pd.DataFrame(data) df country name 0 [Syria, Lebanon] Jason 1 [Spain, Morocco] Molly df[df['country'].map(lambda country: 'Syria' in country)] country name 0 [Syria, Lebanon] Jason …

Data Wrangling in Python – How to Search A pandas Column For A Value

/* Import modules */ import pandas as pd raw_data = {'first_name': ['Jason', 'Jason', 'Tina', 'Jake', 'Amy'], 'last_name': ['Miller', 'Miller', 'Ali', 'Milner', 'Cooze'], 'age': [42, 42, 36, 24, 73], 'preTestScore': [4, 4, 31, 2, 3], 'postTestScore': [25, 25, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = ['first_name', …