Tag Archives: python data science

Machine Learning for Beginners in Python: How to Find Support Vectors

Find Support Vectors
Preliminaries
    # Load libraries
    from sklearn.svm import SVC
    from sklearn import datasets
    from sklearn.preprocessing import StandardScaler
    import numpy as np
Load Iris Flower Dataset
    # Load data with only two classes
    iris = datasets.load_iris()
    X = iris.data[:100,:]
    y = iris.target[:100]
Standardize Features
    # Standardize features
    scaler = StandardScaler()
    X_std …
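The excerpt is cut off before any model is trained. As a rough sketch of where such a walkthrough could go, the block below fits a support vector classifier on the two-class iris subset and reads the support vectors back from the fitted model. The linear kernel and the variable names `support_vectors`, `support_indices`, and `n_support` are assumptions made for illustration; the excerpt does not show which settings the post uses.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the first 100 rows of iris, which contain only two classes
iris = datasets.load_iris()
X = iris.data[:100, :]
y = iris.target[:100]

# Standardize features to zero mean and unit variance
X_std = StandardScaler().fit_transform(X)

# Assumption: a linear kernel (the excerpt does not show the kernel used)
svc = SVC(kernel="linear", random_state=0)
model = svc.fit(X_std, y)

# Coordinates of the support vectors in standardized feature space
support_vectors = model.support_vectors_

# Row indices of the support vectors in X_std
support_indices = model.support_

# Number of support vectors belonging to each class
n_support = model.n_support_
```

`support_` indexes into the training data, so `X_std[support_indices]` gives the same rows as `support_vectors_`.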

Machine Learning for Beginners in Python: How to Find Nearest Neighbors

Find Nearest Neighbors
Preliminaries
    # Load libraries
    from sklearn.neighbors import NearestNeighbors
    from sklearn import datasets
    from sklearn.preprocessing import StandardScaler
    import numpy as np
Load Iris Dataset
    # Load data
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Standardize Iris Data
It is important to standardize our data before we calculate any distances. …
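Standardizing matters here because Euclidean distance sums squared differences across features, so a feature measured on a larger scale would otherwise dominate the result. A minimal sketch of the idea follows, assuming two neighbors and a Euclidean metric. The query point `new_observation` is hypothetical and is expressed in standardized units.

```python
from sklearn import datasets
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Load data
iris = datasets.load_iris()
X = iris.data

# Standardize features so each contributes comparably to the distance
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Assumption: two neighbors and Euclidean distance, chosen for illustration
nn = NearestNeighbors(n_neighbors=2, metric="euclidean").fit(X_std)

# Hypothetical observation, already in standardized units
new_observation = [[1.0, 1.0, 1.0, 1.0]]

# Distances to, and row indices of, the two closest training observations
distances, indices = nn.kneighbors(new_observation)

# The standardized feature values of those neighbors
nearest_rows = X_std[indices[0]]
```

A raw (unstandardized) observation would first need to go through `scaler.transform` so that it lives in the same space as `X_std`.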

Machine Learning for Beginners in Python: K-Nearest Neighbors Classification

K-Nearest Neighbors Classification
Preliminaries
    import pandas as pd
    from sklearn import neighbors
    import numpy as np
    %matplotlib inline
    import seaborn
Create Dataset
Here we create three variables: test_1 and test_2 are our independent variables, and outcome is our dependent variable. We will use this data to train our learner.
    training_data = pd.DataFrame()
    training_data['test_1'] = [0.3051,0.4949,0.6974,0.3769,0.2231,0.341,0.4436,0.5897,0.6308,0.5]
    training_data['test_2'] = [0.5846,0.2654,0.2615,0.4538,0.4615,0.8308,0.4962,0.3269,0.5346,0.6731]
    training_data['outcome'] = …

Machine Learning for Beginners in Python: How to Select Important Features In Random Forest

Select Important Features In Random Forest
Preliminaries
    # Load libraries
    from sklearn.ensemble import RandomForestClassifier
    from sklearn import datasets
    from sklearn.feature_selection import SelectFromModel
Load Iris Flower Data
    # Load data
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Create Random Forest Classifier
    # Create random forest classifier
    clf = RandomForestClassifier(random_state=0, n_jobs=-1)
Select …
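The excerpt stops at the selection step. One way that step could look is sketched below, using `SelectFromModel` to keep only the features whose importance clears a threshold. The threshold of 0.15 is a hypothetical value picked for illustration, not one taken from the post.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Create random forest classifier
clf = RandomForestClassifier(random_state=0)

# Assumption: keep features with importance >= 0.15 (hypothetical threshold)
sfm = SelectFromModel(clf, threshold=0.15)
sfm.fit(X, y)

# Reduce X to the selected columns only
X_important = sfm.transform(X)

# Names of the features that passed the threshold
selected = [
    name
    for name, keep in zip(iris.feature_names, sfm.get_support())
    if keep
]
```

A new model can then be trained on `X_important` in place of `X`.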

Machine Learning for Beginners in Python: Random Forest Classifier Example

Random Forest Classifier Example
This tutorial is based on Yhat’s 2013 tutorial on Random Forests in Python. If you want a good summary of the theory and uses of random forests, I suggest you check out their guide. In the tutorial below, I annotate, correct, and expand on a short code example of random forests they …

Machine Learning for Beginners in Python: Random Forest Classifier

Random Forest Classifier
Preliminaries
    # Load libraries
    from sklearn.ensemble import RandomForestClassifier
    from sklearn import datasets
Load Iris Data
    # Load data
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Create Random Forest Classifier
    # Create random forest classifier object that uses entropy
    clf = RandomForestClassifier(criterion='entropy', random_state=0, n_jobs=-1)
Train Random Forest Classifier …
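The training step itself is cut off in the excerpt. A minimal sketch of fitting the classifier and using it on a single observation might look like the following. The `observation` values are hypothetical and chosen only to show the call shape.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Create random forest classifier object that uses entropy
clf = RandomForestClassifier(criterion="entropy", random_state=0)

# Train the model
model = clf.fit(X, y)

# Hypothetical new observation with four feature values
observation = [[5, 4, 3, 2]]

# Predicted class label for the observation
predicted_class = model.predict(observation)

# Estimated probability of each of the three classes
predicted_proba = model.predict_proba(observation)
```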

Machine Learning for Beginners in Python: How to Handle Imbalanced Classes In Random Forest

Handle Imbalanced Classes In Random Forest
Preliminaries
    # Load libraries
    from sklearn.ensemble import RandomForestClassifier
    import numpy as np
    from sklearn import datasets
Load Iris Flower Dataset
    # Load data
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Adjust Iris Dataset To Make Classes Imbalanced
    # Make class highly imbalanced by removing first …
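The excerpt ends before showing how many observations are removed or how the imbalance is handled. The sketch below makes its own assumptions: it drops an arbitrary 40 leading rows, collapses the target to two classes, and uses scikit-learn's `class_weight="balanced"` option, which weights classes inversely to their frequency. None of these specifics are confirmed by the excerpt.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Assumption: drop an arbitrary 40 leading rows to create the imbalance
X = X[40:, :]
y = y[40:]

# Collapse to a binary target: 0 for class 0, 1 for everything else
y = np.where(y == 0, 0, 1)

# Count observations per class: 10 of class 0 versus 100 of class 1
class_counts = np.bincount(y)

# Weight classes inversely proportional to their frequency
clf = RandomForestClassifier(random_state=0, class_weight="balanced")

# Train the model
model = clf.fit(X, y)
```

With `class_weight="balanced"`, errors on the rare class count for more during training than errors on the common class.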

Machine Learning for Beginners in Python: Feature Selection Using Random Forest

Feature Selection Using Random Forest
Often in data science we have hundreds or even millions of features, and we want a way to create a model that includes only the most important ones. This has three benefits. First, we make our model simpler to interpret. Second, we can reduce the variance of the model, …

Machine Learning for Beginners in Python: Feature Importance

Feature Importance
Preliminaries
    # Load libraries
    from sklearn.ensemble import RandomForestClassifier
    from sklearn import datasets
    import numpy as np
    import matplotlib.pyplot as plt
Load Iris Flower Dataset
    # Load data
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Train A Random Forest Model
    # Create random forest classifier object
    clf = RandomForestClassifier(random_state=0, …
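The excerpt is truncated before the importances are read out. As a sketch under that caveat, a fitted random forest exposes them through `feature_importances_`, and they can be ranked with NumPy. Plotting is left out here, and the names `order` and `ranked` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Create and train a random forest classifier
clf = RandomForestClassifier(random_state=0)
model = clf.fit(X, y)

# Impurity-based importance of each feature; the values sum to 1
importances = model.feature_importances_

# Feature indices sorted from most to least important
order = np.argsort(importances)[::-1]

# Pair each feature name with its importance, most important first
ranked = [(iris.feature_names[i], importances[i]) for i in order]
```

`ranked` can be passed to a bar chart to visualize the ordering.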

Machine Learning for Beginners in Python: Decision Tree Regression

Decision Tree Regression
Preliminaries
    # Load libraries
    from sklearn.tree import DecisionTreeRegressor
    from sklearn import datasets
Load Boston Housing Dataset
    # Load data with only two features
    boston = datasets.load_boston()
    X = boston.data[:,0:2]
    y = boston.target
Create Decision Tree
Decision tree regression works similarly to decision tree classification; however, instead of reducing Gini impurity …
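A minimal sketch of fitting a regression tree follows. Note one deliberate swap: `load_boston` was removed in scikit-learn 1.2, so this sketch uses the bundled diabetes dataset instead, again keeping only the first two features. The splitting criterion is left at the library default because the excerpt is cut off before naming one.

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeRegressor

# Assumption: diabetes data stands in for the Boston housing data,
# which newer scikit-learn releases no longer ship
diabetes = datasets.load_diabetes()
X = diabetes.data[:, 0:2]
y = diabetes.target

# Create decision tree regressor with the default splitting criterion
regr = DecisionTreeRegressor(random_state=0)

# Train the model
model = regr.fit(X, y)

# Predict the target for the first observation
prediction = model.predict(X[:1])
```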