Applied Data Science Notebook in Python for Beginners to Professionals

Data Science Project – A Guide to Binary Classification with the Keras Deep Learning Library in Python

Machine Learning for Beginners - A Guide to Binary Classification with the Keras Deep Learning Library in Python

For more projects visit: https://setscholars.net

  • There are 5000+ free end-to-end applied machine learning and data science projects available to download at SETScholars, a Science, Engineering and Technology Scholars community.
In [6]:
# Suppress warnings in Jupyter Notebooks
import warnings
warnings.filterwarnings("ignore")

In this article, you will discover how to perform binary classification with the Keras deep learning library in Python, using the Sonar dataset.

Python Code

Binary Classification with Sonar Dataset: A Baseline Neural Network Model

In [7]:
# Binary Classification with Sonar Dataset: A Baseline Neural Network Model
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
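# Note: keras.wrappers.scikit_learn has been removed from recent Keras/TensorFlow
# releases; with newer versions the equivalent wrapper is provided by the
# SciKeras package (scikeras.wrappers.KerasClassifier).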
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold

# load dataset
dataframe = read_csv("sonar.all-data.csv", header=None)
dataset = dataframe.values

# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate model with standardized dataset
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)

kfold = StratifiedKFold(n_splits=10, shuffle=True)

results = cross_val_score(estimator, X, encoded_Y, cv=kfold)

print()
print("Baseline Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Baseline Results: 83.17% (3.23%)
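Beyond cross-validation, the wrapped estimator can also be fitted on the full dataset and used to make class predictions. The cell below is a minimal sketch of that workflow (it is not part of the original recipe) and assumes X, encoded_Y, encoder and create_baseline from the cell above are still in scope.

In [ ]:
# Sketch: fit the baseline estimator on all of the data and predict class labels.
# Assumes X, encoded_Y, encoder and create_baseline from the previous cell.
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
estimator.fit(X, encoded_Y)

# predict on the first five rows and map the integer predictions back to the
# original string labels ('M' / 'R') via the fitted LabelEncoder
predictions = estimator.predict(X[:5])
print(encoder.inverse_transform(predictions.ravel().astype(int)))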

Binary Classification with Sonar Dataset: The Baseline Model With Data Preparation Pipeline

In [8]:
# Binary Classification with Sonar Dataset: The Baseline Model With Data Preparation Pipeline (Standardization)
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# load dataset
dataframe = read_csv("sonar.all-data.csv", header=None)
dataset = dataframe.values

# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate baseline model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))

pipeline = Pipeline(estimators)

kfold = StratifiedKFold(n_splits=10, shuffle=True)

results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)

print()
print("Data Preparation Pipeline Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Data Preparation Pipeline Results: 87.57% (6.40%)
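Because standardization happens inside the Pipeline, the scaler is re-fit on the training folds only during each round of cross-validation, so no information from the held-out fold leaks into the data preparation. The same pipeline can also be fitted on a conventional hold-out split; the cell below is a minimal sketch (train_test_split and the fixed random_state are assumptions, not part of the original recipe).

In [ ]:
# Sketch: evaluate the standardized pipeline on a single stratified hold-out split.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, encoded_Y, test_size=0.2, stratify=encoded_Y, random_state=7)

# the scaler inside the pipeline is fitted on the training portion only
pipeline.fit(X_train, y_train)
print("Hold-out accuracy: %.2f%%" % (pipeline.score(X_test, y_test) * 100))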

Binary Classification with Sonar Dataset: Tuning Layers and Number of Neurons in The Model

Binary Classification with Sonar Dataset: Standardized Smaller

In [9]:
# Binary Classification with Sonar Dataset: Standardized Smaller
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# load dataset
dataframe = read_csv("sonar.all-data.csv", header=None)
dataset = dataframe.values

# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# smaller model
def create_smaller():
    # create model
    model = Sequential()
    model.add(Dense(30, input_dim=60, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))

pipeline = Pipeline(estimators)

kfold = StratifiedKFold(n_splits=10, shuffle=True)

results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)

print()
print("Smaller Network Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Smaller Network Results: 84.98% (9.77%)

Binary Classification with Sonar Dataset: Standardized Larger

In [10]:
# Binary Classification with Sonar Dataset: Standardized Larger
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# load dataset
dataframe = read_csv("sonar.all-data.csv", header=None)
dataset = dataframe.values

# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# larger model
def create_larger():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_larger, epochs=100, batch_size=5, verbose=0)))

pipeline = Pipeline(estimators)

kfold = StratifiedKFold(n_splits=10, shuffle=True)

results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)

print("Larger Network Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Larger Network Results: 86.57% (4.59%)
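To compare the four experiments side by side, the cell below simply tabulates the means and standard deviations printed above. The figures are copied from the outputs in this notebook and will vary from run to run, because StratifiedKFold is shuffled without a fixed random_state.

In [ ]:
# Sketch: tabulate the cross-validation results reported by the cells above.
from pandas import DataFrame

summary = DataFrame(
    {"mean accuracy (%)": [83.17, 87.57, 84.98, 86.57],
     "std (%)": [3.23, 6.40, 9.77, 4.59]},
    index=["baseline", "standardized baseline", "smaller (30 neurons)", "larger (60-30)"])
print(summary)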

Summary

In this coding recipe, we discussed how to perform binary classification with the Keras deep learning library in Python.

Specifically, we have learned the following:

  • How to load and prepare data for use in Keras.
  • How to create a baseline neural network model.
  • How to evaluate a Keras model using scikit-learn and stratified k-fold cross validation.
  • How data preparation schemes can lift the performance of your models.
  • How experiments adjusting the network topology can lift model performance.