Every ML practitioner has attempted Kaggle's Dogs vs. Cats competition, which is pretty much a solved problem thanks to methods like transfer learning on top of architectures like ResNet. But the dataset can still be used to show off some cool techniques by self-imposing certain constraints, like using only a fraction of the data to obtain similar results!
This post illustrates how, with modern techniques, we can obtain near state-of-the-art results on image classification tasks. I'm going to use Kaggle's Dogs vs. Cats to make my point, but the approach extends to any similar task, including microscopy images, satellite images, and the like.
I'm using Keras with a TensorFlow backend, but the same results can be obtained with any modern deep learning library, since great implementations of these techniques are readily available.
Firstly, I import all necessary libraries:
import numpy as np
import cv2
import os
import shutil
from glob import glob
import matplotlib.pyplot as plt
%matplotlib inline
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.models import *
from keras.layers import *
from keras.regularizers import *
from keras.optimizers import *
from IPython.display import Image
from keras import applications
I ran these experiments on an NVIDIA Titan X, so I could have afforded a batch size higher than 64 to speed things up.
batch_size = 64
img_width, img_height = 200, 200
Now, I create the directory structure I need for training and validation:
%mkdir -p data/train/cats
%mkdir -p data/validate/cats
%mkdir -p data/train/dogs
%mkdir -p data/validate/dogs
This is the interesting bit: the original Dogs vs. Cats training set has 12,500 images per category (25,000 in total). To prove my point that we can get good results with very little data, I select only 2,000 training images, i.e. 1,000 per class, plus 800 for validation.
That's roughly a tenth of the available data.
train_imgs = os.listdir('input/train')[:2000]
valid_imgs = os.listdir('input/train')[2000:2800]

# Copy each image into the right class folder based on its file name prefix
for i in train_imgs:
    if i[:3] == 'cat':
        shutil.copy2('input/train/' + i, 'data/train/cats/')
    elif i[:3] == 'dog':
        shutil.copy2('input/train/' + i, 'data/train/dogs/')

for i in valid_imgs:
    if i[:3] == 'cat':
        shutil.copy2('input/train/' + i, 'data/validate/cats/')
    elif i[:3] == 'dog':
        shutil.copy2('input/train/' + i, 'data/validate/dogs/')
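One caveat: os.listdir makes no guarantee about ordering, so the first 2,000 file names above aren't necessarily an exact 50/50 split between the classes. If you want a guaranteed 1,000 images per class, a per-class selection with glob (just a small variation on the loop above) pins it down:
cat_files = sorted(glob('input/train/cat.*'))
dog_files = sorted(glob('input/train/dog.*'))

# 1,000 training and 400 validation images per class
for f in cat_files[:1000]:
    shutil.copy2(f, 'data/train/cats/')
for f in cat_files[1000:1400]:
    shutil.copy2(f, 'data/validate/cats/')
for f in dog_files[:1000]:
    shutil.copy2(f, 'data/train/dogs/')
for f in dog_files[1000:1400]:
    shutil.copy2(f, 'data/validate/dogs/')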
Next, I augment the data with a generator that applies random affine transformations.
One of my favorite features in Keras is how smoothly it integrates with Python generators, along with its flow_from_directory method. This makes augmentation much simpler and saves us from storing all the augmented images on disk.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    shear_range=0.1)

train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
I chose the above augmentation parameters by trial and error. The best way to do this is to run the augmentation on a few samples and visualize the results: the goal is to create images that still preserve the characteristics of a dog or a cat, so transformations like vertical flipping probably won't help.
So, the best thing to do is to plot the augmented images and keep tinkering with the values until you reach a satisfactory set.
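The plotting snippet below assumes a preview folder of augmented samples already exists. One way to produce it (a sketch using Keras' save_to_dir argument; the source image is just an example file from the training folder) is to push a single image through the augmentation generator a few times:
%mkdir -p preview
img = img_to_array(load_img('data/train/cats/cat.0.jpg'))   # any training image will do
img = img.reshape((1,) + img.shape)                         # add a batch dimension
i = 0
for batch in train_datagen.flow(img, batch_size=1, save_to_dir='preview',
                                save_prefix='aug', save_format='jpeg'):
    i += 1
    if i >= 10:   # ten augmented copies are enough to eyeball
        break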
images = os.listdir('preview')
plt.figure(figsize=(20, 10))
columns = 5
for i, image in enumerate(images):
    plt.subplot(len(images) // columns + 1, columns, i + 1)
    image = load_img('preview/' + images[i])
    plt.imshow(image)
The validation images are only rescaled and resized, with no augmentation. That said, there is work showing that test-time augmentation (TTA), also called inference-time augmentation, where you average the predictions over a few augmented copies of each test/validation image, gives better results.
For a reference, see "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification" by He et al., 2015, where the authors mention performing "multi-view testing on feature maps".
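I don't use TTA in this post, but a minimal sketch of the idea, averaging the model's predictions over a few randomly augmented copies of one image, could look like this (tta_predict is a hypothetical helper, not a Keras API; the file name in the usage comment is illustrative):
def tta_predict(model, datagen, img_array, n_aug=5):
    """Average predictions over n_aug randomly augmented copies of one raw image array."""
    copies = [datagen.standardize(datagen.random_transform(img_array.copy()))
              for _ in range(n_aug)]
    return model.predict(np.stack(copies)).mean(axis=0)

# Hypothetical usage, once the model below is trained:
# img = img_to_array(load_img('data/validate/cats/cat.2000.jpg',
#                             target_size=(img_height, img_width)))
# print(tta_predict(model, train_datagen, img))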
val_datagen = ImageDataGenerator(rescale=1./255)

val_generator = val_datagen.flow_from_directory(
    'data/validate',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
classes = len(train_generator.class_indices)
assert classes == len(val_generator.class_indices)

nb_train_samples = train_generator.samples
nb_val_samples = val_generator.samples
Here is another self-imposed constraint: I won't be using a pre-trained network. Instead, I'll use a conventional model with three convolutional layers, plus Dropout and Batch Normalization.
model = Sequential([
    BatchNormalization(axis=1, input_shape=(img_width, img_height, 3)),

    Convolution2D(32, (3, 3), activation='relu'),
    BatchNormalization(axis=1),
    MaxPooling2D(),

    Convolution2D(64, (3, 3), activation='relu'),
    BatchNormalization(axis=1),
    MaxPooling2D(),

    Convolution2D(128, (3, 3), activation='relu'),
    BatchNormalization(axis=1),
    MaxPooling2D(),

    Flatten(),
    Dense(200, activation='relu'),
    BatchNormalization(),
    Dropout(0.5),

    Dense(200, activation='relu'),
    BatchNormalization(),
    Dropout(0.5),

    Dense(1, activation='sigmoid')
])
Now, I perform (quite a few) iterations of training and hyperparameter tuning.
model.compile(Adam(lr=1e-4), loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=15,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=20,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)

# Drop the learning rate. Assigning model.optimizer.lr directly has no effect once
# the training function is compiled, so set the underlying variable instead.
from keras import backend as K
K.set_value(model.optimizer.lr, 5e-5)

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=20,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)
I'm still at it...
K.set_value(model.optimizer.lr, 1e-5)

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=20,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)

K.set_value(model.optimizer.lr, 4e-6)

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=15,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)
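As an aside, instead of dropping the learning rate by hand between fit_generator calls, Keras' ReduceLROnPlateau callback can automate the same idea. This is only a sketch with reasonable default values, not the schedule I actually used above:
from keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever validation loss stalls for 3 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                              patience=3, min_lr=1e-6, verbose=1)
model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=50,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size,
                    callbacks=[reduce_lr])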
model.save_weights('data_aug.h5')
With a trained model saved, let me sanity-check its predictions on a small folder of unlabelled sample images:
test_datagen = ImageDataGenerator(rescale=1./255)
test_batches = test_datagen.flow_from_directory('sample/',
                                                target_size=(img_height, img_width),
                                                shuffle=False,
                                                batch_size=1,
                                                class_mode=None)

# Take file names from the iterator itself so they line up with the prediction order
images = [os.path.basename(f) for f in test_batches.filenames]
predictions = model.predict_generator(test_batches)
predictions = predictions.reshape((len(images),))

plt.figure(figsize=(20, 10))
columns = 5
for i, image in enumerate(images):
    plt.subplot(len(images) // columns + 1, columns, i + 1)
    image = load_img('sample/unknown/' + images[i])
    if predictions[i] > 0.5:
        plt.title('dog')
    else:
        plt.title('cat')
    plt.imshow(image)
Now for pseudo-labelling: I run the model over a folder of unlabelled images, assign each one its predicted class, and move it into the corresponding training folder so the model can be retrained on a larger set.
test_batches = test_datagen.flow_from_directory('pseudo_labelling/',
                                                target_size=(img_height, img_width),
                                                shuffle=False,
                                                batch_size=batch_size,
                                                class_mode=None)
predictions = model.predict_generator(test_batches)
predictions = predictions.reshape((predictions.shape[0],))

# Again, take file names from the iterator so they match the prediction order
files = [os.path.basename(f) for f in test_batches.filenames]
for i in range(len(files)):
    if predictions[i] > 0.5:
        os.rename('pseudo_labelling/unknown/' + files[i], 'data/train/dogs/' + files[i])
    else:
        os.rename('pseudo_labelling/unknown/' + files[i], 'data/train/cats/' + files[i])
With the pseudo-labelled images in place, I rebuild the generators so they pick up the enlarged training set:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    shear_range=0.1)

train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

val_datagen = ImageDataGenerator(rescale=1./255)

val_generator = val_datagen.flow_from_directory(
    'data/validate',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

classes = len(train_generator.class_indices)
assert classes == len(val_generator.class_indices)

nb_train_samples = train_generator.samples
nb_val_samples = val_generator.samples
Finally, I reload the weights from the augmentation-only run and continue training on the combined real and pseudo-labelled data:
model.load_weights('data_aug.h5')

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=15,
                    validation_data=val_generator,
                    validation_steps=nb_val_samples // batch_size)