In this post, we will learn how to build a deep learning model in PyTorch by using the CIFAR-10 dataset.

PyTorch

PyTorch is a machine learning library created by Facebook. It works with tensors, which can be thought of as n-dimensional arrays on which you can perform mathematical operations and from which you can build deep learning models.
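As a small illustration of what working with tensors looks like, the sketch below builds a 2-dimensional tensor and applies a few operations to it. The values are arbitrary and chosen only for this example.

```python
import torch

# A 2 x 3 tensor (2 rows, 3 columns) of 32-bit floats
t = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])

print(t.shape)        # torch.Size([2, 3])
print(t.dtype)        # torch.float32

doubled = t * 2       # element-wise arithmetic
total = t.sum()       # reduce all elements to a single value
product = t @ t.T     # matrix multiplication: (2 x 3) @ (3 x 2) -> (2 x 2)
```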

Deep Learning

Deep learning is a subfield of AI that seeks to emulate the way humans acquire certain types of knowledge. In its simplest form, it can be seen as a way to automate predictive analytics.

CIFAR-10 Dataset

The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

You can find more information about the CIFAR-10 dataset here.

Deep Learning Model Implementation

For the implementation of this deep learning model, we will go through the following steps:

  1. Import libraries
  2. Preparing the data
  3. Model
  4. Using a GPU
  5. Training the model

Import libraries

In [2]:

import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
from torch.utils.data.dataloader import DataLoader
from torch.utils.data import random_split
%matplotlib inline

Preparing the Data

In [3]:

dataset = CIFAR10(root='data/', download=True, transform=ToTensor())
test_dataset = CIFAR10(root='data/', train=False, transform=ToTensor())
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz
Extracting data/cifar-10-python.tar.gz to data/

Here, we imported the datasets and converted the images into PyTorch tensors.

In [4]:

classes = dataset.classes
classes

Out[4]:

['airplane',
 'automobile',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck']

The classes attribute of the dataset gives us the list of image classes.

In [5]:

class_count = {}
for _, index in dataset:
    label = classes[index]
    if label not in class_count:
        class_count[label] = 0
    class_count[label] += 1
class_count

Out[5]:

{'frog': 5000,
 'truck': 5000,
 'deer': 5000,
 'automobile': 5000,
 'bird': 5000,
 'horse': 5000,
 'ship': 5000,
 'cat': 5000,
 'dog': 5000,
 'airplane': 5000}

With this for loop, we can get the number of images per class. It goes through the whole dataset, adds the class name to a dictionary if it is not there yet, and counts the images in each class.
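The loop above loads and transforms every image just to read its label. As a possible shortcut, torchvision's CIFAR10 also keeps the integer labels in its targets attribute, so the same counts can be obtained from the labels alone. The sketch below uses a tiny made-up label list in place of dataset.targets; the helper name count_per_class is hypothetical.

```python
from collections import Counter

def count_per_class(targets, classes):
    """Count how many labels fall in each class, keyed by class name."""
    counts = Counter(targets)
    return {name: counts[i] for i, name in enumerate(classes)}

# Toy stand-in for dataset.targets and dataset.classes
toy_classes = ['airplane', 'automobile', 'bird']
toy_targets = [0, 2, 2, 1, 0, 2]

print(count_per_class(toy_targets, toy_classes))
# {'airplane': 2, 'automobile': 1, 'bird': 3}
```

On the real data the call would be count_per_class(dataset.targets, dataset.classes).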

Now, we’ll split the dataset into two groups: training and validation datasets.

In [6]:

torch.manual_seed(43)
val_size = 5000
train_size = len(dataset) - val_size

We used a validation set of 5000 images (10% of the 50000 training images). To ensure we get the same validation set each time, we set the seed of PyTorch’s random number generator to 43.

In [7]:

train_ds, val_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(val_ds)

Out[7]:

(45000, 5000)

Here, we used the random_split function to create the training and validation sets.
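The effect of the seed can be checked on a toy example. The sketch below splits ten items instead of the real dataset and passes a seeded torch.Generator to random_split, which is an alternative to the global torch.manual_seed call used in this post; the helper name split_indices is hypothetical.

```python
import torch
from torch.utils.data import random_split

def split_indices(seed):
    """Split ten toy items 8/2 and return the validation indices."""
    generator = torch.Generator().manual_seed(seed)
    _, val = random_split(range(10), [8, 2], generator=generator)
    return list(val.indices)

first = split_indices(43)
second = split_indices(43)
print(first == second)   # True
```

Passing a dedicated generator keeps the split reproducible even if other code draws random numbers before the split happens.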

In [8]:

batch_size=128

In [9]:

train_loader = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4, pin_memory=True)
val_loader = DataLoader(val_ds, batch_size*2, num_workers=4, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size*2, num_workers=4, pin_memory=True)

We created dataloaders for the training, validation and test sets. We set shuffle=True for the training dataloader so that the batches generated in each epoch are different; this randomization helps the model generalize and speeds up training. The validation and test dataloaders are used only for evaluating the model, so there is no need to shuffle their images.

We also set pin_memory=True because the data will be moved from the CPU to the GPU. This parameter makes the DataLoader place the batches in page-locked (pinned) memory, which speeds up the transfer.
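As a quick sanity check on these settings, the number of batches each dataloader yields per epoch can be worked out by hand. The sketch assumes the DataLoader default of keeping the last, smaller batch; batches_per_epoch is a hypothetical helper.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Number of batches a DataLoader yields when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

batch_size = 128
print(batches_per_epoch(45000, batch_size))      # 352 training batches
print(batches_per_epoch(5000, batch_size * 2))   # 20 validation batches
print(batches_per_epoch(10000, batch_size * 2))  # 40 test batches
```

The last training batch holds only 45000 - 351 * 128 = 72 images, which is why accuracy is averaged per batch rather than assumed to come from equally sized batches.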

In [10]:

for images, _ in train_loader:
    print('images.shape:', images.shape)
    plt.figure(figsize=(16,8))
    plt.axis('off')
    plt.imshow(make_grid(images, nrow=16).permute((1, 2, 0)))
    break
images.shape: torch.Size([128, 3, 32, 32])

Here, we can visualize a batch of data using the make_grid helper function from Torchvision.

Model

In [12]:

def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))
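To see what accuracy computes, here is a small worked example. The function is repeated so the snippet runs on its own, and the scores and labels are made up for four samples and three classes.

```python
import torch

def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)   # predicted class = index of the highest score
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

# Made-up scores: one row per sample, one column per class
outputs = torch.tensor([[2.0, 0.5, 0.1],
                        [0.1, 0.2, 3.0],
                        [1.0, 2.0, 0.0],
                        [0.3, 0.2, 0.1]])
labels = torch.tensor([0, 2, 0, 0])

print(accuracy(outputs, labels))   # tensor(0.7500)
```

The predicted classes are 0, 2, 1 and 0, so three of the four samples match their labels.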

In [13]:

class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch 
        out = self(images)                  # Generate predictions
        loss = F.cross_entropy(out, labels) # Calculate loss
        return loss

    def validation_step(self, batch):
        images, labels = batch 
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(epoch, result['val_loss'], result['val_acc']))

In [14]:

@torch.no_grad()  # validation needs no gradients, so skip building the autograd graph
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
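The inner loop of fit follows the usual four steps: forward pass, backward pass, parameter update, gradient reset. The sketch below runs the same steps on a toy regression problem instead of CIFAR-10, so the data, the model and the learning rate are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy regression data following y = 2x + 1
x = torch.tensor([[0.], [1.], [2.], [3.]])
y = 2 * x + 1

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

initial_loss = F.mse_loss(model(x), y).item()
for _ in range(100):
    loss = F.mse_loss(model(x), y)  # forward pass and loss
    loss.backward()                 # compute gradients
    optimizer.step()                # update parameters
    optimizer.zero_grad()           # clear gradients before the next batch
final_loss = F.mse_loss(model(x), y).item()

print(final_loss < initial_loss)   # True
```

Without the zero_grad call, gradients from successive batches would accumulate and each update would use the sum of all gradients seen so far.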

Using a GPU

In [15]:

torch.cuda.is_available()

Out[15]:

False

In [16]:

def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

In [17]:

device = get_default_device()
device

Out[17]:

device(type='cpu')

In [18]:

def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl: 
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
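A short usage sketch for to_device, with the function repeated so the snippet is self-contained. The batch is a made-up pair of zero tensors shaped like four CIFAR-10 images and their labels.

```python
import torch

def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in batch: (images, labels), as a DataLoader would yield it
batch = (torch.zeros(4, 3, 32, 32), torch.zeros(4, dtype=torch.long))
images, labels = to_device(batch, device)

print(images.device.type == device.type, labels.device.type == device.type)  # True True
```

Because to_device recurses into lists and tuples, the whole (images, labels) batch can be moved with one call, which is what DeviceDataLoader relies on.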

In [19]:

def plot_losses(history):
    losses = [x['val_loss'] for x in history]
    plt.plot(losses, '-x')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss vs. No. of epochs');

In [20]:

def plot_accuracies(history):
    accuracies = [x['val_acc'] for x in history]
    plt.plot(accuracies, '-x')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.title('Accuracy vs. No. of epochs');

In [21]:

train_loader = DeviceDataLoader(train_loader, device)
val_loader = DeviceDataLoader(val_loader, device)
test_loader = DeviceDataLoader(test_loader, device)

Training the Model

In [22]:

input_size = 3*32*32
output_size = 10

In [23]:

class CIFAR10Model(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_size, 256)
        self.linear2 = nn.Linear(256, 128)
        self.linear3 = nn.Linear(128, output_size)

    def forward(self, xb):
        # Flatten images into vectors
        out = xb.view(xb.size(0), -1)
        # Apply layers & activation functions
        out = self.linear1(out)
        out = F.relu(out)
        out = self.linear2(out)
        out = F.relu(out)
        out = self.linear3(out)
        return out
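To get a feel for the size of this network, the number of trainable parameters can be worked out from the layer sizes. Each fully connected layer has one weight per input-output pair plus one bias per output; linear_params is a hypothetical helper for that arithmetic.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

input_size = 3 * 32 * 32   # 3072 values per flattened image
layer_sizes = [(input_size, 256), (256, 128), (128, 10)]

per_layer = [linear_params(i, o) for i, o in layer_sizes]
print(per_layer)        # [786688, 32896, 1290]
print(sum(per_layer))   # 820874
```

Almost all of the roughly 820 thousand parameters sit in the first layer, because every one of the 3072 input values connects to all 256 hidden units.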

In [24]:

model = to_device(CIFAR10Model(), device)

In [25]:

history = [evaluate(model, val_loader)]
history

Out[25]:

[{'val_loss': 2.3070812225341797, 'val_acc': 0.10445772111415863}]
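These starting values are close to what pure guessing would give. A model that assigns equal probability to each of the 10 classes has a cross-entropy loss of ln(10) and is right one time in ten, as the short calculation below shows.

```python
import math

num_classes = 10

# Cross-entropy of a uniform prediction: -log(1 / num_classes)
chance_loss = -math.log(1 / num_classes)
chance_accuracy = 1 / num_classes

print(round(chance_loss, 4))   # 2.3026
print(chance_accuracy)         # 0.1
```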

In [26]:

history += fit(10, 1e-1, model, train_loader, val_loader)
Epoch [0], val_loss: 1.9987, val_acc: 0.2751
Epoch [1], val_loss: 1.7617, val_acc: 0.3664
Epoch [2], val_loss: 1.6956, val_acc: 0.3943
Epoch [3], val_loss: 1.6709, val_acc: 0.4046
Epoch [4], val_loss: 1.6682, val_acc: 0.3942
Epoch [5], val_loss: 1.5915, val_acc: 0.4323
Epoch [6], val_loss: 1.7013, val_acc: 0.4064
Epoch [7], val_loss: 1.6545, val_acc: 0.4161
Epoch [8], val_loss: 1.5276, val_acc: 0.4652
Epoch [9], val_loss: 1.5144, val_acc: 0.4609

In [27]:

history += fit(10, 1e-2, model, train_loader, val_loader)
Epoch [0], val_loss: 1.4274, val_acc: 0.5000
Epoch [1], val_loss: 1.4257, val_acc: 0.4973
Epoch [2], val_loss: 1.4235, val_acc: 0.4998
Epoch [3], val_loss: 1.4174, val_acc: 0.5011
Epoch [4], val_loss: 1.4125, val_acc: 0.4992
Epoch [5], val_loss: 1.4164, val_acc: 0.5012
Epoch [6], val_loss: 1.4082, val_acc: 0.4998
Epoch [7], val_loss: 1.4069, val_acc: 0.4995
Epoch [8], val_loss: 1.4113, val_acc: 0.4964
Epoch [9], val_loss: 1.4012, val_acc: 0.5031

In [28]:

history += fit(10, 1e-3, model, train_loader, val_loader)
Epoch [0], val_loss: 1.3960, val_acc: 0.5085
Epoch [1], val_loss: 1.3960, val_acc: 0.5087
Epoch [2], val_loss: 1.3957, val_acc: 0.5056
Epoch [3], val_loss: 1.3943, val_acc: 0.5078
Epoch [4], val_loss: 1.3946, val_acc: 0.5085
Epoch [5], val_loss: 1.3939, val_acc: 0.5074
Epoch [6], val_loss: 1.3942, val_acc: 0.5044
Epoch [7], val_loss: 1.3944, val_acc: 0.5082
Epoch [8], val_loss: 1.3931, val_acc: 0.5083
Epoch [9], val_loss: 1.3933, val_acc: 0.5078

In [29]:

history += fit(10, 1e-4, model, train_loader, val_loader)
Epoch [0], val_loss: 1.3930, val_acc: 0.5091
Epoch [1], val_loss: 1.3928, val_acc: 0.5089
Epoch [2], val_loss: 1.3927, val_acc: 0.5091
Epoch [3], val_loss: 1.3927, val_acc: 0.5089
Epoch [4], val_loss: 1.3927, val_acc: 0.5093
Epoch [5], val_loss: 1.3926, val_acc: 0.5091
Epoch [6], val_loss: 1.3926, val_acc: 0.5093
Epoch [7], val_loss: 1.3925, val_acc: 0.5089
Epoch [8], val_loss: 1.3925, val_acc: 0.5089
Epoch [9], val_loss: 1.3925, val_acc: 0.5087

Plot the losses and the accuracies to check if you’re starting to hit the limits of how well your model can perform on this dataset.

In [30]:

plot_losses(history)

In [31]:

plot_accuracies(history)
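Alongside the plots, the history list can be inspected directly. The sketch below looks up the entry with the highest validation accuracy; best_epoch is a hypothetical helper, and the sample list holds only the first three entries of the history above, rounded to four decimals.

```python
def best_epoch(history):
    """Return (index, entry) of the history entry with the highest val_acc."""
    index = max(range(len(history)), key=lambda i: history[i]['val_acc'])
    return index, history[index]

# First three entries of the history above, rounded to four decimals
sample_history = [
    {'val_loss': 2.3071, 'val_acc': 0.1045},
    {'val_loss': 1.9987, 'val_acc': 0.2751},
    {'val_loss': 1.7617, 'val_acc': 0.3664},
]

index, entry = best_epoch(sample_history)
print(index, entry['val_acc'])   # 2 0.3664
```

Index 0 is the evaluation done before any training, so an index of n corresponds to the model after n training epochs in total.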

Finally, evaluate the model on the test dataset and report its final performance.

In [32]:

evaluate(model, test_loader)

Out[32]:

{'val_loss': 1.3467928171157837, 'val_acc': 0.5208984613418579}
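Beyond the aggregate score, it can be useful to look at the prediction for a single image. The sketch below is one possible helper for that; predict_image is a hypothetical name, and the untrained stand-in model and random image exist only so the snippet runs without the trained model.

```python
import torch
import torch.nn as nn

def predict_image(img, model, classes):
    """Return the class name the model assigns to one image tensor of shape (3, 32, 32)."""
    xb = img.unsqueeze(0)              # add a batch dimension: (1, 3, 32, 32)
    with torch.no_grad():              # no gradients needed for inference
        out = model(xb)
    _, pred = torch.max(out, dim=1)    # index of the highest score
    return classes[pred.item()]

# Untrained stand-in model and random image, used only to show the call
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
stand_in_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
label = predict_image(torch.rand(3, 32, 32), stand_in_model, classes)

print(label in classes)   # True
```

With the trained model from this post, the image would first need to be moved to the same device as the model, for example with to_device(img, device).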