Deep Learning in PyTorch with CIFAR-10 dataset
In this post, we will learn how to build a deep learning model in PyTorch by using the CIFAR-10 dataset.
PyTorch
PyTorch is a machine learning library created by Facebook. It works with tensors, which are n-dimensional arrays on which you can perform mathematical operations and from which you can build deep learning models.
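As a small illustration of what working with tensors looks like, here is a minimal sketch. The values and variable names are made up for the example and are not part of the model built later in this post.

```python
import torch

# A 2-D tensor (a matrix) built from nested Python lists.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Element-wise arithmetic and matrix multiplication both return new tensors.
doubled = x * 2
product = x @ x

# A 4-D tensor shaped like a batch of images: (batch, channels, height, width).
batch = torch.zeros(8, 3, 32, 32)

print(doubled)
print(product)
print(batch.shape)
```

The last shape mirrors the layout of the CIFAR-10 batches used below, where each image has 3 colour channels and 32×32 pixels.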
Deep Learning
Deep learning is a subfield of AI that trains multi-layer neural networks to learn patterns directly from data, loosely emulating the way humans acquire certain types of knowledge. In its simplest form, deep learning can be seen as a way to automate predictive analytics.
CIFAR-10 Dataset
The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
You can find more information about the CIFAR-10 dataset on its official dataset page.
Deep Learning Model Implementation
For the implementation of this deep learning model, we will go through the following steps:
- Import libraries
- Preparing the data
- Model
- Using a GPU
- Training the model
Import libraries
In [2]:
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
from torch.utils.data.dataloader import DataLoader
from torch.utils.data import random_split
%matplotlib inline
Preparing the Data
In [3]:
dataset = CIFAR10(root='data/', download=True, transform=ToTensor())
test_dataset = CIFAR10(root='data/', train=False, transform=ToTensor())
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz
Extracting data/cifar-10-python.tar.gz to data/
Here, we imported the datasets and converted the images into PyTorch tensors.
In [4]:
classes = dataset.classes
classes
Out[4]:
['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
The classes attribute of the dataset gives us the list of class names, ordered by their integer label.
In [5]:
class_count = {}
for _, index in dataset:
    label = classes[index]
    if label not in class_count:
        class_count[label] = 0
    class_count[label] += 1
class_count
Out[5]:
{'frog': 5000, 'truck': 5000, 'deer': 5000, 'automobile': 5000, 'bird': 5000, 'horse': 5000, 'ship': 5000, 'cat': 5000, 'dog': 5000, 'airplane': 5000}
With this for loop, we get the number of images per class. It goes through the whole dataset, adds each class name to a dictionary the first time it appears, and increments that class's count for every image. Each of the 10 classes has 5000 training images.
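The loop above decodes every image just to read its label. As an alternative sketch, the same tally can be computed from the integer labels alone with collections.Counter. The helper name count_per_class and the toy data are hypothetical; with the real dataset you would pass dataset.targets (the list of integer labels that torchvision's CIFAR10 exposes) and dataset.classes.

```python
from collections import Counter

def count_per_class(targets, classes):
    """Count how many integer labels fall into each named class."""
    counts = Counter(targets)
    return {name: counts[i] for i, name in enumerate(classes)}

# Hypothetical stand-in data; with the real dataset you would pass
# dataset.targets and dataset.classes instead.
toy_classes = ['airplane', 'automobile', 'bird']
toy_targets = [0, 2, 1, 2, 2, 0]

toy_counts = count_per_class(toy_targets, toy_classes)
print(toy_counts)
```

Because no image is loaded or converted to a tensor, this variant should be considerably faster on the full training set.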
Now, we’ll split the dataset into two groups: training and validation datasets.
In [6]:
torch.manual_seed(43)
val_size = 5000
train_size = len(dataset) - val_size
We set aside 5000 images for validation (10% of the 50000 training images). To ensure we get the same validation set each time, we set PyTorch’s random number generator to a seed value of 43.
In [7]:
train_ds, val_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(val_ds)
Out[7]:
(45000, 5000)
Here, we used the random_split function to create the training and validation sets.
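Seeding the global generator works, but the resulting split then depends on every random call made between the seed and the split. A minimal sketch of an alternative, assuming a small synthetic dataset in place of CIFAR-10, passes a dedicated generator to random_split so that the split is reproducible on its own. The names toy_dataset and split are hypothetical.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in dataset: 100 samples with 4 features each.
toy_dataset = TensorDataset(torch.randn(100, 4))

def split(seed):
    # A dedicated generator makes the split independent of other random calls.
    generator = torch.Generator().manual_seed(seed)
    return random_split(toy_dataset, [90, 10], generator=generator)

train_a, val_a = split(43)
train_b, val_b = split(43)

print(len(train_a), len(val_a))
# The same seed selects the same validation samples both times.
print(list(val_a.indices) == list(val_b.indices))
```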
In [8]:
batch_size=128
In [9]:
train_loader = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4, pin_memory=True)
val_loader = DataLoader(val_ds, batch_size*2, num_workers=4, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size*2, num_workers=4, pin_memory=True)
We created dataloaders for the training, validation and test sets. We set shuffle=True for the training dataloader so that the batches differ from one epoch to the next; this randomization helps the model generalize and can speed up training. The validation and test dataloaders are used only for evaluating the model, so there is no need to shuffle their images.
We also set pin_memory=True because the data will be moved from the CPU to the GPU; this parameter lets the DataLoader place the samples in page-locked memory, which speeds up the transfer.
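To see the effect of shuffle in isolation, here is a minimal sketch on a tiny synthetic dataset (the names toy_ds, shuffled and ordered are made up for the example). It also enables pin_memory only when a CUDA device is present, which is an assumption about how you might want to configure it rather than what the cell above does.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the image dataset: 10 samples, labels 0..9.
features = torch.randn(10, 3, 32, 32)
labels = torch.arange(10)
toy_ds = TensorDataset(features, labels)

# Pinning memory only pays off when a CUDA device will receive the batches.
use_pinned = torch.cuda.is_available()

shuffled = DataLoader(toy_ds, batch_size=4, shuffle=True, pin_memory=use_pinned)
ordered = DataLoader(toy_ds, batch_size=4, shuffle=False, pin_memory=use_pinned)

# Without shuffling, labels come back in dataset order on every pass.
ordered_labels = torch.cat([y for _, y in ordered]).tolist()
# With shuffling, every sample still appears exactly once per pass.
shuffled_labels = torch.cat([y for _, y in shuffled]).tolist()

print(len(ordered), ordered_labels)
print(sorted(shuffled_labels) == ordered_labels)
```

Note that the last batch holds only 2 samples here, since 10 is not a multiple of the batch size.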
In [10]:
for images, _ in train_loader:
    print('images.shape:', images.shape)
    plt.figure(figsize=(16,8))
    plt.axis('off')
    plt.imshow(make_grid(images, nrow=16).permute((1, 2, 0)))
    break
images.shape: torch.Size([128, 3, 32, 32])
Here, we visualize one batch of data using the make_grid helper function from torchvision, which tiles the 128 images into a single image with 16 images per row.
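The permute((1, 2, 0)) call in the cell above is needed because PyTorch stores images as (channels, height, width), while matplotlib's imshow expects the channels last. A minimal sketch on a made-up tensor shows what the reordering does:

```python
import torch

# A hypothetical single image in PyTorch layout: (channels, height, width).
chw = torch.rand(3, 32, 48)

# Reorder the axes to (height, width, channels) for plotting.
hwc = chw.permute(1, 2, 0)

print(chw.shape, hwc.shape)
# The pixel values are unchanged; only the indexing order differs.
print(torch.equal(chw[:, 5, 7], hwc[5, 7, :]))
```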
Model
In [12]:
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))
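The accuracy helper above takes the index of the largest score in each row as the predicted class and returns the fraction of predictions that match the labels. A small worked example, with scores and labels invented for illustration, makes this concrete (the function is restated so the snippet runs on its own):

```python
import torch

def accuracy(outputs, labels):
    # Same logic as the helper above: the predicted class is the index of the largest score.
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

# Hypothetical scores for 4 samples and 2 classes.
outputs = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
labels = torch.tensor([1, 0, 0, 0])

# Predictions are [1, 0, 1, 0], so 3 of the 4 samples are classified correctly.
acc = accuracy(outputs, labels)
print(acc)
```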
In [13]:
class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch
        out = self(images)                   # Generate predictions
        loss = F.cross_entropy(out, labels)  # Calculate loss
        return loss

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)                   # Generate predictions
        loss = F.cross_entropy(out, labels)  # Calculate loss
        acc = accuracy(out, labels)          # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()  # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()     # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(epoch, result['val_loss'], result['val_acc']))
In [14]:
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
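The inner loop of fit follows the usual three-call pattern: backward() computes gradients, step() applies them, and zero_grad() clears them before the next batch. The sketch below isolates that pattern on a made-up regression problem (learning y = 2x + 1), which is not part of this post's model. It also wraps the final evaluation in torch.no_grad(); the evaluate function above does not do this, so treat it as an optional refinement that avoids tracking gradients during evaluation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy problem: learn y = 2x + 1 from 64 points.
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(200):
    loss = F.mse_loss(model(x), y)
    loss.backward()        # compute gradients of the loss
    optimizer.step()       # update the parameters using those gradients
    optimizer.zero_grad()  # clear gradients so they do not accumulate
    losses.append(loss.item())

# Evaluation needs no gradients, so torch.no_grad() avoids tracking them.
with torch.no_grad():
    final_loss = F.mse_loss(model(x), y).item()

print(losses[0], final_loss)
```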
Using a GPU
In [15]:
torch.cuda.is_available()
Out[15]:
False
In [16]:
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')
In [17]:
device = get_default_device()
device
Out[17]:
device(type='cpu')
In [18]:
def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl:
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
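To illustrate how to_device handles a whole batch, here is a minimal sketch with a made-up (images, labels) pair. The function is restated so the snippet runs on its own, and the device falls back to the CPU when no GPU is present.

```python
import torch

def to_device(data, device):
    # Restated from above so the snippet runs on its own.
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# A hypothetical batch, shaped like what a DataLoader yields: (images, labels).
batch = (torch.zeros(4, 3, 32, 32), torch.tensor([0, 1, 2, 3]))

moved = to_device(batch, device)
images, labels = moved
print(type(moved).__name__, images.device.type, labels.device.type)
```

The recursion moves each element separately, and a tuple passed in comes back as a list, which still unpacks as images, labels in the training and validation steps.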
In [19]:
def plot_losses(history):
    losses = [x['val_loss'] for x in history]
    plt.plot(losses, '-x')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss vs. No. of epochs');
In [20]:
def plot_accuracies(history):
    accuracies = [x['val_acc'] for x in history]
    plt.plot(accuracies, '-x')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.title('Accuracy vs. No. of epochs');
In [21]:
train_loader = DeviceDataLoader(train_loader, device)
val_loader = DeviceDataLoader(val_loader, device)
test_loader = DeviceDataLoader(test_loader, device)
Training the Model
In [22]:
input_size = 3*32*32
output_size = 10
In [23]:
class CIFAR10Model(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_size, 256)
        self.linear2 = nn.Linear(256, 128)
        self.linear3 = nn.Linear(128, output_size)

    def forward(self, xb):
        # Flatten images into vectors
        out = xb.view(xb.size(0), -1)
        # Apply layers & activation functions
        out = self.linear1(out)
        out = F.relu(out)
        out = self.linear2(out)
        out = F.relu(out)
        out = self.linear3(out)
        return out
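To get a feel for the size of this network, the sketch below rebuilds the same three layer sizes as an nn.Sequential stand-in (not the class used in this post) and counts its parameters. Each linear layer holds in_features × out_features weights plus out_features biases.

```python
import torch
import torch.nn as nn

input_size = 3 * 32 * 32
output_size = 10

# An nn.Sequential stand-in with the same layer sizes as the model above.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(input_size, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, output_size),
)

# Count parameters from the module and by hand.
n_params = sum(p.numel() for p in mlp.parameters())
expected = (3072 * 256 + 256) + (256 * 128 + 128) + (128 * 10 + 10)

# A batch of 5 blank images produces one score per class for each image.
logits = mlp(torch.zeros(5, 3, 32, 32))
print(n_params, expected, logits.shape)
```

Almost all of the roughly 820 thousand parameters sit in the first layer, because every one of the 3072 input values connects to each of the 256 hidden units.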
In [24]:
model = to_device(CIFAR10Model(), device)
In [25]:
history = [evaluate(model, val_loader)]
history
Out[25]:
[{'val_loss': 2.3070812225341797, 'val_acc': 0.10445772111415863}]
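These starting numbers are about what one would expect from an untrained network on 10 balanced classes: an accuracy near 10% is chance level, and a model that assigns equal scores to every class has a cross-entropy loss of ln(10) ≈ 2.3026. The following sketch, using made-up labels, checks that value:

```python
import math
import torch
import torch.nn.functional as F

# Equal scores for all 10 classes, i.e. a model with no preference at all.
uniform_logits = torch.zeros(6, 10)
labels = torch.tensor([0, 1, 2, 3, 4, 5])

loss = F.cross_entropy(uniform_logits, labels).item()
print(loss, math.log(10))
```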
In [26]:
history += fit(10, 1e-1, model, train_loader, val_loader)
Epoch [0], val_loss: 1.9987, val_acc: 0.2751
Epoch [1], val_loss: 1.7617, val_acc: 0.3664
Epoch [2], val_loss: 1.6956, val_acc: 0.3943
Epoch [3], val_loss: 1.6709, val_acc: 0.4046
Epoch [4], val_loss: 1.6682, val_acc: 0.3942
Epoch [5], val_loss: 1.5915, val_acc: 0.4323
Epoch [6], val_loss: 1.7013, val_acc: 0.4064
Epoch [7], val_loss: 1.6545, val_acc: 0.4161
Epoch [8], val_loss: 1.5276, val_acc: 0.4652
Epoch [9], val_loss: 1.5144, val_acc: 0.4609
In [27]:
history += fit(10, 1e-2, model, train_loader, val_loader)
Epoch [0], val_loss: 1.4274, val_acc: 0.5000
Epoch [1], val_loss: 1.4257, val_acc: 0.4973
Epoch [2], val_loss: 1.4235, val_acc: 0.4998
Epoch [3], val_loss: 1.4174, val_acc: 0.5011
Epoch [4], val_loss: 1.4125, val_acc: 0.4992
Epoch [5], val_loss: 1.4164, val_acc: 0.5012
Epoch [6], val_loss: 1.4082, val_acc: 0.4998
Epoch [7], val_loss: 1.4069, val_acc: 0.4995
Epoch [8], val_loss: 1.4113, val_acc: 0.4964
Epoch [9], val_loss: 1.4012, val_acc: 0.5031
In [28]:
history += fit(10, 1e-3, model, train_loader, val_loader)
Epoch [0], val_loss: 1.3960, val_acc: 0.5085
Epoch [1], val_loss: 1.3960, val_acc: 0.5087
Epoch [2], val_loss: 1.3957, val_acc: 0.5056
Epoch [3], val_loss: 1.3943, val_acc: 0.5078
Epoch [4], val_loss: 1.3946, val_acc: 0.5085
Epoch [5], val_loss: 1.3939, val_acc: 0.5074
Epoch [6], val_loss: 1.3942, val_acc: 0.5044
Epoch [7], val_loss: 1.3944, val_acc: 0.5082
Epoch [8], val_loss: 1.3931, val_acc: 0.5083
Epoch [9], val_loss: 1.3933, val_acc: 0.5078
In [29]:
history += fit(10, 1e-4, model, train_loader, val_loader)
Epoch [0], val_loss: 1.3930, val_acc: 0.5091
Epoch [1], val_loss: 1.3928, val_acc: 0.5089
Epoch [2], val_loss: 1.3927, val_acc: 0.5091
Epoch [3], val_loss: 1.3927, val_acc: 0.5089
Epoch [4], val_loss: 1.3927, val_acc: 0.5093
Epoch [5], val_loss: 1.3926, val_acc: 0.5091
Epoch [6], val_loss: 1.3926, val_acc: 0.5093
Epoch [7], val_loss: 1.3925, val_acc: 0.5089
Epoch [8], val_loss: 1.3925, val_acc: 0.5089
Epoch [9], val_loss: 1.3925, val_acc: 0.5087
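The four fit calls above lower the learning rate by a factor of 10 every 10 epochs. As an alternative sketch, and not what this post's fit function does, the same schedule can be expressed with PyTorch's built-in StepLR scheduler. The dummy parameter below exists only so that the optimizer has something to manage.

```python
import torch

# A single dummy parameter, just so the optimizer has something to manage.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=1e-1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

lrs = []
for epoch in range(40):
    # Record the learning rate in effect for this epoch.
    lrs.append(optimizer.param_groups[0]['lr'])
    # ... one epoch of training would go here ...
    optimizer.step()
    scheduler.step()

print(lrs[0], lrs[10], lrs[20], lrs[30])
```

With this approach a single training loop of 40 epochs would replace the four separate calls, at the cost of changing fit to accept and step a scheduler.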
Plot the losses and the accuracies to check if you’re starting to hit the limits of how well your model can perform on this dataset.
In [30]:
plot_losses(history)
In [31]:
plot_accuracies(history)
Finally, evaluate the model on the test dataset and report its final performance.
In [32]:
evaluate(model, test_loader)
Out[32]:
{'val_loss': 1.3467928171157837, 'val_acc': 0.5208984613418579}