Garbage Classification using CNN Model
Introduction
A Convolutional Neural Network (CNN) is one of the most important deep learning algorithms and is widely used for image recognition. Much as our brain recognizes an object from its features, a CNN processes an image and identifies the object from the features it extracts. CNNs have largely replaced plain fully connected networks (ANNs) for image tasks because they usually give better accuracy. Convolution and pooling layers keep the parts of the image that carry the features needed to identify it, and in doing so shrink its spatial size. The learned “filters” (also called “kernels”) increase the depth of the representation, that is, the number of channels.
If you are confused by the concepts discussed above, I strongly recommend going through the following:
Deep Learning: https://ainewgeneration.com/category/deep-learning/
Youtube Tutorials: https://www.youtube.com/channel/UCZUUitv-us8zJ0_JyKM-bqg/videos
Table of Contents
- Introduction
- About the Garbage Dataset
- Importing the Libraries
- Loading the Dataset
- Visualizing the Dataset
- Creating the Model
- Training the Model
- Visualizing the Results
- Conclusion
About the Garbage Dataset
This dataset consists of images of garbage sorted into folders by class. The folder names become the labels of the images. We use torchvision's “ImageFolder” class to load these images into our workspace. Our aim is to train a model that classifies each image as cardboard, glass, paper, metal, plastic, or trash.
Here is what our dataset looks like:
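On disk, ImageFolder expects one sub-folder per class with the images inside it. The sketch below builds a tiny stand-in for that layout in a temporary directory and counts the files per class. The helper name count_images_per_class and the dummy file names are assumptions made for illustration; they are not part of the original dataset or of torchvision.

```python
import tempfile
from pathlib import Path


def count_images_per_class(root):
    """Return {class_folder_name: number_of_files} for an ImageFolder-style root."""
    root = Path(root)
    return {
        folder.name: sum(1 for f in folder.iterdir() if f.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


# Build a throwaway directory that mimics the layout (hypothetical file names).
demo_root = Path(tempfile.mkdtemp())
for class_name, n_files in [("cardboard", 2), ("glass", 3), ("trash", 1)]:
    class_dir = demo_root / class_name
    class_dir.mkdir()
    for i in range(n_files):
        (class_dir / f"{class_name}{i}.jpg").touch()

counts = count_images_per_class(demo_root)
print(counts)  # {'cardboard': 2, 'glass': 3, 'trash': 1}
```

Pointing the same helper at your own dataset folder would show how many images each of the six classes has, which is a quick check for class imbalance.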

Importing Libraries
We will need the following libraries to build our CNN model. We will discuss them in detail as we use them, but for now, import the following:
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
import torchvision
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torchvision.utils import make_grid
Loading the Dataset
First, we set the path to the dataset on your computer. I have given my data path as an example, so make sure you replace it with the right path on your machine. Second, we define the size that every image in the dataset will be resized to.
data_path = r"/Users/pranavi/Downloads/Data Science/Garbage Data/Garbage classification/Garbage classification"
img_size = 120
Transforming the images as they are loaded is critical, and we do it in the following steps:
- Resize: resizes the images to our desired shape, i.e., 120*120.
- RandomHorizontalFlip: a data augmentation step. It flips each image horizontally at random (with probability 0.5 by default), so the model sees more varied images and generalizes better.
- ToTensor(): converts each image into a tensor with values scaled to the range [0, 1].
- Normalize: standardizes each channel using the given mean and standard deviation values (the commonly used ImageNet statistics).
img_transform = transforms.Compose([
    transforms.Resize((img_size, img_size)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])])
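To make the normalization step concrete, here is a minimal sketch of the arithmetic it performs and of how to undo it. The helper names normalize and denormalize and the random stand-in image are assumptions for illustration; the real pipeline applies transforms.Normalize to actual photos.

```python
import torch

# Same per-channel statistics as in the transform above, shaped for broadcasting.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def normalize(img):
    """What Normalize computes: (pixel - mean) / std, channel by channel."""
    return (img - mean) / std


def denormalize(img):
    """Inverse of normalize, back to the [0, 1] range produced by ToTensor."""
    return img * std + mean


# A random stand-in for one RGB image after ToTensor (values in [0, 1]).
img = torch.rand(3, 120, 120)
normalized = normalize(img)
restored = denormalize(normalized)

print(normalized.min().item() < 0)               # normalized values can be negative
print(torch.allclose(restored, img, atol=1e-5))  # the round trip recovers the image
```

The inverse is handy before plotting, because matplotlib expects floating-point RGB values between 0 and 1.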
As mentioned above, our data is labelled by the folders the images are in. So we use ImageFolder to load the data into our workspace and then split it into training, validation and test datasets using random_split.
img_data = ImageFolder(root=data_path, transform=img_transform)
input: img_data.class_to_idx
output:
{'cardboard': 0, 'glass': 1, 'metal': 2, 'paper': 3, 'plastic': 4, 'trash': 5}
input: len(img_data)
output: 2527
train_data, val_data, test_data = random_split(img_data,[1800,627,100])
input: len(train_data)
output: 1800
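random_split shuffles differently on every run unless it is given a seeded generator. The sketch below shows a reproducible version of the same 1800/627/100 split on a dummy dataset of the same length; the seed value 42 and the names dummy_data, train_part, val_part and test_part are assumptions for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset with the same number of samples as img_data (2527).
dummy_data = TensorDataset(torch.zeros(2527, 1), torch.zeros(2527, dtype=torch.long))

# The three sizes must add up to len(dataset), otherwise random_split raises an error.
sizes = [1800, 627, 100]
assert sum(sizes) == len(dummy_data)

# Passing a seeded generator makes the split reproducible across runs.
generator = torch.Generator().manual_seed(42)
train_part, val_part, test_part = random_split(dummy_data, sizes, generator=generator)

print(len(train_part), len(val_part), len(test_part))  # 1800 627 100
```

Fixing the seed means the same images land in the test split every time, so results from different runs stay comparable.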
To avoid loading the whole dataset into memory at once, we use DataLoader to serve it in batches. This helps with memory management.
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
val_loader = DataLoader(val_data, batch_size=32, shuffle=False)
input: for images, labels in train_loader:
    print(images.shape, labels.shape)
    break
output:
torch.Size([32, 3, 120, 120]) torch.Size([32])
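With a batch size of 32, the split sizes do not divide evenly, so the last batch of each loader is smaller. A small sketch of that arithmetic, assuming the DataLoader default of drop_last=False (the helper name batch_layout is made up for this example):

```python
import math


def batch_layout(n_samples, batch_size):
    """Return (number_of_batches, size_of_last_batch) when drop_last is False."""
    n_batches = math.ceil(n_samples / batch_size)
    last_batch = n_samples - (n_batches - 1) * batch_size
    return n_batches, last_batch


print(batch_layout(1800, 32))  # training split   -> (57, 8)
print(batch_layout(627, 32))   # validation split -> (20, 19)
```

The first number is what len(train_loader) and len(val_loader) return, which is why the training code later divides the summed loss by the loader length to get a per-batch average.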
Visualizing the Dataset
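To visualize the images in our dataset, we create a function that we can use on both train_loader and val_loader. Because the images have been normalized, imshow clips pixel values outside its valid range, so the colors may look off.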
def show_img_batch(data):
    for images, labels in data:
        plt.figure(figsize=(20,15))
        plt.imshow(make_grid(images, nrow=16).permute(1,2,0))
        plt.show()
        break
input: show_img_batch(train_loader)

input: show_img_batch(val_loader)

Creating our CNN Model
It is time to create our CNN model. For our data, I have used 5 convolutional layers (each followed by ReLU and max pooling) and 3 fully connected layers. To determine the input size of the first fully connected layer, I applied the formula (n - f + 2p)/s + 1 to every convolutional layer, where n = input size, f = kernel size, p = padding and s = stride. Each MaxPool2d(2) then halves the result, rounding down.
class CNN(nn.Module):
    def __init__(self, kernel_size=3, out_channels1=64, out_channels2=128, out_channels3=256, out_channels4=352,
                 out_channels5=512):
        super(CNN, self).__init__()
        # Comments track the spatial size after each step (input is 120x120).
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=out_channels1, kernel_size=kernel_size),  # 120-2=118
            nn.ReLU(),
            nn.MaxPool2d(2),  # 59
            nn.Conv2d(in_channels=out_channels1, out_channels=out_channels2, kernel_size=kernel_size),  # 59-2=57
            nn.ReLU(),
            nn.MaxPool2d(2),  # 28
            nn.Conv2d(in_channels=out_channels2, out_channels=out_channels3, kernel_size=kernel_size),  # 26
            nn.ReLU(),
            nn.MaxPool2d(2),  # 13
            nn.Conv2d(in_channels=out_channels3, out_channels=out_channels4, kernel_size=kernel_size),  # 11
            nn.ReLU(),
            nn.MaxPool2d(2),  # 5
            nn.Conv2d(in_channels=out_channels4, out_channels=out_channels5, kernel_size=kernel_size),  # 3
            nn.ReLU(),
            nn.MaxPool2d(2))  # 1
        # 512 channels * 1 * 1 = 512 input features for the classifier.
        self.classifier = nn.Sequential(
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(128, 6))

    def forward(self, x):
        x = self.features(x)
        x = x.view(-1, 512)  # flatten [batch, 512, 1, 1] to [batch, 512]
        x = self.classifier(x)
        return x
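The size comments in the model can be checked by applying the output-size formula stage by stage. The sketch below does that in plain Python; conv_out and pool_out are hypothetical helper names, and the defaults (no padding, stride 1, pooling window 2) mirror the layers used above.

```python
def conv_out(n, f, p=0, s=1):
    """Output size of a convolution: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1


def pool_out(n, k=2):
    """Output size of MaxPool2d(k) with its default stride of k (rounds down)."""
    return n // k


size = 120
trace = [size]
for _ in range(5):  # five conv + pool stages
    size = conv_out(size, f=3)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

print(trace)  # [120, 118, 59, 57, 28, 26, 13, 11, 5, 3, 1]
```

The final feature map is 1*1 with 512 channels, which is why the first fully connected layer takes 512 inputs. Changing img_size would change this number, and the classifier would need to be adjusted to match.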
input: model = CNN()
print(model.parameters)
output:
<bound method Module.parameters of CNN(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1))
(1): ReLU()
(2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
(4): ReLU()
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
(7): ReLU()
(8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(9): Conv2d(256, 352, kernel_size=(3, 3), stride=(1, 1))
(10): ReLU()
(11): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(12): Conv2d(352, 512, kernel_size=(3, 3), stride=(1, 1))
(13): ReLU()
(14): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=512, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.1, inplace=False)
(3): Linear(in_features=256, out_features=128, bias=True)
(4): ReLU()
(5): Dropout(p=0.1, inplace=False)
(6): Linear(in_features=128, out_features=6, bias=True)
)
)>
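The printout lists the layers but not how many values each one learns. As a rough sketch, the parameters of any module can be counted like this (count_parameters is a made-up helper name; the two layers are copies of the first convolution and the last linear layer from the model):

```python
import torch.nn as nn


def count_parameters(module):
    """Total number of learnable values (weights and biases) in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


first_conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)
last_linear = nn.Linear(128, 6)

# Conv2d: 3*3*3 weights per filter * 64 filters + 64 biases = 1792
print(count_parameters(first_conv))
# Linear: 128*6 weights + 6 biases = 774
print(count_parameters(last_linear))
```

Calling the same helper on the whole model gives its total size.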
We also need to define the loss function for the model and pick an optimizer from torch.optim.
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
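Note that the model ends in a plain Linear layer without a softmax. That works because nn.CrossEntropyLoss takes raw scores (logits) and integer class labels, and applies log-softmax internally. A small sketch with made-up logits and labels illustrates the equivalence:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 6)            # made-up scores for 4 images, 6 classes
labels = torch.tensor([0, 3, 5, 1])   # made-up class indices

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, labels)

# The same value computed by hand: log-softmax, then negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(torch.allclose(loss, manual))  # True
```

Adding a softmax layer before this loss would therefore apply softmax twice and tends to slow down learning.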
Training our Model
Now that we have created our model, we write a function that trains it, validates it after every epoch, and plots the results. I train for 20 epochs, and we are using Adam as our optimizer.
def CNN_train(loss_fn, optimizer):
    epochs = 20
    training_loss = []
    training_acc = []
    testing_loss = []
    testing_acc = []
    for epoch in range(epochs):
        # Training pass: update the weights on every batch.
        train_acc = 0.0
        train_loss = 0.0
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            output = model(images)
            loss = loss_fn(output, labels)
            loss.backward()
            optimizer.step()
            predictions = torch.argmax(output,1)
            train_acc += (predictions==labels).sum().item()
            train_loss += loss.item()
        training_acc.append(train_acc/len(train_data))
        training_loss.append(train_loss/len(train_loader))
        # Validation pass: no weight updates, dropout disabled.
        model.eval()
        test_acc = 0.0
        test_loss = 0.0
        with torch.no_grad():
            for images, labels in val_loader:
                output = model(images)
                loss = loss_fn(output, labels)
                predictions = torch.argmax(output,1)
                test_acc += (predictions==labels).sum().item()
                test_loss += loss.item()
        testing_acc.append(test_acc/len(val_data))
        testing_loss.append(test_loss/len(val_loader))
        print("Epochs:{},Training Accuracy:{:.2f},Training Loss:{:.2f},Validation Accuracy:{:.2f},Validation Loss:{:.2f}.".
              format(epoch+1, train_acc/len(train_data), train_loss/len(train_loader), test_acc/len(val_data),
                     test_loss/len(val_loader)))
    # Plot accuracy and loss curves once training has finished.
    plt.title("Accuracy Vs Epochs")
    plt.plot(range(epochs), training_acc, label="Training Accuracy")
    plt.plot(range(epochs), testing_acc, label="Validation Accuracy")
    plt.legend()
    plt.xlabel("Epochs")
    plt.ylabel("Accuracy")
    plt.show()
    plt.title("Loss Vs Epochs")
    plt.plot(range(epochs), testing_loss, label="Validation Loss")
    plt.plot(range(epochs), training_loss, label="Training Loss")
    plt.legend()
    plt.xlabel("Epochs")
    plt.ylabel("Loss")
    plt.show()
input: CNN_train(loss_fn, optimizer)
output:
Epochs:1,Training Accuracy:0.23,Training Loss:1.71,Validation Accuracy:0.24,Validation Loss:1.68.
Epochs:2,Training Accuracy:0.29,Training Loss:1.63,Validation Accuracy:0.31,Validation Loss:1.68.
Epochs:3,Training Accuracy:0.35,Training Loss:1.55,Validation Accuracy:0.41,Validation Loss:1.48.
Epochs:4,Training Accuracy:0.43,Training Loss:1.41,Validation Accuracy:0.48,Validation Loss:1.38.
Epochs:5,Training Accuracy:0.45,Training Loss:1.40,Validation Accuracy:0.47,Validation Loss:1.35.
Epochs:6,Training Accuracy:0.48,Training Loss:1.34,Validation Accuracy:0.54,Validation Loss:1.28.
Epochs:7,Training Accuracy:0.53,Training Loss:1.24,Validation Accuracy:0.56,Validation Loss:1.18.
Epochs:8,Training Accuracy:0.53,Training Loss:1.20,Validation Accuracy:0.55,Validation Loss:1.18.
Epochs:9,Training Accuracy:0.56,Training Loss:1.16,Validation Accuracy:0.56,Validation Loss:1.16.
Epochs:10,Training Accuracy:0.58,Training Loss:1.10,Validation Accuracy:0.60,Validation Loss:1.12.
Epochs:11,Training Accuracy:0.59,Training Loss:1.08,Validation Accuracy:0.55,Validation Loss:1.20.
Epochs:12,Training Accuracy:0.62,Training Loss:1.00,Validation Accuracy:0.59,Validation Loss:1.17.
Epochs:13,Training Accuracy:0.66,Training Loss:0.92,Validation Accuracy:0.65,Validation Loss:1.02.
Epochs:14,Training Accuracy:0.67,Training Loss:0.89,Validation Accuracy:0.63,Validation Loss:1.03.
Epochs:15,Training Accuracy:0.69,Training Loss:0.82,Validation Accuracy:0.67,Validation Loss:1.01.
Epochs:16,Training Accuracy:0.72,Training Loss:0.80,Validation Accuracy:0.66,Validation Loss:1.04.
Epochs:17,Training Accuracy:0.70,Training Loss:0.82,Validation Accuracy:0.69,Validation Loss:0.98.
Epochs:18,Training Accuracy:0.75,Training Loss:0.72,Validation Accuracy:0.67,Validation Loss:0.98.
Epochs:19,Training Accuracy:0.77,Training Loss:0.65,Validation Accuracy:0.67,Validation Loss:1.08.
Epochs:20,Training Accuracy:0.78,Training Loss:0.63,Validation Accuracy:0.68,Validation Loss:0.98.
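The log is easier to read with two numbers pulled out of it: the epoch with the best validation accuracy, and the gap between training and validation accuracy at the end. The sketch below uses the accuracies copied from the log above; the variable names are chosen for this example only.

```python
# Accuracies copied from the training log above (epochs 1 to 20).
train_acc = [0.23, 0.29, 0.35, 0.43, 0.45, 0.48, 0.53, 0.53, 0.56, 0.58,
             0.59, 0.62, 0.66, 0.67, 0.69, 0.72, 0.70, 0.75, 0.77, 0.78]
val_acc = [0.24, 0.31, 0.41, 0.48, 0.47, 0.54, 0.56, 0.55, 0.56, 0.60,
           0.55, 0.59, 0.65, 0.63, 0.67, 0.66, 0.69, 0.67, 0.67, 0.68]

# Epoch with the highest validation accuracy (1-based, first one wins on ties).
best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1

# Gap between training and validation accuracy at the final epoch.
final_gap = round(train_acc[-1] - val_acc[-1], 2)

print(best_epoch, val_acc[best_epoch - 1])  # 17 0.69
print(final_gap)                            # 0.1
```

Validation accuracy levels off around 0.67 to 0.69 from epoch 15 onward while training accuracy keeps climbing, which suggests the model is starting to overfit and that training much longer with this setup is unlikely to help.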


Visualizing the Results
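def predict_img(img, model):
    model.eval()  # make sure dropout is disabled for inference
    with torch.no_grad():
        x = img.unsqueeze(0)  # add a batch dimension: [3, H, W] -> [1, 3, H, W]
        y = model(x)
    prediction = torch.argmax(y, dim=1).item()  # class index as a plain int
    return img_data.classes[prediction]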
input:
img, label = test_data[10]
plt.imshow(img.permute(1,2,0))
print("Actual Label", img_data.classes[label], "Predicted Label", predict_img(img, model))
output:
Actual Label plastic Predicted Label plastic

input:
img, label = test_data[64]
plt.imshow(img.permute(1,2,0))
print("Actual Label", img_data.classes[label], "Predicted Label", predict_img(img, model))
output:

Conclusion
Our model reached about 68 percent validation accuracy. To be honest, the images in our dataset are difficult to tell apart even with the naked eye, which is part of why the accuracy is lower than we would usually expect. This blog only aims to explain how to build a basic CNN model; if you want better results on this dataset, try building a more complex model and experimenting with the hyperparameters.
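The 100 images held out in test_data were only used for spot checks, so a natural last step is to score the model on all of them. The following is a minimal sketch of such an evaluation loop; the function name evaluate_accuracy, the tiny dummy model and the random dummy data are assumptions so that the example runs on its own, and they stand in for the real model and test split.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_accuracy(model, loader):
    """Fraction of correctly classified samples over a whole DataLoader."""
    model.eval()  # turn off dropout
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for images, labels in loader:
            predictions = torch.argmax(model(images), dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Tiny stand-ins so the sketch runs on its own (not the real model or data).
torch.manual_seed(0)
dummy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 6))
dummy_test = TensorDataset(torch.rand(100, 3, 8, 8), torch.randint(0, 6, (100,)))
dummy_loader = DataLoader(dummy_test, batch_size=32, shuffle=False)

accuracy = evaluate_accuracy(dummy_model, dummy_loader)
print(f"{accuracy:.2f}")
```

With the objects from this tutorial, the call would be evaluate_accuracy(model, DataLoader(test_data, batch_size=32)).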