Animal Classification

This project uses a deep learning model (ResNet18) to classify images of cats and dogs into one of 37 distinct breeds using the Oxford-IIIT Pet Dataset. It has two aims: to improve my skills with deep learning models and, in the long term, to build a web application where users can have my trained model identify the breed of their pets. I also want to understand where the model succeeds and fails, using confusion matrices and performance metrics such as precision, recall, and F1 score.

Dataset

Image Preprocessing

All training and test images are transformed before being passed to the model, but only the training images are augmented.

Training Augmentation

Testing No Augmentation

ResNet18

For this project, I used ResNet18 as the baseline model. As is common for ResNet18, it was initialised with pretrained weights (typically obtained from large datasets like ImageNet) and then fine-tuned on the Oxford-IIIT Pet Dataset to adapt it to the specific task of classifying 37 cat and dog breeds.

Why ResNet18

ResNet18 was chosen for its efficiency and simplicity, particularly when training on moderately sized datasets. It offers a good balance between computational cost and performance, making it ideal for experimentation and comparison in resource-constrained environments.

Metrics

To evaluate the performance of the classification model, several key metrics were used: accuracy, precision, recall, and F1 score. Each of these metrics provides a different perspective on how well the model performs, especially in multi-class classification tasks such as breed classification on the Oxford-IIIT Pet Dataset.

Accuracy

Accuracy is the proportion of correct predictions out of the total number of predictions. It is a general measure of performance and is easy to interpret. The model achieved an accuracy of 0.8302, which means approximately 83.02% of the test images were classified into the correct breed.

Precision (Weighted)

For a given class, precision is the fraction of images predicted as that class that actually belong to it. The weighted precision accounts for class imbalance by computing the precision for each class and averaging them, weighted by the number of true instances per class. The model achieved a precision score of 0.8398, meaning that when the model predicted a certain breed, it was correct about 83.98% of the time, on average, across all classes.
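To make the weighting concrete, here is a small pure-Python sketch of weighted precision. The function name and the toy labels are illustrative only and are not taken from the project; treating a class that is never predicted as having precision 0 is an assumed convention.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Per-class precision, averaged with weights equal to each class's true count."""
    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predictions made per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    score = 0.0
    for label, n_true in support.items():
        # Assumed convention: a class that is never predicted gets precision 0.
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        score += (n_true / total) * precision
    return score


# Toy labels, not results from the project.
y_true = ["pug", "pug", "pug", "beagle"]
y_pred = ["pug", "pug", "beagle", "beagle"]
print(weighted_precision(y_true, y_pred))  # 0.875
```

Here "pug" has precision 2/2 and "beagle" has precision 1/2, weighted by 3/4 and 1/4 respectively, giving 0.75 + 0.125 = 0.875.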

Recall (Weighted)

Recall measures how many of the actual instances of a class were correctly predicted by the model. Like precision, the weighted version accounts for class imbalance. The ResNet18 model achieved a recall score of 0.8302, meaning the model correctly identified 83.02% of the true instances of each breed, on a weighted average.
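The recall of 0.8302 matches the accuracy reported earlier, and this is expected: when each class's recall is weighted by its share of true instances, the class counts cancel and the sum reduces to overall accuracy. The sketch below checks this on toy labels; the function names and data are illustrative, not the project's code.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights equal to each class's true count."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    # recall_c = correct_c / support_c, weight_c = support_c / total,
    # so each term simplifies to correct_c / total.
    return sum((n / total) * (correct[label] / n) for label, n in support.items())


# Toy labels, not results from the project.
y_true = ["pug", "pug", "pug", "beagle", "beagle", "persian"]
y_pred = ["pug", "pug", "beagle", "beagle", "persian", "persian"]
print(math.isclose(weighted_recall(y_true, y_pred), accuracy(y_true, y_pred)))  # True
```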

F1 Score (Weighted)

The F1 score is the harmonic mean of precision and recall. It provides a balance between both metrics, especially when there is a trade-off between false positives and false negatives. The weighted F1 score averages the per-class F1 scores, weighting each class by its number of true instances. The model achieved a score of 0.8303, which indicates a good balance between precision and recall.
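As a rough illustration of how the weighted F1 score is assembled, the sketch below computes F1 per class and then takes the support-weighted average. It uses the equivalent form F1 = 2·TP / (predicted count + true count); the function name and toy labels are assumptions for the example only.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's true count."""
    support = Counter(y_true)      # true instances per class (TP + FN)
    predicted = Counter(y_pred)    # predictions per class (TP + FP)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    score = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        # Harmonic mean of precision and recall, written without
        # intermediate divisions: 2*TP / ((TP + FP) + (TP + FN)).
        f1 = 2 * tp / (predicted[label] + n_true)
        score += (n_true / total) * f1
    return score


# Toy labels, not results from the project.
y_true = ["pug", "pug", "pug", "beagle"]
y_pred = ["pug", "pug", "beagle", "beagle"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.7667
```

In this toy case "pug" has F1 = 4/5 and "beagle" has F1 = 2/3, so the weighted score is 0.75 × 0.8 + 0.25 × 0.6667 ≈ 0.7667.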

Visualisation

Graph showing performance metrics of ResNet18 on training data.

Confusion matrix of the true labels of the training images vs the predicted labels.
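For readers unfamiliar with how such a matrix is built, here is a minimal pure-Python sketch. Rows correspond to true labels and columns to predicted labels, which is a common convention but an assumption about the figure; the function name and toy data are illustrative.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy labels, not results from the project.
labels = ["beagle", "persian", "pug"]
y_true = ["pug", "pug", "pug", "beagle", "beagle", "persian"]
y_pred = ["pug", "pug", "beagle", "beagle", "persian", "persian"]

for label, row in zip(labels, confusion_matrix(y_true, y_pred, labels)):
    print(f"{label:>8}: {row}")
```

Correct predictions sit on the diagonal, so off-diagonal cells show which breeds get mistaken for which. In the toy data, one "pug" image lands in the "beagle" column.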

Real-World Application Potential

The findings of this project have practical implications for real-world applications such as:

Future Work

To improve accessibility and usability of the model, I am currently working on deploying it as a web app using AWS services such as S3, Lambda, and API Gateway. This will allow users to upload pet images and receive real-time breed predictions through a web interface.