Author: Jason Perlow
Agni: A simple front vs side x-ray classification model

Deep learning has been used extensively to automatically process and classify medical scans. As a contribution to this field, we open-source Agni, a simple yet accurate model that automatically determines whether a given patient x-ray is facing forwards (frontal) or sideways (lateral). According to Rajkomar et al., this problem is important because orientation metadata from different x-ray machines can be formatted inconsistently and is sometimes even non-existent.
Data
As data we use the open Montgomery County chest x-ray database. Since this dataset doesn’t include patient orientation information, we manually labelled a subset of the data, yielding 149 frontal and 101 lateral images. To allow for a neural network with fewer parameters, we rescaled the x-rays to 128x128 px. Since we have few training samples, we synthesised more data by randomly rotating images in the range of -5 to +5 degrees, shifting them by up to 5% (a fraction of 0.05) along their horizontal and vertical axes, and flipping them along their vertical axis.
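The augmentation steps above can be sketched with NumPy and SciPy. This is an illustrative stand-in, not the repo's actual pipeline (which applies the same transforms through Keras); the function name and parameters are our own:

```python
import numpy as np
from scipy.ndimage import rotate, shift

def augment(image, rng):
    """Apply one random augmentation pass to a 2-D x-ray array."""
    # Random rotation in [-5, +5] degrees, keeping the original shape.
    angle = rng.uniform(-5.0, 5.0)
    out = rotate(image, angle, reshape=False, mode="nearest")
    # Random shift of up to 5% of the image size along each axis.
    max_px = 0.05 * image.shape[0]
    dy, dx = rng.uniform(-max_px, max_px, size=2)
    out = shift(out, (dy, dx), mode="nearest")
    # Random mirror flip along the vertical axis (left-right).
    if rng.random() < 0.5:
        out = np.fliplr(out)
    return out

rng = np.random.default_rng(0)
x = rng.random((128, 128))
x_aug = augment(x, rng)
assert x_aug.shape == x.shape
```

Each call produces a slightly different image with the same label, which is what lets a small labelled set go further.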

How it works
Architecture
Convolutional neural networks (convnets) are a deep learning technique that uses a hierarchy of filter banks to extract visual features as input for a classifier. Structurally, Agni is a convnet with four convolutional layers and two dense affine layers. In particular, our architecture is based on the widely used VGG model, where each model “block” has two convolutional layers, each with the same number of filters, followed by a pooling (downsampling) layer. We also use batch normalization between layers, which lets us be less careful about initialisation, improves training speed, and regularises parameters to reduce over-fitting.
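A Keras sketch of this VGG-style layout is below. The filter counts (32 and 64) and dense width are illustrative assumptions, not the exact values from the Agni repo:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(128, 128, 1)):
    """Four conv layers in two VGG-style blocks, then two dense layers."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # Block 1: two conv layers with the same filter count, then pooling.
        layers.Conv2D(32, 3, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Conv2D(32, 3, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(),
        # Block 2: same pattern with more filters.
        layers.Conv2D(64, 3, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Conv2D(64, 3, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(),
        # Two dense (affine) layers; sigmoid for the binary output.
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

model = build_model()
```

The sigmoid output maps naturally onto a single frontal-vs-lateral probability.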
Training
To find weights that extract good features, weights are iteratively adjusted such that Agni best predicts the orientation of a given x-ray and orientation label pair. The extent to which a prediction is correct is measured using a loss function. Since our problem has two classes (frontal and lateral), we use binary cross-entropy loss. To adjust weights such that loss is minimised we use Adam, a modern gradient descent optimiser.
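The loss and optimiser choices above look like this in Keras. The tiny stand-in model and synthetic data are for illustration only; the actual training script fits the full convnet on labelled x-rays:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Minimal stand-in model with a single sigmoid output.
model = keras.Sequential([
    keras.Input(shape=(16,)),
    layers.Dense(1, activation="sigmoid"),
])

# Binary cross-entropy loss, minimised with the Adam optimiser.
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Synthetic (features, 0/1 orientation label) pairs.
x = np.random.rand(8, 16).astype("float32")
y = np.random.randint(0, 2, size=(8, 1))
history = model.fit(x, y, epochs=2, verbose=0)
```

Each `fit` epoch performs the iterative weight adjustment described above, driving the loss down.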
We visualise the training process using TensorBoard (as shown below). 100 epochs of training were sufficient for our model to converge to a near-zero validation loss:
Validation accuracy over 100 epochs.
Validation loss over 100 epochs.
Note that Agni’s validation loss decreases steadily, without many upward jumps. Our model took about 23 minutes to train on a quad-core CPU, so our experiments can be replicated by anyone with access to modern consumer-grade hardware.
Performance
Experimental results
Our model achieves a 0% false positive rate and a 100% true positive rate on the test set of 50 images, with a test loss of 0.019.
Here is a confusion matrix of front vs side classifications:
| | True frontal | True lateral |
|---|---|---|
| Predicted frontal | 21 | 0 |
| Predicted lateral | 0 | 29 |
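The reported rates follow directly from these counts. Treating lateral as the positive class (an assumption on our part; either convention gives the same perfect result here):

```python
# Derive the error rates from the confusion matrix above.
tp = 29  # true laterals predicted lateral
tn = 21  # true frontals predicted frontal
fp = 0   # frontals predicted lateral
fn = 0   # laterals predicted frontal

fpr = fp / (fp + tn)                    # false positive rate
tpr = tp / (tp + fn)                    # true positive rate
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(fpr, tpr, accuracy)  # → 0.0 1.0 1.0
```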
Comparison to related work
Previous work on front vs. side x-ray detection by Rajkomar et al. used pretrained models to achieve near-perfect accuracy. Since our work achieves similar results, it is comparable to previous work but much simpler: Agni uses a total of 6 layers and around 650 k parameters, compared with GoogLeNet’s 22 layers and 6.7 M parameters. Agni is therefore both more memory- and compute-efficient than previous approaches.
Code
Our repo is arranged into source (`src`) and data (`data`) folders. The `src` folder has two files, one for training (`train_Agni.py`) and another for testing (`test_Agni.py`). Our code is written in Python using the Keras 2 deep learning framework. Please refer to the `README.md` file in the repo for further details.
Future work
Go ahead and play around with our model parameters. See if you can delete some layers to make the model even simpler.
This project can easily be modified to become an arbitrary binary medical scan classifier (PET scan slices, MRI slices, …). Feel free to fork this project and classify your own data!