Install the tree dependency in case you don’t have it so we can view our directory structure (sudo apt install tree).
Looks like we have two folders which contain images of cells which are infected and healthy.
We can get further detail of the total number of images using the following code.
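A minimal sketch of such a count, assuming the images sit in one sub-folder per class. The folder names `Parasitized` and `Uninfected` and the `*.png` pattern are assumptions for illustration, so adjust them to match your download.

```python
from pathlib import Path


def count_images(base_dir, class_dirs=("Parasitized", "Uninfected"), pattern="*.png"):
    """Return {class folder name: number of matching image files}."""
    base = Path(base_dir)
    # Folder names and file pattern are assumed, not taken from the dataset docs.
    return {name: len(list((base / name).glob(pattern))) for name in class_dirs}
```

Calling `count_images("cell_images")` (a hypothetical path) would return one count per class folder.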
Looks like we have a balanced dataset of 13,779 malaria and 13,779 non-malaria (uninfected) cell images.
Let’s build a dataframe from this which will be of use to us shortly as we start building our datasets.
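One way this dataframe could look, as a rough sketch with pandas. The column names `filename` and `label`, and the label strings `malaria` and `healthy`, are assumptions made for this example.

```python
import pandas as pd


def build_files_dataframe(infected_files, healthy_files, seed=42):
    """Combine both file lists into one shuffled DataFrame with a label column."""
    infected, healthy = list(infected_files), list(healthy_files)
    df = pd.DataFrame({
        "filename": infected + healthy,
        "label": ["malaria"] * len(infected) + ["healthy"] * len(healthy),
    })
    # Shuffle the rows so both classes are interleaved before any splitting.
    return df.sample(frac=1, random_state=seed).reset_index(drop=True)
```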

Build and Explore Image Datasets

To build deep learning models we need training data, but we also need to test the model’s performance on unseen data.
We will use a 60:10:30 split for the train, validation and test datasets respectively.
We will leverage the train and validation datasets during training and check the performance of the model on the test dataset.
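To make the 60:10:30 split concrete, here is a small NumPy sketch that shuffles row indices and cuts them into three parts. It is a plain random split under an assumed seed, not necessarily the utility used for the article (a library helper such as scikit-learn's `train_test_split` would work as well).

```python
import numpy as np


def split_indices(n_samples, train_frac=0.6, val_frac=0.1, seed=42):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]  # whatever remains, roughly 30%
    return train, val, test
```

The returned index arrays can be used to select rows from the file dataframe, for example with `df.iloc[train]`.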
Now obviously the images will not be of equal dimensions given blood smears and cell images will vary based on the human, the test method and the orientation in which the photo was taken.
Let’s get some summary statistics of our training dataset to decide optimal image dimensions (remember we don’t touch the test dataset at all!).
We apply parallel processing to speed up the image read operations and based on the summary statistics, we have decided to resize each image to 125×125 pixels.
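A sketch of how such statistics could be gathered, assuming Pillow for reading images and a thread pool for the parallel part. The helper names are made up for this example, and the article's own code may have used a different image library.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from PIL import Image


def image_size(path):
    """Return (height, width) of one image; Pillow opens files lazily."""
    with Image.open(path) as img:
        width, height = img.size
    return height, width


def dimension_summary(paths, workers=8):
    """Read image sizes in parallel and summarise them per axis."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dims = np.array(list(pool.map(image_size, paths)))
    # Each entry holds two values: [height statistic, width statistic].
    return {
        "min": dims.min(axis=0),
        "mean": dims.mean(axis=0),
        "median": np.median(dims, axis=0),
        "max": dims.max(axis=0),
    }
```

Only the training file paths should be passed in here, so the test set stays untouched.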
Let’s load up all our images and resize them to these fixed dimensions.
We leverage parallel processing again to speed up computations pertaining to image load and resizing.
Finally we get our image tensors of desired dimensions as depicted in the preceding output.
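The load-and-resize step could be sketched as follows, again assuming Pillow and a thread pool. `load_and_resize` and `load_dataset` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from PIL import Image

IMG_DIMS = (125, 125)  # target (width, height) chosen in the text


def load_and_resize(path, size=IMG_DIMS):
    """Load one image as RGB, resize it, and return a float32 array."""
    with Image.open(path) as img:
        resized = img.convert("RGB").resize(size)
    return np.asarray(resized, dtype=np.float32)


def load_dataset(paths, size=IMG_DIMS, workers=8):
    """Load all images in parallel into one (n, height, width, 3) tensor."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        images = list(pool.map(lambda p: load_and_resize(p, size), paths))
    return np.stack(images)
```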
We can now view some sample cell images to get an idea of what our data looks like.
Based on the sample images above, we can notice some subtle differences between malaria and healthy cell images.
We will basically make our deep learning models try and learn these patterns during model training.
We set up some basic configuration settings before we start training our models.
We fix our image dimensions, batch size, epochs and encode our categorical class labels.
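As an illustration, the configuration could look something like this. The batch size, epoch count and label strings are assumed values, and the helpers are a NumPy stand-in for whatever encoder the original code used.

```python
import numpy as np

BATCH_SIZE = 64               # assumed value
EPOCHS = 25                   # assumed value
INPUT_SHAPE = (125, 125, 3)   # 125x125 RGB images, as decided above


def scale_images(images):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0


def encode_labels(labels, classes=("healthy", "malaria")):
    """Turn string labels into integer ids following the order of `classes`."""
    lookup = {name: i for i, name in enumerate(classes)}
    return np.array([lookup[label] for label in labels], dtype=np.int64)
```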
The alpha version of TensorFlow 2.0 was released in March 2019, just a couple of weeks before this article was written, and it gives us a perfect excuse to try it out!

Deep Learning Model Training Phase

In the model training phase, we will build several deep learning models, train them on our training data and compare their performance on the validation data.
We will then save these models and use them later on again in the model evaluation phase.

Model 1: CNN from Scratch

Our first malaria detection model will be a basic convolutional neural network (CNN) built and trained from scratch.
First let’s define our model architecture.
Based on the architecture in the preceding code, our CNN model has three convolution and pooling layers followed by two dense layers and dropout for regularization.
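For readers without the original listing at hand, here is a sketch of an architecture matching that description in tf.keras. The filter counts, dense layer sizes, dropout rate and optimizer are assumptions; only the overall layout (three convolution and pooling blocks, two dense layers, dropout) comes from the text.

```python
import tensorflow as tf


def build_basic_cnn(input_shape=(125, 125, 3)):
    """Three conv/pool blocks, then two dense layers with dropout."""
    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.3),
        # Single sigmoid unit: probability of the positive (malaria) class.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_basic_cnn().summary()` prints the layer-by-layer shapes and parameter counts.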
Let’s train our model now!

We get a validation accuracy of 95.6%, which is pretty good, though our model looks to be overfitting slightly, judging by our training accuracy of around 99%.
We can get a clear perspective on this by plotting the training and validation accuracy and loss curves.
Learning Curves for Basic CNN

Thus we can see that after the fifth epoch, things don’t seem to improve a whole lot overall.
Let’s save this model for future evaluation.

Deep Transfer Learning

Just like humans have an inherent capability to transfer knowledge across tasks, transfer learning enables us to utilize knowledge from previously learned tasks and apply it to newer, related ones, even in the context of machine learning or deep learning.
Comprehensive coverage of transfer learning is available in my article and my book for readers interested in doing a deep dive.
Ideas for deep transfer learning

For the purpose of this article, the idea is: can we leverage a pre-trained deep learning model (which was trained on a large dataset, like ImageNet) to solve the problem of malaria detection by applying and transferring its knowledge in the context of our problem?

We will apply the two most popular strategies for deep transfer learning.

- Pre-trained Model as a Feature Extractor
- Pre-trained Model with Fine-tuning

We will be using the pre-trained VGG-19 deep learning model, developed by the Visual Geometry Group (VGG) at the University of Oxford, for our experiments.
A model like VGG-19 has already been pre-trained on a huge dataset (ImageNet) with a lot of diverse image categories.
Considering this fact, the model should have learned a robust hierarchy of features, which are spatial, rotation, and translation invariant with regard to features learned by CNN models.
Hence, the model, having learned a good representation of features for over a million images, can act as a good feature extractor for new images suitable for computer vision problems, just like malaria detection!

Let’s briefly discuss the VGG-19 model architecture before unleashing the power of transfer learning on our problem.

Understanding the VGG-19 model

The VGG-19 model is a 19-layer (convolution and fully connected) deep learning network trained on the ImageNet database, which was built for the purpose of image recognition and classification.
This model was built by Karen Simonyan and Andrew Zisserman and is mentioned in their paper titled ‘Very Deep Convolutional Networks for Large-Scale Image Recognition’.
I recommend that all interested readers go and read the excellent literature in this paper.
The architecture of the VGG-19 model is depicted in the following figure.
VGG-19 Model Architecture

You can clearly see that we have a total of 16 convolution layers using 3 x 3 convolution filters, along with max pooling layers for downsampling, and a total of two fully connected hidden layers of 4096 units each, followed by a dense layer of 1000 units, where each unit represents one of the image categories in the ImageNet database.
We do not need the last three layers since we will be using our own fully connected dense layers to predict malaria.
We are more concerned with the first five blocks, so that we can leverage the VGG model as an effective feature extractor.
For one of the models, we will use it as a simple feature extractor by freezing all five convolution blocks to make sure their weights don’t get updated during training.
For the last model, we will apply fine-tuning to the VGG model, where we will unfreeze the last two blocks (Block 4 and Block 5) so that their weights get updated in each iteration (per batch of data) as we train our own model.

Model 2: Pre-trained Model as a Feature Extractor

For building this model, we will leverage TensorFlow to load up the VGG-19 model and freeze the convolution blocks so that we can use it as an image feature extractor.
We will plug in our own dense layers at the end for performing the classification task.
Thus it is quite evident from the preceding output that we have a lot of layers in our model and we will be using the frozen layers of the VGG-19 model as feature extractors only.
You can use the following code to verify how many layers in our model are indeed trainable and how many total layers are present in our network.
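Here is a sketch of how the frozen model and the layer check could be written with tf.keras. The dense head (two 512-unit layers with dropout) and the optimizer are assumptions; `weights="imagenet"` downloads the pre-trained weights on first use.

```python
import tensorflow as tf


def build_frozen_vgg19(input_shape=(125, 125, 3), weights="imagenet"):
    """VGG-19 convolution base with frozen weights plus a new dense classifier."""
    vgg = tf.keras.applications.VGG19(
        include_top=False, weights=weights, input_shape=input_shape)
    for layer in vgg.layers:
        layer.trainable = False  # freeze all five convolution blocks

    # New classification head (layer sizes are assumed values).
    x = tf.keras.layers.Flatten()(vgg.output)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs=vgg.input, outputs=out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def layer_counts(model):
    """Return (total layers, layers flagged as trainable)."""
    return len(model.layers), sum(1 for layer in model.layers if layer.trainable)
```

With this layout, only the newly added head should show up as trainable in `layer_counts(build_frozen_vgg19())`.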
We will now train our model using similar configurations and callbacks which we used in our previous model.
Refer to my GitHub repository for the complete code to train the model.
We observe the following plots showing the model’s accuracy and loss.
Learning Curves for frozen pre-trained CNN

This shows us that our model is not overfitting as much as our basic CNN model, but the performance is not really better and is in fact slightly lower than that of our basic CNN model.
Let’s save this model now for future evaluation.

Model 3: Fine-tuned Pre-trained Model with Image Augmentation

In our final model, we will fine-tune the weights of the layers present in the last two blocks of our pre-trained VGG-19 model.
Besides that, we will also introduce the concept of image augmentation.
The idea behind image augmentation is exactly as the name sounds.
We load in existing images from our training dataset and apply some image transformation operations to them, such as rotation, shearing, translation, zooming, and so on, to produce new, altered versions of existing images.
Due to these random transformations, we don’t get the same images each time.
We will leverage an excellent utility called ImageDataGenerator in tf.keras that can help us build image augmentors.
We do not apply any transformations on our validation dataset except scaling the images (which is mandatory), since we will be using it to evaluate our model performance per epoch.
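A rough sketch of the two generators described above. All transformation ranges are assumed values picked for illustration, not the settings used for the results in this article.

```python
import tensorflow as tf

ImageDataGenerator = tf.keras.preprocessing.image.ImageDataGenerator

# Training images are rescaled and randomly transformed on every pass.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=10,        # degrees; every range here is an assumed value
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    zoom_range=0.05,
    horizontal_flip=True,
    fill_mode="nearest",
)

# Validation images are only rescaled, so per-epoch evaluation stays comparable.
val_datagen = ImageDataGenerator(rescale=1.0 / 255)
```

Batches would then come from something like `train_datagen.flow(train_data, train_labels, batch_size=64, shuffle=True)`, where `train_data` and `train_labels` are hypothetical names for the unscaled image tensor and the encoded labels.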
For a detailed explanation of image augmentation in the context of transfer learning, feel free to check out my article if needed.
Let's take a look at some sample results from a batch of image augmentation transforms.
Sample Augmented Images

You can clearly see the slight variations of our images in the preceding output.
We will now build our deep learning model, making sure the last two blocks of the VGG-19 model are trainable.
We reduce the learning rate in our model since we don’t want to make too large weight updates to the pre-trained layers when fine-tuning.
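The fine-tuning setup could be sketched like this. It relies on the layer names Keras gives VGG-19 (`block1_conv1` through `block5_pool`) to pick out Block 4 and Block 5; the dense head, the RMSprop optimizer and the learning rate of 1e-5 are assumptions.

```python
import tensorflow as tf


def build_finetune_vgg19(input_shape=(125, 125, 3), weights="imagenet",
                         learning_rate=1e-5):
    """VGG-19 with blocks 4 and 5 trainable and a new dense classifier on top."""
    vgg = tf.keras.applications.VGG19(
        include_top=False, weights=weights, input_shape=input_shape)
    for layer in vgg.layers:
        # Only layers in the last two blocks keep receiving weight updates.
        layer.trainable = layer.name.startswith(("block4", "block5"))

    x = tf.keras.layers.Flatten()(vgg.output)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs=vgg.input, outputs=out)
    # A small learning rate keeps updates to the pre-trained weights gentle.
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate),
        loss="binary_crossentropy", metrics=["accuracy"])
    return model
```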
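The training process of this model will be slightly different since we are using data generators and hence we will be leveraging the fit_generator(…) function (in later TensorFlow 2.x releases, fit(…) accepts generators directly and fit_generator(…) is deprecated).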
This looks to be our best model yet, giving us a validation accuracy of almost 96.5%, and based on the training accuracy, it doesn’t look like our model is overfitting as much as our first model.
This can be verified with the following learning curves.
Learning Curves for fine-tuned pre-trained CNN

Let’s save this model now so that we can use it for model evaluation on our test dataset shortly.

This completes our model training phase and we are now ready to test the performance of our models on the actual test dataset!

Deep Learning Model Performance Evaluation Phase

We will now evaluate the three different models that we just built in the training phase by making predictions with them on the data from our test dataset, because just validation is not enough!

We have also built a nifty utility module called model_evaluation_utils, which we will be using to evaluate the performance of our deep learning models with relevant classification metrics.
The first step here is, obviously, to scale our test data.
The next step involves loading up our saved deep learning models and making predictions on the test data.
The final step is to leverage our model_evaluation_utils module and check the performance of each model with relevant classification metrics.
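The model_evaluation_utils module itself is not shown here, so as a stand-in, this NumPy sketch computes the usual metrics from predicted probabilities. The function name and the 0.5 threshold are assumptions, and the scores are for the positive class only, which may differ from the averaging the module uses.

```python
import numpy as np


def binary_classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    y_true = np.asarray(y_true).ravel()
    # Turn sigmoid outputs into hard 0/1 predictions.
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

The output of `model.predict(test_data_scaled)` (hypothetical variable name) can be passed in directly as `y_prob`.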
Looks like our third model performs the best out of all three of our models on the test dataset, giving a model accuracy as well as an f1-score of 96%, which is pretty good and quite comparable to the more complex models mentioned in the research paper and articles we mentioned earlier!

Conclusion

We looked at an interesting real-world medical imaging case study of malaria detection in this article.
Malaria detection by itself is not an easy procedure and the availability of the right personnel across the globe is also a serious concern.
We looked at easy-to-build open-source techniques leveraging AI which can give us state-of-the-art accuracy in detecting malaria, thus enabling AI for social good.
I encourage everyone to check out the articles and research papers mentioned in this article, without which it would have been impossible for me to conceptualize and write this article.
Let’s hope for more adoption of open-source AI capabilities across healthcare, making it cheaper and accessible for everyone across the world!

This article has been adapted from my own article published previously on opensource.com.

If you are interested in running or adopting all the code used in this article, it is available on my GitHub repository.
Remember to download the data from the official website.