While I am certainly getting answers, I am also curious about finding that missing feature or piece of data that would improve the model.
There is an important caveat.
We are now at the point where the model is teaching us about the data.
Sometimes we can get stuck in a mindset where the output is the end of the process.
If we fall into that trap, we might miss a fantastic opportunity to create a positive feedback loop.
(30-second PowerPoint drawing)
Therefore, we sit a little wiser and a little more confident in the 4th phase.
Given this data, what decisions should I make?
- More training
- More images
- A more powerful architecture
However, I am going to look at a different dataset.
Let’s get up close and personal with endoscope images of people’s insides.
Get the dataset, see a whole lot of sh… stuff
For anyone else interested in gastroenterology, I recommend looking into the Kvasir Dataset.
A good description from their site:
“the dataset containing images from inside the gastrointestinal (GI) tract. The collection of images is classified into three important anatomical landmarks and three clinically significant findings. In addition, it contains two categories of images related to endoscopic polyp removal. Sorting and annotation of the dataset is performed by medical doctors (experienced endoscopists).”
Perfect, now we have a dataset. One that contains stool, yes, but also something exciting and complete.
There are 8 classes in this dataset for us to classify.
When you download the data, you start with something like this:
Depending on the data, there are many ways we could format these folders.
Fast.ai has many commands for input; I chose to use the from_folder command and set up the data as below.
We are moving lots of images, so I would recommend scripting it (a code example is in the notebook). Go through these steps:
Step 1: create the train and valid folders.
Step 2: move all of the class folders into the new train folder.
Step 3: inside the valid folder, create an identically named folder for each of the class folders in train.
Step 4: move 10% of the images from each class folder in train into the corresponding class folder inside valid.
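The four steps above can be scripted with the standard library alone. Here is a minimal sketch; the folder names and the 10% split follow the steps, while the function name, the random seed, and the choice to copy (rather than move) files are my own:

```python
import os
import random
import shutil

def split_train_valid(src_root, dest_root, valid_pct=0.10, seed=42):
    """Build train/ and valid/ trees from class folders under src_root,
    sending ~valid_pct of each class's images to valid/."""
    random.seed(seed)
    for cls in sorted(os.listdir(src_root)):
        cls_src = os.path.join(src_root, cls)
        if not os.path.isdir(cls_src):
            continue
        # Steps 1-3: create matching train/<class> and valid/<class> folders
        cls_train = os.path.join(dest_root, "train", cls)
        cls_valid = os.path.join(dest_root, "valid", cls)
        os.makedirs(cls_train, exist_ok=True)
        os.makedirs(cls_valid, exist_ok=True)
        # Step 4: pick ~10% of each class for valid, the rest goes to train
        images = sorted(os.listdir(cls_src))
        valid_set = set(random.sample(images, max(1, int(len(images) * valid_pct))))
        for name in images:
            dest = cls_valid if name in valid_set else cls_train
            shutil.copy(os.path.join(cls_src, name), os.path.join(dest, name))
```

Copying instead of moving keeps the originals intact in case the split needs to be redone.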
It should look exactly like the image above (except models and tests).
Now that everything is in place, we can go back and set up the notebook.
Diving into the notebook
You can pick up my Jupyter notebook from GitHub here.
Transforms: Getting the most out of an image
The first major decision we have to make is how to handle transforms for images.
If we make random changes to an image (rotate, change color, flip, etc.) we can make it seem like we have more images to train from, and we are less likely to overfit.
Is it as good as getting more images? No.
However, it is much quicker.
When choosing which transforms to use we want something that makes sense.
Here are some examples of normal transforms of the same image if we were looking at dog breeds.
If any of these came into the dataset on its own, we would think it makes sense, and now we have 8 images instead of 1.
What if, in the transformation madness, we go too far? We could get images like those below that are a little too extreme.
We wouldn’t want to use many of these because they are not clear and are not oriented in a direction we would expect real data to come in.
While a dog could be tilted, it would never be upside down.
For the endoscope images, we are not as concerned about an image being upside down or over-tilted.
An endoscope goes all over the place and can have a 360-degree rotation here, so I went wild with rotational transforms.
I even went a bit wild with the color, as the lighting inside the body would be different.
All of these seem to be in the realm of possibility.
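To make the contrast concrete, here is how the two augmentation policies might differ. The argument names mirror fastai v1’s get_transforms(); the exact values are my guesses, not tuned settings:

```python
# Plausible transforms for dog photos: mirror, slight tilt, mild lighting.
dog_tfms = dict(
    do_flip=True,      # a mirrored dog still looks like a dog
    flip_vert=False,   # a dog is never upside down
    max_rotate=10.0,   # a slight tilt is believable
    max_lighting=0.2,
)

# Endoscope images: the scope can be at any orientation, so go wild
# with rotation and allow a bit more lighting variation.
endoscope_tfms = dict(
    do_flip=True,
    flip_vert=True,    # upside down is plausible inside the body
    max_rotate=180.0,  # full 360-degree rotation is realistic
    max_lighting=0.3,  # lighting inside the body varies
)
```

With fastai v1, these settings could be passed along as get_transforms(**endoscope_tfms) when building the data bunch.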
Example of dyed polyps (note: the green box denotes how far the scope has traveled, so these transforms might be cutting off the value it could have provided.)

The Resnets
Nowadays, Resnet is popularly used for image classification.
It has a number after it which corresponds to the number of layers.
Many better articles exist about Resnet, so to simplify for this article:
- More layers = more accurate (Hooray!)
- More layers = more compute and time needed (Boo.)
Therefore, Resnet34 has 34 layers of image-finding goodness.
The important thing right now is that it works and it is the fastest.
Let’s look at some code. We see that after the cycles, and about 7 minutes, we get to 87% accuracy.
Not bad at all.
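The training step can be sketched roughly as follows, assuming fastai v1-style APIs (ImageDataBunch.from_folder, cnn_learner); the path, image size, and epoch count below are placeholders, not the exact values used:

```python
def train_baseline(data_path="data/kvasir", epochs=4):
    """Fast baseline: Resnet34 on the train/valid folder layout.

    A sketch assuming fastai v1; the path, size, and epoch count
    here are illustrative placeholders.
    """
    from fastai.vision import (ImageDataBunch, cnn_learner, models,
                               accuracy, get_transforms)

    data = ImageDataBunch.from_folder(
        data_path, train="train", valid="valid",
        ds_tfms=get_transforms(flip_vert=True, max_rotate=180.0),
        size=224,
    )
    learn = cnn_learner(data, models.resnet34, metrics=accuracy)
    learn.fit_one_cycle(epochs)  # the training cycles mentioned above
    return learn
```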
Not being a doctor, I have a very untrained eye looking at these.
I have no clue what to look for, what counts as a categorization error, or if the data is any good.
So I went straight to the confusion matrix.
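Getting the confusion matrix takes only a couple of lines in fastai v1; a sketch, assuming learn is the trained learner:

```python
def inspect_errors(learn):
    """Plot the confusion matrix and list the most-confused class pairs
    (a fastai v1 sketch; ClassificationInterpretation does the work)."""
    from fastai.vision import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_confusion_matrix(figsize=(6, 6))
    return interp.most_confused(min_val=2)  # pairs mixed up at least twice
```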
Of the 8 classes, 2 sets of 2 are often confused with each other.
As a baseline, I could only see if they are dyed, polyps, or something else.
So compared to my personal baseline of 30% accuracy the machine is getting an amazing 87%.
After looking at the images from these 2 sets side by side, you can see why.
(Being medical images, they might be NSFW; they are present in the Jupyter notebook.)
The dyed sections are being confused with each other.
This type of error can be expected.
They are both blue and look very similar to each other.
Esophagitis is hard to distinguish from a normal Z-line. Perhaps esophagitis presents redder than a normal Z-line? I’m not certain.
(Note: check the outliers in the confusion matrix; I am only predicting one class per image. There might be more than one valid label, e.g. a dyed section may also contain a polyp. More about multi-label classification in the next steps at the end.)

More layers, more images, more power!
We started out built for speed.
Using a smaller dataset, fewer transformations, fewer epochs, and a faster architecture, we were able to see whether we were on the right path.
Now that we see our super fast model worked let’s switch over to the powerhouse.
I increased the size of the dataset from v1 to v2.
The larger set doubles the number of images available from 4000 to 8000.
(Note: all examples shown are from v2.)

Transform everything that makes sense.
There are lots of things you can tweak.
Since the images in the dataset are relatively large, I decided to try making the input size bigger.
Although this would be slower, I was curious if it would be better able to pick out little details.
This hypothesis still requires some experimentation.
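In fastai v1, the experiment is just a larger size argument when building the data bunch; 448 below is an illustrative guess at “bigger than the usual 224,” not the article’s exact value:

```python
def make_data(path="data/kvasir", size=448):
    """Build the data bunch with a larger input size (fastai v1 sketch).

    A bigger `size` keeps more fine detail for the model to pick out,
    at the cost of slower training.
    """
    from fastai.vision import ImageDataBunch, get_transforms

    return ImageDataBunch.from_folder(
        path, train="train", valid="valid",
        ds_tfms=get_transforms(flip_vert=True, max_rotate=180.0),
        size=size,  # e.g. 448 instead of the more common 224
    )
```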
More and more epochs.
If you remember from before, Resnet50 has more layers but requires more compute, and is therefore slower.
So we change the model from Resnet34 to Resnet50.
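The swap really is a one-argument change; a sketch, assuming fastai v1’s cnn_learner:

```python
def build_learner(data, big=True):
    """Swap Resnet34 for Resnet50 by changing a single argument
    (fastai v1 sketch; everything else stays identical)."""
    from fastai.vision import cnn_learner, models, accuracy

    arch = models.resnet50 if big else models.resnet34
    return cnn_learner(data, arch, metrics=accuracy)
```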
It really is that easy. Then we are ready to fire!
Many epochs later… 93% accurate! Not too bad; let’s look at the confusion matrix again.
It looks like the problem with dyed classification has gone away, but the esophagitis errors remain.
In fact, the number of errors got worse in some of my iterations.
Conclusion and Follow-up:
It is very easy to transfer what the new Fast.ai course teaches to a different dataset. Deep learning is much more accessible than ever before.
When testing, make sure you start with a fast proof of concept to check that everything is on the right path, then turn up the power later.
Create a positive feedback loop, both to make sure you are oriented correctly and as a mechanism to force yourself to learn more about the dataset.
You will have a much richer experience in doing so.
Some observations on this dataset:
Some of these classifications can benefit from a feature describing how far the endoscope is in the body.
Major landmarks in the body would help to classify the images.
The small green box on the bottom left of the images is a map describing where the endoscope is and might be a useful feature to explore.
I think some of these image classes overlap, and that is causing some misclassification.
For example, I would estimate that the dyed areas often have polyps in them too.
If you haven’t seen the new fast.ai course, take a look. It took me more time to write this post than it did to code the program; it was that simple.
Resources
- Github Notebook
- Kvasir Dataset
- FastAI
- PyTorch
- Youtube video on this topic