YOLO Object Detection in MATLAB, Start to Finish

Downloading and implementing the YOLO object detection network in MATLAB

James Browning
Jan 7

Joseph Redmon’s YOLO algorithm caught my attention when I was looking for a way to rapidly count biological cells in a 3D printed skin organoid.
The YOLO algorithm has the advantage of being capable of recognizing and locating multiple (up to 49 in my implementation) objects in a single image, which makes it an ideal framework for counting cells in microscope images.
However, before I was able to train a YOLO-like network for cell detection, I needed to implement the original YOLO in MATLAB, which I am using for this project.
What I thought would be a fairly straightforward task ended up being a bit of an exercise in reverse engineering.
Although YOLO is available to download from MathWorks, few details of the implementation are available.
If you are interested in object detection in MATLAB (and have the appropriate toolboxes), this article provides a recipe along with some insight into the behavior and use of YOLO.
If you are completely new to YOLO, here is the original YOLO paper followed by a great description by Andrew Ng to get you started.

Download YOLONET and modify for regression

The YOLO network that is available from MathWorks requires modification before it can be used for object detection.
The network you will download ends with the final layers of a classification algorithm: a softmax layer and a classification layer.
However, YOLO is actually structured as a CNN regression algorithm.
The output should be an array or vector of numbers between 0 and 1 that encode probabilities and bounding box information for objects detected in an image, rather than a series of 1s and 0s.
The last two layers need to be replaced with a single regression layer.
In addition, for concordance with the original YOLO paper (see above), the last leaky ReLU transfer function needs to be replaced with a standard (non-leaky?) ReLU.
The following code downloads the network, modifies the layers, and saves the resulting modified network to the current folder in MATLAB.
[Code listing: Creating the YOLO network in MATLAB]
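As a rough sketch of the layer surgery described in this section, the snippet below assumes the network has already been downloaded and saved as `yolonet.mat` (a hypothetical file name), that the file holds a single pretrained series network, and that the Deep Learning Toolbox is installed. The names `yoloReg` and `regressionOutput` are placeholders, and the layers are located by position and class rather than by name, since the layer names in the download are not given here.

```matlab
% Sketch only: file name, variable handling and output names are assumptions.
S = load('yolonet.mat');
names = fieldnames(S);
net = S.(names{1});               % assume the first variable in the file is the network

layers = net.Layers(1:end-2);     % drop the trailing softmax and classification layers

% Locate the last leaky ReLU so it can be swapped for a standard ReLU.
lastLeaky = 0;
for k = 1:numel(layers)
    if isa(layers(k), 'nnet.cnn.layer.LeakyReLULayer')
        lastLeaky = k;
    end
end
assert(lastLeaky > 0, 'No leaky ReLU layer found in the network.');
layers(lastLeaky) = reluLayer('Name', layers(lastLeaky).Name);

% Append a regression output so that predict returns the raw output vector.
layers = [layers; regressionLayer('Name', 'regressionOutput')];

% assembleNetwork keeps the pretrained weights; no training takes place.
yoloReg = assembleNetwork(layers);
save('yoloReg.mat', 'yoloReg');
```

If the downloaded network turns out not to be a simple chain of layers, the same steps would need `layerGraph`, `removeLayers`, `replaceLayer` and `connectLayers` instead of plain array indexing.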

Run an image through the network and examine the output vector

To test my implementation of YOLO, I summoned the heights of my visual art abilities and took a snapshot that contained four objects that YOLO has been trained on: a chair, a dog, a potted plant, and a sofa.
Here is my test image:

[Figure: Still Life with Chair, Stella the Dog, Potted Plant, and Sofa.]
Because YOLO is a regression network, we will use the function predict rather than classify.
Image pixels need to be scaled to [0,1] and images need to be resized to 448×448 pixels.
If your image is grayscale (i.e., single channel), it needs to be expanded to 3 channels.
The following code pre-processes an image (you will need to supply your own image in the MATLAB current folder), applies the regression network to it, and plots the resulting 1×1470 output vector.
In this section of code, we also define a probability threshold for a cell containing an object (0.2 seems to work well) and an intersection over union threshold for non-max suppression, both of which we will use later.
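The pre-processing steps above can be sketched as follows. This is a minimal illustration, not the original listing: it uses a synthetic grayscale image so that it runs without any file on disk, `imresize` may require the Image Processing Toolbox depending on your release, `yoloReg` is a hypothetical name for the modified network, and the value of `iouThresh` is a placeholder because the article does not state the one it uses.

```matlab
% Synthetic grayscale stand-in for a photo; swap in imread('yourImage.jpg').
img = uint8(randi([0 255], 300, 400));

img = imresize(img, [448 448]);   % the network expects 448x448 input
if size(img, 3) == 1
    img = repmat(img, [1 1 3]);   % copy the single channel into R, G and B
end
img = single(img) / 255;          % scale pixel values to [0,1]

probThresh = 0.2;   % minimum confidence for a cell to count as containing an object
iouThresh  = 0.4;   % placeholder value (assumption) for non-max suppression

% With the modified network loaded (hypothetical variable name yoloReg):
% out = predict(yoloReg, img);    % 1x1470 output vector
% plot(out)
```

Resizing before converting to floating point keeps the values inside [0,1]; bicubic interpolation on floating-point data can overshoot slightly.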
[Code listing: Running a test image through the network]

It’s worth taking a look at the output vector of the YOLO algorithm.
You can reverse engineer the output by noting that indices 1–980 peak approximately every 20 positions, corresponding to the 20 classes of objects YOLO has been trained on; indices 981–1078 contain a few peaks corresponding to the probabilities of a handful of recognizable objects (4) in my test image; and indices 1079–1470 are fairly stochastic, which is what you would expect from the bounding box coordinates.

[Figure: The raw 1×1470 output of our modified YOLO network when shown an image of Stella the Dog.]
After running dozens of simple images through the network, for example an image containing a single tall person on one side of the image, I was able to decode the output of the YOLO algorithm as follows.
I only guarantee its accuracy for situations in which no lives, livelihoods, personal property, or anything of the remotest value is on the line…

[Figure: YOLO output key. Wish I could figure out a way of monetizing it.]
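The sizes of the three segments of the output vector follow from the grid arithmetic. The sketch below assumes the standard YOLO v1 configuration (a 7×7 grid, 2 boxes per cell, 20 classes), which is consistent with the 7×7×30 array used in this article; the variable names are illustrative.

```matlab
S = 7;    % grid cells per side
B = 2;    % boxes per cell (assumed, as in the original YOLO paper)
C = 20;   % object classes

nClass = S*S*C;     % 980 class probabilities
nConf  = S*S*B;     % 98 box confidences
nBox   = S*S*B*4;   % 392 box coordinates (x, y, w, h per box)

classIdx = 1:nClass;
confIdx  = nClass + (1:nConf);
boxIdx   = nClass + nConf + (1:nBox);

fprintf('classes %d-%d, confidences %d-%d, boxes %d-%d\n', ...
    classIdx(1), classIdx(end), confIdx(1), confIdx(end), boxIdx(1), boxIdx(end));
```

The three segments add up to 980 + 98 + 392 = 1470, the length of the output vector.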

Reshape the output vector and find cells containing objects

Now that we have a key to the output of the YOLO algorithm, we can begin the process of converting that 1×1470 output vector into bounding boxes overlaid onto our test image.
As a first step, I decided to convert the 1×1470 vector into a 7x7x30 array where the first two dimensions correspond to the 7×7 cells that YOLO divides the image into, and the last dimension contains the 30 numbers defining probabilities and bounding boxes for each cell.
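One possible way to do this conversion is with `reshape` and `permute`. The sketch assumes that each segment of the vector stores its cells row by row, with the values for one cell stored contiguously; that ordering is an assumption and should be checked against your own decoded output. The input here is a stand-in vector whose element k holds the value k, which makes the mapping easy to trace.

```matlab
out = 1:1470;   % stand-in for the network output
S = 7;

classProbs = out(1:980);       % 20 class probabilities per cell
boxConf    = out(981:1078);    % 2 box confidences per cell
boxCoords  = out(1079:1470);   % 2 boxes x 4 coordinates per cell

% reshape fills the first dimension fastest, so each segment is first laid out
% as [value, column, row] and then permuted to [row, column, value].
outArray = zeros(S, S, 30);
outArray(:,:,1:20)  = permute(reshape(classProbs, 20, S, S), [3 2 1]);
outArray(:,:,21:22) = permute(reshape(boxConf,     2, S, S), [3 2 1]);
outArray(:,:,23:30) = permute(reshape(boxCoords,   8, S, S), [3 2 1]);
```

With this layout, `squeeze(outArray(r,c,:))` returns all 30 numbers for the cell in row `r` and column `c`.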
After we do this, we can make a simple plot showing cells that are likely to contain objects.
Comparing the plot below to the original test image, we see that the algorithm appears to be detecting the chair, dog, and sofa, but seems to be missing the potted plant.
[Code listing: Reshaping the output vector into a 7x7x30 array]

[Figure: Yellow cells on the left contain an object in the image of Stella the Dog on the right.]
So far so (kind of) good.
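Finding the cells that are likely to contain an object comes down to a threshold on the per-cell confidences. The sketch below uses a synthetic 7×7×30 array and assumes that slots 21 and 22 of the third dimension hold the two box confidences for each cell; both the data and that slot assignment are illustrative.

```matlab
outArray = zeros(7, 7, 30);   % synthetic stand-in for the reshaped output
outArray(3,2,21) = 0.65;      % confident detection in row 3, column 2
outArray(5,5,22) = 0.30;      % weaker detection from the second box of a cell
outArray(1,7,21) = 0.05;      % below threshold, should be ignored

probThresh = 0.2;

% Take the larger of the two box confidences in each cell and threshold it.
cellConf  = max(outArray(:,:,21:22), [], 3);
hasObject = cellConf > probThresh;
[rows, cols] = find(hasObject);

% imagesc(hasObject); axis image   % uncomment for a quick look at the grid
```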

Plot Bounding Boxes and Class labels

Now that we have a nice 7x7x30 output array, we can plot bounding boxes and class labels over the original image.
We will now make use of the probThresh variable from earlier in the code.
You may need to tweak this value for optimal results, but I’ve found that 0.2 seems to work well for a variety of images.
I won’t say much about the following code except that it’s a bit tedious and makes my head hurt.
[Code listing: Displaying bounding boxes]

[Figure: Too many sofas, not enough potted plants.]
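For a single cell, the decoding step might look like the sketch below. It rests on several assumptions that should be checked against your own decoded output: that x and y are offsets within the cell, that the network emits the square roots of the width and height as fractions of the image size (the convention from the original YOLO implementation), and that the classes follow the usual alphabetical Pascal VOC order. All input values are made up.

```matlab
S = 7;            % grid size
imgSize = 448;    % network input size in pixels

row = 4; col = 2;             % 1-based grid position of the cell
box = [0.5 0.25 0.6 0.5];     % hypothetical [x y sqrt(w) sqrt(h)] for one box

% Box centre: cell offset plus position inside the cell, scaled to pixels.
xc = (col - 1 + box(1)) / S * imgSize;
yc = (row - 1 + box(2)) / S * imgSize;

% Width and height: square the outputs, then scale to pixels.
w = box(3)^2 * imgSize;
h = box(4)^2 * imgSize;

rect = [xc - w/2, yc - h/2, w, h];   % [left top width height]

% Class label: index of the largest class probability for the cell.
classNames = {'aeroplane','bicycle','bird','boat','bottle','bus','car','cat', ...
    'chair','cow','diningtable','dog','horse','motorbike','person', ...
    'pottedplant','sheep','sofa','train','tvmonitor'};   % assumed class order
classProbs = zeros(1, 20);
classProbs(12) = 0.9;                % synthetic probabilities
[~, idx] = max(classProbs);
label = classNames{idx};

% To draw on top of an image displayed with imshow:
% rectangle('Position', rect, 'EdgeColor', 'y');
% text(rect(1), rect(2), label, 'Color', 'y');
```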

Non-max suppression

As you can see above, there are two immediate problems with our test image: 1) The sofa is marked twice, and 2) the potted plant is not marked.
The chair and Stella the Dog are classified perfectly.
We will correct the first problem (too many sofas) using non-max suppression and the second problem we will ignore.
In reality, the object in the upper left isn’t technically a potted plant; it is a few evergreen boughs in a vase of water, which I call a Manhattan Christmas Tree.
I’m not one to argue semantics with an algorithm.
Non-max suppression will remove a bounding box A if:
1) Its intersection over union (IOU) with a bounding box B is above a defined threshold
2) Bounding boxes A and B contain the same class of object
3) Bounding box A has a lower probability of containing an object than box B

Here is the code:

Voila! Viola! (is something I imagine luthiers say a lot)

And that’s it! Real-time unified object detection in MATLAB.
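The three suppression conditions can be sketched in a few lines. This is an illustrative version on made-up boxes in [left top width height] format, not the original listing; the class indices and the value of `iouThresh` are placeholders.

```matlab
% Intersection over union for two boxes given as [left top width height].
interW = @(a, b) max(0, min(a(1)+a(3), b(1)+b(3)) - max(a(1), b(1)));
interH = @(a, b) max(0, min(a(2)+a(4), b(2)+b(4)) - max(a(2), b(2)));
inter  = @(a, b) interW(a, b) * interH(a, b);
iou    = @(a, b) inter(a, b) / (a(3)*a(4) + b(3)*b(4) - inter(a, b));

boxes  = [ 50  60 200 150;    % sofa, strong detection
           60  70 200 150;    % sofa, weaker duplicate
          300 200  80 120];   % dog
scores = [0.80; 0.55; 0.70];  % probability that each box contains an object
labels = [18; 18; 12];        % class index per box (illustrative)
iouThresh = 0.5;              % placeholder threshold

keep = true(size(scores));
for a = 1:numel(scores)
    for b = 1:numel(scores)
        % Remove A if B has the same class, a higher score and a large overlap.
        if a ~= b && labels(a) == labels(b) && scores(a) < scores(b) ...
                && iou(boxes(a,:), boxes(b,:)) > iouThresh
            keep(a) = false;
        end
    end
end

keptBoxes = boxes(keep, :);   % keep is [true; false; true]
```

The two sofa boxes overlap with an IOU of about 0.80, so the lower-scoring one is removed, while the dog box is untouched because it belongs to a different class.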

Does it work on sheep? Of course!
Does it work on people and dogs and sheep? Nope…

See here for a brief discussion of the fascinating, timely, and ever evolving YOLO/AI sheep conundrum: "YOLO is Sheep Obsessed: Environmental Context in Unified Object Detection" (towardsdatascience).