A Guide to Understanding Convolutional Neural Networks (CNNs) using Visualization

That’s right – only those parts of the input image that had a significant contribution to its output class probability are visible.

That, in a nutshell, is what occlusion maps are all about.
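The idea can be sketched in a few lines. This is a minimal occlusion-map sketch, not the article's original Gist: a randomly initialized VGG16 and a random array stand in for a trained model and a real photo (in practice you would pass `weights='imagenet'` and a preprocessed image), and the patch size and stride are illustrative choices.

```python
import numpy as np
import tensorflow as tf

# Stand-ins: random weights and a random "image" (use weights='imagenet'
# and a real preprocessed photo in practice).
model = tf.keras.applications.VGG16(weights=None)
image = np.random.rand(224, 224, 3).astype("float32")

def occlusion_map(model, image, target_class, patch=56, stride=56):
    """Slide a grey patch over the image and record the target class
    probability at each position; low values mark important regions."""
    h, w, _ = image.shape
    heatmap = np.zeros((h // stride, w // stride), dtype="float32")
    for i, y in enumerate(range(0, h, stride)):
        for j, x in enumerate(range(0, w, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch, :] = 0.5  # grey patch
            probs = model(occluded[None, ...])[0].numpy()
            heatmap[i, j] = probs[target_class]
    return heatmap

heatmap = occlusion_map(model, image, target_class=0)
print(heatmap.shape)  # (4, 4) for a 224x224 input with stride 56
```

Positions where the probability drops sharply are exactly the regions the class depends on.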

Visualizing the Contribution of Input Features – Saliency Maps

Saliency maps are another visualization technique based on gradients.

These maps were introduced in the paper – Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.

Saliency maps calculate the effect of every pixel on the output of the model.

This involves calculating the gradient of the output with respect to every pixel of the input image.

This tells us how the output category changes with respect to small changes in the input image pixels.

Positive gradient values mean that a small increase in that pixel's value will increase the output score. These gradients, which have the same shape as the image (the gradient is calculated with respect to every pixel), give us an intuition of where the model's attention lies.

Let’s see how to generate saliency maps for any image.

First, we will read the input image using the below code segment.

Now, we will generate the saliency map for the image using the VGG16 model:
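Since the original Gist is not reproduced here, the following is a minimal saliency-map sketch: the gradient of the top class score with respect to every input pixel, computed with `tf.GradientTape`. A random array stands in for the dog photo, and the model uses random weights (`weights='imagenet'` in practice).

```python
import numpy as np
import tensorflow as tf

# Stand-ins: random weights and a random array in place of the dog photo.
model = tf.keras.applications.VGG16(weights=None)
image = tf.convert_to_tensor(np.random.rand(1, 224, 224, 3).astype("float32"))

with tf.GradientTape() as tape:
    tape.watch(image)            # the input is not a variable, so watch it
    preds = model(image)
    top_class = tf.argmax(preds[0])
    score = preds[0, top_class]  # score of the predicted class

# Gradient of the class score with respect to every input pixel.
grads = tape.gradient(score, image)
# Collapse the colour channels: the map has the image's spatial shape.
saliency = tf.reduce_max(tf.abs(grads), axis=-1)[0]
print(saliency.shape)  # (224, 224)
```

Plotting `saliency` as a heatmap over the input shows which pixels the prediction is most sensitive to.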

We see that the model focuses more on the facial part of the dog.

Now, let's look at the results with guided backpropagation:

Guided backpropagation truncates all the negative gradients to 0, which means that only the pixels that have a positive influence on the class probability are highlighted.
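One way to sketch this in TensorFlow (again, not the article's original Gist) is to replace ReLU's gradient with a "guided" version that also zeroes negative upstream gradients. Swapping the `activation` attribute on built Keras layers, as done below, is an implementation shortcut that works for the standard VGG16 layers but is not an official API pattern.

```python
import numpy as np
import tensorflow as tf

@tf.custom_gradient
def guided_relu(x):
    def grad(dy):
        # Pass a gradient only where both the upstream gradient and the
        # pre-activation value are positive (the "guided" rule).
        return tf.cast(dy > 0, dy.dtype) * tf.cast(x > 0, dy.dtype) * dy
    return tf.nn.relu(x), grad

# Random weights as a stand-in; use weights='imagenet' in practice.
model = tf.keras.applications.VGG16(weights=None)
# Swap every ReLU activation for the guided version.
for layer in model.layers:
    if getattr(layer, "activation", None) is tf.keras.activations.relu:
        layer.activation = guided_relu

image = tf.convert_to_tensor(np.random.rand(1, 224, 224, 3).astype("float32"))
with tf.GradientTape() as tape:
    tape.watch(image)
    score = model(image)[0, 0]  # score for an arbitrary class
grads = tape.gradient(score, image)
print(grads.shape)  # (1, 224, 224, 3)
```

The resulting gradient image is sparser and visually cleaner than the vanilla saliency map.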

Class Activation Maps (Gradient Weighted)

Class activation maps are also a neural network visualization technique, based on the idea of weighing the activation maps according to their gradients, i.e. their contribution to the output.

The following excerpt from the Grad-CAM paper gives the gist of the technique:

"Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say logits for 'dog' or even a caption), flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept."

In essence, we take the feature map of the final convolutional layer and weigh (multiply) every filter with the gradient of the output with respect to the feature map.

Grad-CAM involves the following steps:

- Take the output feature map of the final convolutional layer (this feature map is 14x14x512 for VGG16)
- Calculate the gradient of the output with respect to the feature map
- Apply Global Average Pooling to the gradients
- Multiply each filter of the feature map with its corresponding pooled gradient

We can see the input image and its corresponding Class Activation Map below.

Now, let's generate the Class Activation Map for the above image:
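The steps above can be sketched as follows (a stand-in for the original Gist). It assumes VGG16's final convolutional layer is named `block5_conv3`, which produces the 14x14x512 feature map mentioned earlier; random weights and a random array replace a trained model and a real photo.

```python
import numpy as np
import tensorflow as tf

# Random weights as a stand-in; use weights='imagenet' in practice.
model = tf.keras.applications.VGG16(weights=None)
# Model that exposes both the final conv feature map and the prediction.
grad_model = tf.keras.Model(
    model.input,
    [model.get_layer("block5_conv3").output, model.output],
)

image = tf.convert_to_tensor(np.random.rand(1, 224, 224, 3).astype("float32"))
with tf.GradientTape() as tape:
    conv_out, preds = grad_model(image)       # conv_out: (1, 14, 14, 512)
    score = preds[0, tf.argmax(preds[0])]     # top class score

# Step 2: gradient of the output w.r.t. the feature map.
grads = tape.gradient(score, conv_out)
# Step 3: Global Average Pooling over space -> one weight per filter.
weights = tf.reduce_mean(grads, axis=(1, 2))  # (1, 512)
# Step 4: weigh each filter's map, sum, and keep positive evidence.
cam = tf.nn.relu(tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1))[0]
print(cam.shape)  # (14, 14)
```

The coarse 14x14 map is then upsampled to the input size and overlaid on the image as a heatmap.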

Visualizing the Process – Layerwise Output Visualization

The starting layers of a CNN generally look for low-level features like edges.

The features change as we go deeper into the model.

Visualizing the output at different layers of the model helps us see what features of the image are highlighted at the respective layer.

This step is particularly important to fine-tune an architecture for our problems.

Why? Because we can see which layers give what kind of features, and then decide which layers we want to use in our model.

For example, visualizing layer outputs can help us compare the performance of different layers in the neural style transfer problem.

Let's see how we can get the output at different layers of a VGG16 model:
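A minimal sketch (in place of the original Gist): build a Keras model that returns the output of every convolutional layer, then run one image through it. A random array stands in for the car photo, and random weights replace the pretrained ones.

```python
import numpy as np
import tensorflow as tf

# Random weights as a stand-in; use weights='imagenet' in practice.
model = tf.keras.applications.VGG16(weights=None)
# Collect the 13 convolutional layers of VGG16 by name.
conv_layers = [l for l in model.layers if "conv" in l.name]
# A model whose outputs are all the intermediate conv activations.
activation_model = tf.keras.Model(model.input, [l.output for l in conv_layers])

image = np.random.rand(1, 224, 224, 3).astype("float32")
activations = activation_model(image)

for layer, act in zip(conv_layers, activations):
    print(layer.name, act.shape)
# block1_conv1 (1, 224, 224, 64) ... block5_conv3 (1, 14, 14, 512)
```

Plotting a few channels of each activation as grayscale images reproduces the layerwise visualization described here.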

The above image shows the different features that are extracted from the image by every layer of VGG16 (except block 5).

We can see that the starting layers correspond to low-level features like edges, whereas the later layers look at features of the car like the roof, exhaust, etc.

End Notes

Visualization never ceases to amaze me.

There are multiple ways to understand how a technique works, but visualizing it makes it a whole lot more fun.

Here are a couple of resources you should check out:

The process of feature extraction in neural networks is an active research area and has led to the development of awesome tools like TensorSpace and Activation Atlases. TensorSpace is a neural network visualization tool that supports multiple model formats. It lets you load your model and visualize it interactively. TensorSpace also has a playground where multiple architectures are available for visualization, which you can play around with.

Let me know if you have any questions or feedback on this article.

I'll be happy to get into a discussion!