9 Applications of Deep Learning for Computer Vision

Image classification involves assigning a label to an entire image or photograph.

This problem is also referred to as “object classification” and perhaps more generally as “image recognition,” although the latter term may apply to a much broader set of tasks related to classifying the content of images.
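
As a rough illustration of what "assigning a label to an entire image" means in practice, the sketch below assumes a model has already produced one raw score per class for a single image. The class names, the scores, and the use of a softmax are assumptions made for this example, not something taken from a specific model.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score keeps exp() numerically stable.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the single most likely label for the image and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical scores for one image over three made-up classes.
class_names = ["cat", "dog", "bird"]
scores = [1.2, 3.4, 0.3]

label, confidence = classify(scores, class_names)
print(label, round(confidence, 3))
```

The whole image receives exactly one label, which is what separates this task from the localization and detection tasks that follow.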

Some examples of image classification include:

A popular example of image classification used as a benchmark problem is the MNIST dataset.

Example of Handwritten Digits From the MNIST Dataset

A popular real-world version of classifying photos of digits is the Street View House Numbers (SVHN) dataset.

For state-of-the-art results and relevant papers on these and other image classification tasks, see:

There are many image classification tasks that involve photographs of objects.

Two popular examples include the CIFAR-10 and CIFAR-100 datasets that have photographs to be classified into 10 and 100 classes respectively.

Example of Photographs of Objects From the CIFAR-10 Dataset

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is an annual competition in which teams compete for the best performance on a range of computer vision tasks on data drawn from the ImageNet database.

Many important advancements in image classification have come from papers published on or about tasks from this challenge, most notably early papers on the image classification task.

For example:

Image classification with localization involves assigning a class label to an image and showing the location of the object in the image by a bounding box (drawing a box around the object).

This is a more challenging version of image classification.
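
To make the output of this task concrete, here is a minimal sketch that assumes a prediction is stored as a class label plus the pixel coordinates of a box. The `LocalizedPrediction` class, the `draw_box` helper, and the coordinates are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class LocalizedPrediction:
    """A class label plus a bounding box; corner coordinates are inclusive."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def draw_box(height, width, pred):
    """Return text rows with '#' on the box outline and '.' everywhere else."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for x in range(pred.x_min, pred.x_max + 1):
        grid[pred.y_min][x] = "#"  # top edge
        grid[pred.y_max][x] = "#"  # bottom edge
    for y in range(pred.y_min, pred.y_max + 1):
        grid[y][pred.x_min] = "#"  # left edge
        grid[y][pred.x_max] = "#"  # right edge
    return ["".join(row) for row in grid]


# Hypothetical prediction for a 6x9 image.
pred = LocalizedPrediction(label="dog", x_min=2, y_min=1, x_max=6, y_max=4)
rows = draw_box(6, 9, pred)
print(pred.label)
print("\n".join(rows))
```

Compared with plain classification, the model now has to produce four extra numbers per image, which is one reason the task is harder.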

Some examples of image classification with localization include:

Classical datasets for image classification with localization are the PASCAL Visual Object Classes datasets, or PASCAL VOC for short (e.g. VOC 2012).

These are datasets used in computer vision challenges over many years.

Example of Image Classification With Localization of a Dog From VOC 2012

The task may involve adding bounding boxes around multiple examples of the same object in the image.

As such, this task may sometimes be referred to as “object detection.”

Example of Image Classification With Localization of Multiple Chairs From VOC 2012

The ILSVRC2016 Dataset for image classification with localization is a popular dataset comprising 150,000 photographs with 1,000 categories of objects.

Some examples of papers on image classification with localization include:

Object detection is the task of image classification with localization, although an image may contain multiple objects that require localization and classification.

This is a more challenging task than simple image classification or image classification with localization, as often there are multiple objects in the image of different types.

Often, techniques developed for image classification with localization are used and demonstrated for object detection.

Some examples of object detection include:

The PASCAL Visual Object Classes datasets, or PASCAL VOC for short (e.g. VOC 2012), are commonly used for object detection.

Another dataset for multiple computer vision tasks is Microsoft’s Common Objects in Context Dataset, often referred to as MS COCO.

Example of Object Detection With Faster R-CNN on the MS COCO Dataset

Some examples of papers on object detection include:

Object segmentation, or semantic segmentation, is the task of object detection where a line is drawn around each object detected in the image.

Image segmentation is a more general problem of splitting an image into segments.

Object detection is also sometimes referred to as object segmentation.

Unlike object detection, which uses a bounding box to identify objects, object segmentation identifies the specific pixels in the image that belong to the object.

It is like a fine-grained localization.
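
The difference between a bounding box and a pixel-level mask can be sketched with a toy example. The code below assumes the segmentation output for one object is a binary mask stored as nested lists (1 for object pixels, 0 for background). The `mask_to_box` helper is hypothetical, and real pipelines would typically use array libraries rather than plain lists.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) enclosing all 1-pixels, or None if empty."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


# A toy 5x6 mask: 1 marks pixels that belong to the object.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

box = mask_to_box(mask)
x_min, y_min, x_max, y_max = box
object_pixels = sum(sum(row) for row in mask)
box_pixels = (x_max - x_min + 1) * (y_max - y_min + 1)
print(box, object_pixels, box_pixels)
```

The box can always be recovered from the mask, but not the other way around: here the box covers more pixels than the object does, and the mask records exactly which ones belong to it.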

More generally, “image segmentation” might refer to segmenting all pixels in an image into different categories of object.

Again, the VOC 2012 and MS COCO datasets can be used for object segmentation.

Example of Object Segmentation on the COCO Dataset. Taken from “Mask R-CNN”.

The KITTI Vision Benchmark Suite is another popular object segmentation dataset, providing images of streets intended for training models for autonomous vehicles.

Some example papers on object segmentation include:

Style transfer or neural style transfer is the task of learning style from one or more images and applying that style to a new image.

This task can be thought of as a type of photo filter or transform that may not have an objective evaluation.

Examples include applying the style of specific famous artworks (e.g. by Pablo Picasso or Vincent van Gogh) to new photographs.

Datasets often involve using famous artworks that are in the public domain and photographs from standard computer vision datasets.

Example of Neural Style Transfer From Famous Artworks to a Photograph. Taken from “A Neural Algorithm of Artistic Style”.

Some papers include:

Image colorization or neural colorization involves converting a grayscale image to a full color image.

This task can be thought of as a type of photo filter or transform that may not have an objective evaluation.

Examples include colorizing old black and white photographs and movies.

Datasets often involve using existing photo datasets and creating grayscale versions of photos that models must learn to colorize.
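
A minimal sketch of how such training pairs might be built is shown below. It assumes images are nested lists of (r, g, b) tuples with values from 0 to 255 and uses the common ITU-R BT.601 luminance weights for the grayscale conversion. The helper names are hypothetical, and the papers in this area may prepare their data differently.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels into rows of single luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def make_colorization_pairs(color_images):
    """Pair each grayscale version (model input) with its color original (target)."""
    return [(to_grayscale(image), image) for image in color_images]


# A tiny 2x2 "photo": red, green, blue, and white pixels.
color_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

pairs = make_colorization_pairs([color_image])
gray_input, color_target = pairs[0]
print(gray_input)
```

Because the color original is kept as the target, no manual labeling is needed: any existing photo collection can be turned into training data this way.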

Examples of Photo Colorization. Taken from “Colorful Image Colorization”.

Some papers include:

Image reconstruction and image inpainting is the task of filling in missing or corrupt parts of an image.

This task can be thought of as a type of photo filter or transform that may not have an objective evaluation.

Examples include reconstructing old, damaged black and white photographs and movies (e.g. photo restoration).

Datasets often involve using existing photo datasets and creating corrupted versions of photos that models must learn to repair.
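
One simple way to create a corrupted version is to blank out a region and keep a mask of what was removed. The sketch below is a deliberately simplified assumption: it cuts a single rectangular hole in a grayscale image stored as nested lists, whereas published methods may use irregular holes or other kinds of damage. The `corrupt` helper and its parameters are hypothetical.

```python
import random


def corrupt(image, hole_height, hole_width, rng, fill=0):
    """Return (corrupted_copy, mask); mask is 1 where pixels were removed."""
    height, width = len(image), len(image[0])
    # Pick the top-left corner so the hole fits entirely inside the image.
    top = rng.randrange(height - hole_height + 1)
    left = rng.randrange(width - hole_width + 1)
    corrupted = [row[:] for row in image]  # copy so the original stays intact
    mask = [[0] * width for _ in range(height)]
    for y in range(top, top + hole_height):
        for x in range(left, left + hole_width):
            corrupted[y][x] = fill
            mask[y][x] = 1
    return corrupted, mask


rng = random.Random(0)  # fixed seed so the example is repeatable
original = [[10 * y + x + 1 for x in range(6)] for y in range(5)]
corrupted, mask = corrupt(original, hole_height=2, hole_width=3, rng=rng)

for row in corrupted:
    print(row)
```

A model would then be trained to map `corrupted` (optionally together with `mask`) back to `original`.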

Example of Photo Inpainting. Taken from “Image Inpainting for Irregular Holes Using Partial Convolutions”.

Some papers include:

Image super-resolution is the task of generating a new version of an image with a higher resolution and detail than the original image.

Often models developed for image super-resolution can be used for image restoration and inpainting as they solve related problems.

Datasets often involve using existing photo datasets and creating down-scaled versions of photos for which models must learn to create super-resolution versions.
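
The down-scaling step can be sketched as follows. This example assumes a grayscale image stored as nested lists and shrinks it by averaging non-overlapping blocks of pixels. Block averaging is only one possible choice (bicubic resizing is another), and the `downscale` helper is hypothetical.

```python
def downscale(image, factor):
    """Shrink a grayscale image by averaging each factor x factor block of pixels."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    small = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


# A toy 4x4 "high-resolution" image.
high_res = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 150, 200, 200],
    [50, 150, 200, 200],
]

low_res = downscale(high_res, 2)
training_pair = (low_res, high_res)  # (model input, target)
print(low_res)
```

Note how the bottom-left block loses its detail (50 and 150 both become 100). That lost detail is what a super-resolution model has to learn to reconstruct.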

Example of the Results From Different Super-Resolution Techniques. Taken from “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”.

Some papers include:

Image synthesis is the task of generating targeted modifications of existing images or entirely new images.

This is a very broad area that is rapidly advancing.

It may include small modifications of images and video (e.g. image-to-image translations), such as:

Example of Styling Zebras and Horses. Taken from “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”.

It may also include generating entirely new images, such as:

Example of Generated Bathrooms. Taken from “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”.

Some papers include:

There are other important and interesting problems that I did not cover because they are not purely computer vision tasks.

Notable examples are image to text and text to image:

Presumably, one could also learn to map between images and other modalities, such as audio.

In this post, you discovered nine applications of deep learning to computer vision tasks.

Was your favorite example of deep learning for computer vision missed? Let me know in the comments.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.
