Autoencoders: Neural Networks for Unsupervised Learning

Recall that the decoder’s label now serves as the label of this large neural network, and that the decoder’s label was our original input data.

Therefore, the label for our large neural network is exactly the same as the original input data to this large neural network!

For us to apply our neural networks and whatever we’ve learnt in Part 1a, we need to have a loss function that tells us how we are doing.

We then find the best parameters that minimize the loss function.

This much has not changed.

In other tasks, the loss function comes from how far away our output neuron is from the ground truth value.

In this task, the loss function comes from how far away our output neurons are from our input neurons!

Given that the task is to encode and reconstruct, this makes intuitive sense.

If the output neurons match the original data points perfectly, this means that we have successfully reconstructed the input.

Since the neural network has a bottleneck layer, this must mean that the smaller set of features in the encoding contains all the information it needs to recover the input, which means we have a perfect encoder.

This is the gold standard.

Now, the auto-encoder may not be perfect, but the closer we can get to this gold standard, the better.
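
To make this concrete, here is a minimal sketch of that reconstruction loss, assuming PyTorch and a mean-squared-error criterion (the post does not commit to a specific framework or loss, so treat both, along with the tensor shapes, as illustrative choices):

```python
import torch
import torch.nn.functional as F

# Suppose x is a batch of original inputs and x_hat is what the
# auto-encoder outputs after encoding and decoding that batch.
x = torch.randn(32, 784)      # stand-in for 32 flattened 28x28 data points
x_hat = torch.rand(32, 784)   # stand-in for the network's reconstruction

# The 'label' is the input itself: the loss measures how far the
# reconstruction is from the original data.
reconstruction_loss = F.mse_loss(x_hat, x)
print(reconstruction_loss.item())
```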

In essence, training an auto-encoder means:

- Training a neural network with a ‘bottleneck layer’ within our neural network.
- The bottleneck layer has fewer features than the input layer.
- Everything to the left of the bottleneck layer is the encoder; everything to the right is the decoder.
- The label that we compare our output against is the input to the neural network.
- Since we now have a label, we can apply the standard neural network training that we’ve learnt in Part 1a and Part 1b as though this were a Supervised Learning task (see the sketch after this list).
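
Putting those steps together, a minimal sketch of a vanilla auto-encoder and its training loop might look like the following. This again assumes PyTorch; the layer sizes, optimizer, and stand-in data loader are illustrative placeholders, not part of the original post:

```python
import torch
from torch import nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, bottleneck_dim=32):
        super().__init__()
        # Encoder: everything to the left of the bottleneck layer.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, bottleneck_dim),  # bottleneck: fewer features than the input
        )
        # Decoder: everything to the right of the bottleneck layer.
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# Stand-in for a real loader that yields batches of unlabelled inputs.
data_loader = [torch.randn(64, 784) for _ in range(10)]

for epoch in range(5):
    for x in data_loader:
        x_hat = model(x)
        loss = criterion(x_hat, x)  # the 'label' is the input itself
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```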

And there we have it, our auto-encoder!

Summary: An auto-encoder uses a neural network for dimensionality reduction.

This neural network has a bottleneck layer, which corresponds to the compressed vector.

When we train this neural network, the ‘label’ of our output is our original input.

Thus, the loss function we minimize corresponds to how poorly the original data is reconstructed from the compressed vector.

Consolidated Summary: Unsupervised Learning deals with data without labels.

An example of Unsupervised Learning is dimensionality reduction, where we condense the data into fewer features while retaining as much information as possible.

An auto-encoder uses a neural network for dimensionality reduction.

This neural network has a bottleneck layer, which corresponds to the compressed vector.

When we train this neural network, the ‘label’ of our output is our original input.

Thus, the loss function we minimize corresponds to how poorly the original data is reconstructed from the compressed vector.

What’s Next: We’ve gone through a brief overview of the vanilla auto-encoder, which is useful for dimensionality reduction, i.e. encoding data into a more compressed representation with fewer features.

This is useful for applications such as visualizing on fewer axes or reducing the feature set size before passing it through another neural network.
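
As a rough illustration of those two use cases, once an auto-encoder like the sketch above is trained, only the encoder half is needed to compress the data. The two-feature bottleneck and the matplotlib plotting below are illustrative assumptions:

```python
import torch
import matplotlib.pyplot as plt

# Re-use the AutoEncoder sketch from above, but with a 2-feature
# bottleneck so every data point can be plotted on two axes.
model = AutoEncoder(input_dim=784, bottleneck_dim=2)
# ... train exactly as before ...

with torch.no_grad():
    data = torch.randn(100, 784)   # stand-in for real, unlabelled data
    codes = model.encoder(data)    # only the encoder half is needed here

plt.scatter(codes[:, 0], codes[:, 1])
plt.xlabel("encoded feature 1")
plt.ylabel("encoded feature 2")
plt.show()

# The same compressed 'codes' could instead be fed to another neural
# network as a much smaller feature set.
```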

In a future post, we will go through a popular and more advanced variant of the auto-encoder, called variational autoencoders (VAEs).

VAEs are used for applications such as image generation.

A simple VAE, for example, is able to generate the faces of fictional celebrities like this:

Plain VAE trained on a dataset of celebrities to synthesize photos of fictional celebrities. Image taken from https://github.com/yzwxx/vae-celebA

You’ll need the concepts in this post as a pre-requisite, so don’t forget what you’ve learnt here today! Till next time!
