Neural Networks: The theoretical understanding
Abhishek Shukla, Dec 28, 2018

This article is for students and professionals who want to learn neural networks and are still looking for a good place to start.
If you want to learn neural networks but you feel it’s rocket science, this article is for you.

Prerequisites
This article focuses on the theory side of neural networks; we won't be talking much about the mathematics behind them.
Basic knowledge of computer science will make it easy to understand.

What is an Artificial Neural Network (or just neural network)
First, a little bit about the human brain. Our brain has billions of neurons, which store information and give us the capacity to think, reason, and process a huge amount of information every day without much conscious effort.
These neurons are connected to each other; they process information and pass it on to one another.
This complex network of biological neurons makes us the (hopefully) smartest species on this planet.
[Figure: Biological neurons]
In computer science, an artificial neural network is a system (mostly software) that tries to mimic the way the human brain thinks.
A neural network is built by connecting many small units known as artificial neurons, or just neurons (nodes).
Each neuron in the network holds weight and bias values: each input value is multiplied by a weight, and the bias is added to the result.
We pass our data through the network many times. Each full pass over the data is called an epoch (the word iteration is also used, usually for a single update step), and with each epoch the weights and biases of the neurons are updated to fit the input data better.
These weight and bias updates give the network its capacity to learn.
If you don't yet understand how the weight and bias values are updated and how the network learns, don't worry. Keep reading; we will explain these things later in this article.

Why we need neural networks
Normally in computer science, we have data with variables and we know how these variables relate to each other. We write a program that implements this relationship, and our system is ready.
Each time we pass input variables to the program, it does the computation according to the implemented relationship and always gives the exact answer.
This computation may be easy or hard, depending on the nature of the problem, but the relationship between the variables will always be well defined.
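As a small illustration of a program built on a known relationship, here is a temperature conversion. The example and the function name `celsius_to_fahrenheit` are ours, chosen only because the formula is familiar; they are not from the original article.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Known relationship, written directly as code: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


# The same input always produces the same exact answer.
print(celsius_to_fahrenheit(100))
```

Nothing is learned here: the relationship was known in advance, so we simply wrote it down.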
Now imagine a case where we have a large amount of data with many variables, but we don't know the relationship between these variables.
We cannot write a program to get information from this data if the relationship between the variables is unknown.
This is where neural networks come in.
To solve such problems, we build a neural network and give it our data as input; the network then tries to find the best approximate relationship between the variables in the input data.
So neural networks are very powerful tools when the relationship between the variables in the data is unknown.

The architecture of a simple neural network
In this section, we describe the basic architecture of a neural network and its different components.
Here is a picture of a simple neural network:
[Figure: Neural Network Architecture]
A neural network is a stack of connected layers, and each layer is built from a group of neurons.
The first layer of the network is called the input layer, the last layer is called the output layer, and all intermediate layers are called hidden layers.
Each layer takes input from the previous layer, processes it, and forwards the processed data to the next layer, where it becomes that layer's input.
Now that we are familiar with the architecture of a neural network, we will look at each of its components.

Layers
As the architecture above shows, a neural network is built from multiple layers.
A layer is a logical group of neurons at the same level.
The neural network in the image above has three kinds of layers: an input layer, hidden layers, and an output layer.
Depending on the size and type of the data, we change the number of hidden layers in the network.
A complex network may have hundreds of hidden layers.
In simple neural networks, the neurons of the i'th layer receive input data from the (i-1)'th layer and, after processing it, forward the data to the neurons of the (i+1)'th layer.
In a neural network, data flows layer by layer: only once the data has been processed by all neurons of the i'th layer is it forwarded to the (i+1)'th layer.
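The layer-by-layer flow can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: each neuron computes a weighted sum plus a bias and applies a sigmoid activation, and the names `layer_forward` and `network_forward`, the layer sizes, and all the numbers are made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Activation function (assumed here): squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of every neuron in one layer.

    weights[j] is the list of weights of neuron j (one weight per input),
    biases[j] is the bias of neuron j.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


def network_forward(inputs, layers):
    """Pass data through the layers in order.

    Layer i+1 only starts once layer i has produced all of its outputs.
    """
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.5, 0.2]], [0.05])
prediction = network_forward([1.0, 2.0], [hidden, output])
print(prediction)
```

Representing each layer as a `(weights, biases)` pair keeps the loop in `network_forward` identical for every layer, which mirrors the idea that every layer does the same kind of work on the previous layer's output.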

Neuron (node)
Neurons are the building blocks of the neural network.
A neuron is a small processing unit with three major components: a) weights, b) bias, and c) activation function.
Here is a zoomed-in view of a neuron of a simple neural network:
[Figure: a neuron of a simple neural network]
The basic operation of a neuron can be defined by the following steps:
1. Receive the input.
2. Multiply the input by the weight and add the bias value to this product.
3. Apply the activation function to the value computed above.
4. Pass the activation function's output to the next neuron as an input.
While the neural network is being trained, the weight and bias values of the neurons keep updating to find the best set of values for each neuron in the network.
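The steps of a single neuron map almost one-to-one onto code. The sketch below assumes a sigmoid activation by default (the article does not name a specific activation function), and the function name `neuron` and the sample numbers are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    # Steps 1-2: receive the inputs, multiply each by its weight, add the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step 3: apply the activation function.
    # Step 4: the returned value is what the next neuron receives as input.
    return activation(z)


out = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(out)  # weighted sum is 0.0, and sigmoid(0.0) is 0.5
```

Passing the activation as a parameter makes it easy to try a different one, for example `neuron([1.0], [1.0], 0.0, activation=lambda z: max(0.0, z))`.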

How a neural network learns
Once our training data is ready, we pass it to the input layer of the neural network.
The neurons receive this data and apply their initialized weights and biases to it; they then apply the activation function and pass the result to the neurons of the next layer.
This process repeats at each layer of the network, and the data keeps moving toward the output layer.
Once the data reaches the output layer, the loss function computes the difference between the expected output and the computed output, which is called the training loss of the network.
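To make the idea of a loss concrete, here is one possible loss function, mean squared error. It is only an example of how the "difference between expected and computed output" can be turned into a single number; the article does not say which loss function is used, and the sample values are made up.

```python
def mean_squared_error(expected, computed):
    """Average of the squared differences between expected and computed outputs."""
    if len(expected) != len(computed):
        raise ValueError("expected and computed must have the same length")
    return sum((e - c) ** 2 for e, c in zip(expected, computed)) / len(expected)


# Two of the three predictions are off by 0.5, one is exactly right.
loss = mean_squared_error(expected=[1.0, 0.0, 1.0], computed=[0.5, 0.5, 1.0])
print(loss)
```

A loss of zero means the computed outputs match the expected outputs exactly; the larger the loss, the worse the fit.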
Once we have the training loss, the optimizer kicks in. An optimizer is a function in the neural network that tries to reduce the loss of the network by adjusting the weight and bias values of the neurons.
After the loss is computed, the optimizer starts from the end of the network: it tries to change the weights of the neurons nearest the output (the output layer and the hidden layer just before it) so that the loss becomes smaller.
Once the weights and bias of each neuron in this layer are updated, it moves to the previous layer and tries to do the same.
Since this process starts from the last layer and works its way back to the first layer, it is known as backpropagation.
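In symbols, one common form of this update looks like the following. It assumes the optimizer is plain gradient descent, which the article does not specify; $L$ (the training loss), $\eta$ (a small positive learning rate), and $w$ and $b$ (one neuron's weight and bias) are names introduced here only for illustration.

```latex
w \leftarrow w - \eta \, \frac{\partial L}{\partial w},
\qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
```

The derivatives for an earlier layer are computed from those of the layer after it using the chain rule, which is why the computation runs from the output back toward the input.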
Once all the training data has passed through the network and backpropagation is done, this whole round is called an epoch.
To get a good network, we pass the data through the network again and again until backpropagation finds the most suitable weight and bias values for each neuron. This whole process is called training the network.
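Here is a minimal sketch of such a training loop for the simplest possible "network": a single neuron with one weight, one bias, and no activation function. It assumes gradient descent on mean squared error, which is our choice and not something the article specifies; the function name `train`, the toy data, and the default `epochs` and `learning_rate` values are all made up for illustration.

```python
def train(data, epochs=500, learning_rate=0.05):
    """Fit y = w * x + b to (x, y) pairs with plain gradient descent."""
    w, b = 0.0, 0.0  # initial weight and bias
    losses = []
    n = len(data)
    for _ in range(epochs):
        # Forward pass: compute the prediction errors and the training loss.
        errors = [(w * x + b) - y for x, y in data]
        losses.append(sum(e * e for e in errors) / n)
        # Backward pass: gradients of the loss with respect to w and b.
        grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, data)) / n
        grad_b = 2 * sum(errors) / n
        # Optimizer step: move w and b a little against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses


# Toy data generated from y = 2x + 1; the training loop is never told this rule.
samples = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b, losses = train(samples)
print(w, b, losses[0], losses[-1])
```

After training, `w` and `b` should end up close to 2 and 1, and the last recorded loss should be far smaller than the first. A real network repeats the same forward, backward, and update pattern across many neurons and layers.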
Now the question is: how do we know that all the neurons have reached their most suitable values? We monitor this through the training loss.
After each epoch we get the value of the loss. If our architecture is fine, the loss value will stop changing after enough epochs; this state is called convergence of the network.
Once the network has converged, we can stop the training process and save the network.
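One simple way to check for convergence is to look at how much the loss changed over the last few epochs. The helper below is a sketch; the name `has_converged` and the `tolerance` and `patience` parameters and their defaults are our own assumptions, and the loss history is made-up data.

```python
def has_converged(losses, tolerance=1e-6, patience=3):
    """Return True when the loss changed by less than `tolerance`
    in each of the last `patience` epochs."""
    if len(losses) <= patience:
        return False  # not enough history yet
    recent = losses[-(patience + 1):]
    return all(abs(a - b) < tolerance for a, b in zip(recent, recent[1:]))


# Hypothetical loss values recorded after each epoch.
history = [0.9, 0.4, 0.2, 0.1500004, 0.1500002, 0.1500001, 0.1500000]
print(has_converged(history))
```

Requiring several quiet epochs in a row (`patience`) avoids stopping on a single epoch where the loss happened to barely move.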
Now our neural network is ready: we can use the saved network to predict the output for similar, unseen data.

Conclusion
This was an introduction to neural networks. Now you are ready to write your own neural network, so I will be writing part II of this article, in which I will explain how to write your own neural network using Python.
Subscribe to the Tech Buffer publication and you will get a notification once we publish our next article.
If you want me to write about something specific or you have any suggestions about this article, please leave your comment below and let me know.
Happy reading… Jai Hind!