The hidden layer will have 25 nodes and the output layer will have 4 nodes (one for each of the 4 types of signs). There is no strict formula for choosing the hidden layer size; it usually depends on the question “How well does it fit the data?”

Image 3: Andrew Ng on Neural Network Size

Here we are going to use the main.m file, and we will:

1. Load the features and labels
2. Randomly initialize Theta values (NN nodes weights)
3. Create the cost function and forward propagation
4. Create the gradient for the NN cost function (Backpropagation)
5. Minimize the cost function using fmincg minimizer

Load the features and labels

So let’s start with step one, which is loading the features and the labels. We do that by using the dlmread function.

    …
    X = dlmread('x_features_train');
    % Labels for each processed training image
    % [1 0 0 0] - left, [0 1 0 0] - right, [0 0 1 0] - palm, [0 0 0 1] - peace
    y = dlmread('y_labels_train');
    …

Randomly initialize Theta values (NN nodes weights)

Next we need to initialize the Theta values using the randInitializeWeights.m function, which is represented by the following code:

    epsilon = sqrt(6) / (L_in + L_out);
    W = zeros(L_out, 1 + L_in);
    W = (rand(L_out, 1 + L_in) * 2 * epsilon) - epsilon;

The generated values lie in the range [-epsilon, epsilon]. This code is related to the statistics formula for “Uniform Distribution Variance”. If you are interested in this formula, I will leave links at the end of this blog, or you can post a question.

Create the cost function and forward propagation

Our next goal is to implement the cost function defined by the equation below.

Image 4: Regularized Cost Function

Here g is the activation function (the sigmoid function in this case).

In order to compute the cost we need to use feedforward computation. The code is implemented in nnCostFunction.m. We will use a for-loop over the examples to compute the cost, and we also need to add a column of 1’s to the X matrix, which represents the “bias” values. The θ₁ (Theta 1) and θ₂ (Theta 2) values are the parameters for each unit in the NN; the first row of θ₁ corresponds to the first hidden unit in the second layer.

Create the gradient for the NN cost function (Backpropagation)

To be able to minimize the cost function we need to compute the gradient of the NN cost function. For that we are going to use the backpropagation algorithm, short for “backward propagation of errors”. The gradient it produces is what lets us minimize the cost function, which means minimizing the error of the NN as a whole and the error of each output neuron. This calculation is part of the code implemented in nnCostFunction.m.

Minimize the cost function using fmincg minimizer

Once we have computed the gradient, we can train the neural network by minimizing the cost function J(Θ) using an advanced optimizer such as fmincg. This function is not part of Octave, so I got it from the Machine Learning course by Andrew Ng. As far as I know, this function is faster than the ones implemented in Octave, and it uses the conjugate gradient method.

fmincg takes 3 arguments, as shown in the code example below. More details