Writing and training a simple perceptron to find out

A brief practical introduction to the simple perceptron learning algorithm and using it to classify bioconcentration

Gurkamal Deol, Jun 5

Image by Ahmed Gad from Pixabay

Briefly, what is a perceptron?

A neuron is the basic functioning unit of the brain; similarly, a perceptron is the basic functioning unit of a neural network.

In this post I’ll briefly cover the similarities between artificial and biological neural networks, the theory behind how perceptrons work, and lastly how to implement the algorithm in Python and train it on a bioconcentration data set.

In animals, a neuron receives input from the synapses of other neurons at its dendrites.

These tree-like structures take the input signals and combine them in the cell body, also known as the soma.

Once the summation of signals happens in the soma, gated ion channels will open or remain closed depending on whether the signal breaches a threshold value — causing the neuron to fire along the axon or remain static.

The neuron either fires or it doesn’t.

[Figure: Biological and artificial neuron similarities]

There are a few components in the image I put together above that we should go over to understand the model better:

Input & bias: The dendrite of the biological neuron accepts input as neurotransmitters from connecting synapses.

The counterpart in the perceptron model is the input (a feature used for the classification) multiplied by its respective weight.

The weights are values that change during training: they are updated in the “learning” phase whenever the model makes a prediction error.

A bias is added as a special input to shift the decision boundary by translating points left or right.

The summation equation below shows how the inputs, weights, and bias fit together.

[Figure: Summation equation]

The image below shows a sigmoid curve. Changing the weights alters the steepness of the slope, but to actually shift the curve left or right you need to add a bias.

Translating all the points in a certain direction using the bias can increase accuracy by helping the hyperplane separate the classes.
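To make the role of the bias concrete, here is a minimal sketch with made-up numbers (the inputs, weights, and bias values are illustrative assumptions, not values from the data set). It computes the weighted sum z for one sample, then shows that changing the weight of a one-feature sigmoid leaves its midpoint in place, while changing the bias moves it.

```python
import numpy as np

def weighted_sum(x, w, b):
    """Return z = w . x + b for a single sample."""
    return np.dot(w, x) + b

def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def midpoint(w, b):
    """Input at which sigmoid(w * x + b) equals 0.5 (one feature only)."""
    return -b / w

# Illustrative numbers only.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, 0.1])
z = weighted_sum(x, w, b=0.2)       # 0.2 - 0.3 + 0.2 + 0.2 = 0.3

steeper = midpoint(w=5.0, b=0.0)    # larger weight, curve still centered at x = 0
shifted = midpoint(w=1.0, b=-2.0)   # a bias of -2 moves the midpoint to x = 2
```

The weight controls how quickly the curve rises; only the bias changes where it crosses 0.5.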

[Figure: Basic sigmoid curve]

Activation function: The summation of excitatory and inhibitory ions in the soma of a neuron results in an action potential.

If the action potential is excitatory and breaches the threshold value, a signal is fired.

In an artificial neuron the activation function calculates the net output of the summed inputs.

The activation function is effectively the perceptron’s decision maker: the perceptron uses the Heaviside function, also known as a step function, to calculate a predicted binary output.

Below is the step function that is most commonly used:

[Figure: Unit step function]

- θ is the activation function
- z is the sum of the inputs multiplied by their weights (and bias if included)

Output: The biological neuron propagates a signal down its axon if the threshold is reached; this is its output.

A perceptron’s output likewise fires on an all-or-nothing basis, resulting in a binary classification of either 1 or -1.
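As a small sketch of the all-or-nothing output, the step function can be written with numpy as shown here. The function name `unit_step` and the default threshold of 0 are my own choices for illustration.

```python
import numpy as np

def unit_step(z, threshold=0.0):
    """Return 1 where z is greater than the threshold, -1 otherwise."""
    return np.where(z > threshold, 1, -1)

# Three example net inputs: below, exactly at, and above the threshold.
outputs = unit_step(np.array([-0.5, 0.0, 0.3])).tolist()
print(outputs)  # [-1, -1, 1]
```

Note that a net input exactly equal to the threshold does not fire in this version, because the comparison is strict.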

Note: A more in-depth article on this material can be found here.

Step by step algorithm

The following is a rundown of the steps taken by the algorithm to predict and then learn.

1. Set weights to small initial values
2. Multiply the input and weight vectors, then sum them up
3. If the summed value is greater than the threshold, the output is 1; otherwise it is -1
4. Check whether the predicted outcome was correct and update the weights accordingly
5. Repeat the process for increased accuracy

Note: Another great article explaining why the algorithm works.
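The steps above can be traced by hand for a single training sample. The numbers below are made up for illustration, the threshold is assumed to be 0, and the bias is kept as a separate variable for readability.

```python
import numpy as np

learning_rate = 0.1
weights = np.array([0.01, -0.02])   # step 1: small initial weights
bias = 0.0

xi = np.array([1.0, 2.0])           # one training sample (made up)
target = 1                          # its true label

z = np.dot(weights, xi) + bias      # step 2: 0.01 - 0.04 + 0.0 = -0.03
prediction = 1 if z > 0 else -1     # step 3: z is not above 0, so -1

# step 4: the prediction was wrong, so the weights move toward the sample
update = learning_rate * (target - prediction)   # 0.1 * (1 - (-1)) = 0.2
weights = weights + update * xi                  # [0.21, 0.38]
bias = bias + update                             # 0.2

# step 5: repeating the prediction now gives the correct label
new_prediction = 1 if np.dot(weights, xi) + bias > 0 else -1
```

When the prediction is already correct, `target - prediction` is 0 and nothing changes, which is why only misclassified samples move the weights.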

Coding the perceptron

Note: I’ll be dissecting and explaining code examples from the book “Python Machine Learning” while mixing in code of my own.

I’ll be going over each line of code for our perceptron.

The very first line of code will import numpy since we need to perform vector multiplication and draw random numbers.

We then create the perceptron class, initialize it, and set parameter values for “epochs”, “learning_rate”, and randomState.

epochs: the number of complete passes over the training data

learning rate: usually referred to as η (eta) or the step size.

This value scales the weight updates.

When training on the data, the weights are updated according to how much error they're responsible for; the learning rate applies only a fraction of this error to the weights.

So weights are updated as such: weight + η(error).

randomState: a class used for drawing pseudo-random numbers for an instance.

I’d advise against using random.seed() since it will impact the global numpy environment.
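A constructor along these lines might look like the following sketch. The default values and the `seed` argument name are assumptions for illustration, since the original code block is not reproduced here. `np.random.RandomState` gives the instance its own generator, so seeding it does not change numpy's global random state.

```python
import numpy as np

class Perceptron:
    """Constructor only; the default values are illustrative assumptions."""

    def __init__(self, epochs=10, learning_rate=0.01, seed=1):
        self.epochs = epochs                  # passes over the training data
        self.learning_rate = learning_rate    # eta, the step size
        # A private generator: seeding it leaves numpy's global state untouched.
        self.randomState = np.random.RandomState(seed)

first = Perceptron(seed=42).randomState.normal(size=3)
second = Perceptron(seed=42).randomState.normal(size=3)
reproducible = np.array_equal(first, second)   # same seed, same draws
```

Two instances built with the same seed draw the same numbers, which keeps training runs repeatable without calling a global seed function.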

Here we define the fit function which takes x and y arguments.

weights: this parameter is used to set the weights to small random numbers instead of zero.

Starting with all weights at zero causes a problem: with the perceptron update rule, the learning rate would then only scale the weight vector and never change its direction, so it would have no effect on the predictions.

We draw from a normal distribution with a mean of 0, a standard deviation of 0.01, and a draw size of 1 sample.

errors: an empty list to which we will append the errors we catch during training

The next chunk of code is a loop that falls within the fit function that we just defined, but we’ll look at it separately, line by line, due to its many parts.
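Before moving on to the loop, the first part of fit described above can be sketched as follows. The helper name `init_weights` is hypothetical, and I am assuming one weight per feature plus one extra weight for the bias, which is how the book this post draws on sizes the weight vector.

```python
import numpy as np

def init_weights(n_features, random_state):
    """Draw a bias weight plus one weight per feature.

    Values come from a normal distribution with mean 0 and
    standard deviation 0.01, so they start small but non-zero.
    """
    return random_state.normal(loc=0.0, scale=0.01, size=1 + n_features)

rng = np.random.RandomState(1)
weights = init_weights(n_features=4, random_state=rng)   # weights[0] is the bias
errors = []   # will collect the number of updates made in each epoch
```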

- we set the loop to iterate through each epoch
- set the error variable to 0 for each iteration
- here xi and target are one sample's feature values and its label, taken pairwise from the x and y values that we input as our data
- we set the update variable as the value we need to update our weights with, which is learning rate * the error
- the weights of the inputs are updated with the following formula: weights = weights + (update * xi)
- here we update the bias input as: weight = weight + update
- we now add to the error variable whenever the update is non-zero, that is, whenever the sample was misclassified
- lastly we append this error count to the list of errors we created earlier

The last block of the perceptron code will define the summation and prediction functions.

summation: we define our feature inputs as x and return the vector dot product of the weights and the inputs, plus the bias unit.

predict: using x (feature input) as the function argument, the function returns 1 if summation(x) is greater than 0, and -1 otherwise.
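Written as standalone functions, these two pieces might look like the sketch below. Passing the weights in as an argument and storing the bias in `weights[0]` are assumptions made so the example runs on its own; the weight values are made up.

```python
import numpy as np

def summation(x, weights):
    """Dot product of inputs and weights, plus the bias unit weights[0]."""
    return np.dot(x, weights[1:]) + weights[0]

def predict(x, weights):
    """Return 1 where summation(x) is greater than 0, otherwise -1."""
    return np.where(summation(x, weights) > 0, 1, -1)

weights = np.array([-0.2, 0.21, 0.29])          # bias first, then one weight per feature
samples = np.array([[2.0, 1.0], [-1.0, -1.5]])
labels = predict(samples, weights).tolist()     # [1, -1]
```

Because `np.dot` and `np.where` both work on arrays, the same two functions handle a single sample or a whole matrix of samples.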

Cleaning and preparing the data

I chose not to go with the classic data sets like Fisher’s iris data set and rather chose to find one relevant to the work I’m currently doing.

This led me to a bioconcentration data set hosted by the University of Milano-Bicocca, which can be found here.

The data only needs a bit of touching up and is otherwise ready to use.

The following block of code shows what I did to make the data usable with the perceptron model.

The original data comes with 3 different classifications: (1) is mainly stored within lipid tissues, (2) has additional storage sites (e.g. proteins), or (3) is metabolized/eliminated [1].

Since our model will work with two prediction classes we can drop the (2) class label.
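Dropping a class and relabelling the rest can be sketched with plain numpy. The feature values below are made-up stand-ins for the real data, and mapping class 1 to 1 and class 3 to -1 is an assumption chosen to match the perceptron's two outputs.

```python
import numpy as np

# Made-up stand-ins for the real feature matrix and class column.
features = np.array([[0.1, 2.0], [0.4, 1.5], [0.9, 0.2], [0.3, 1.1], [0.8, 0.4]])
classes = np.array([1, 2, 3, 1, 3])

keep = classes != 2                        # drop every row labelled class 2
x = features[keep]
y = np.where(classes[keep] == 1, 1, -1)    # class 1 -> 1, class 3 -> -1
```

After this step `x` and `y` have the same number of rows and `y` contains only the two labels the perceptron can predict.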

We can now create our training data set to train the perceptron learning model:

Training the perceptron

Below is the complete block of code for our perceptron learning model.

This data set was chosen on a whim, and for a classifier to be as accurate as possible, the predictive features should be able to separate the groups along a plane.

We could perform a PCA or another form of dimension reduction to figure out the most important features but that isn't the focus of this tutorial.

The point is to see if we can write and train a perceptron on a given set of classification data and how it performs on that data.
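For reference, a complete perceptron along the lines described in this post might look like the sketch below. It is a reconstruction under assumptions, not the original code: the default values and the `seed` argument are my own, and the model is trained on synthetic, well-separated points rather than the bioconcentration data.

```python
import numpy as np

class Perceptron:
    def __init__(self, epochs=10, learning_rate=0.01, seed=1):
        self.epochs = epochs
        self.learning_rate = learning_rate
        self.randomState = np.random.RandomState(seed)

    def fit(self, x, y):
        # Small random starting weights; weights[0] is the bias.
        self.weights = self.randomState.normal(loc=0.0, scale=0.01,
                                               size=1 + x.shape[1])
        self.errors = []
        for _ in range(self.epochs):
            error = 0
            for xi, target in zip(x, y):
                update = self.learning_rate * (target - self.predict(xi))
                self.weights[1:] += update * xi
                self.weights[0] += update
                error += int(update != 0.0)   # count misclassified samples
            self.errors.append(error)
        return self

    def summation(self, x):
        return np.dot(x, self.weights[1:]) + self.weights[0]

    def predict(self, x):
        return np.where(self.summation(x) > 0, 1, -1)

# Synthetic two-class data standing in for the real training set.
rng = np.random.RandomState(0)
x = np.vstack([rng.normal(2.0, 0.5, size=(20, 2)),
               rng.normal(-2.0, 0.5, size=(20, 2))])
y = np.array([1] * 20 + [-1] * 20)

model = Perceptron(epochs=10, learning_rate=0.1).fit(x, y)
accuracy = np.mean(model.predict(x) == y)
print(model.errors, accuracy)
```

On data this cleanly separated the update counts should fall to zero quickly; on real data such as the bioconcentration set they may keep oscillating.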

The code block below plots the results for our training and looks at the number of updates over epochs — how much adjustment we have to perform over passes on the entire training data set.

Conclusion: Not great, but it works!

[Figure: Number of updates over epoch (entire data set passes)]

We see that although the number of updates oscillates, there is a general downward trend over iterations; that is to say, our perceptron algorithm is generally getting more accurate as it makes more passes over the training data.

We can’t say that it will reach zero (as it would with a less complex data set such as Fisher’s Iris), but we can say that our simple perceptron learning algorithm, which dates from the late 1950s, did indeed “learn”.

References

[1] F. Grisoni, V. Consonni, M. Vighi, S. Villa, R. Todeschini, Investigating the mechanisms of bioconcentration through QSAR classification trees (2016), Environment International.