In this post we will focus on how the perceptron works. We are trying to get a discrete answer from a large number of inputs. The perceptron takes these inputs, weighs them, and adds them together (a dot product). It then forces this value to be a discrete “yes” or “no”, or positive and negative, by passing it through an activation function. The original weights are then updated to try to obtain a more accurate prediction the next time around.
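To make this concrete, here is a minimal sketch in NumPy of the weighted sum and a unit step activation. The names `net_input` and `predict`, the bias term `b`, and the -1/1 labels are illustrative choices for this post, not code from the book:

```python
import numpy as np

def net_input(x, w, b):
    """Weighted sum of inputs and weights, plus a bias term (a dot product)."""
    return np.dot(x, w) + b

def predict(x, w, b):
    """Unit step activation: force the net input to a discrete 1 ("yes") or -1 ("no")."""
    return np.where(net_input(x, w, b) >= 0.0, 1, -1)
```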
The basic way this is done is as follows, and will be discussed more fully when we get to the math behind the perceptron. The components of the update rule are the learning rate, the predicted class label, the correct class label, and the inputs. We take the difference between the correct class label and the predicted class label, scale it by how quickly we want the weights to change (the learning rate), and then multiply it by the inputs. Adding the result to the weights gives us updated weights for the next iteration.
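Here is one way that update step might look, reusing `predict` from the sketch above. The learning rate `eta` and its 0.1 default are stand-in values, not anything prescribed:

```python
def update(w, b, x, target, eta=0.1):
    """One perceptron update: nudge the weights in proportion to the prediction error."""
    error = target - predict(x, w, b)      # zero when the prediction was already correct
    return w + eta * error * x, b + eta * error
```

Notice that when the prediction is correct the error is zero and the weights are left alone; only misclassified examples move them.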
This continues until the algorithm classifies all examples correctly, which can only happen if the data set is linearly separable. If it is not, we can have it run for a predetermined number of passes over the data (epochs), or set a threshold for the maximum number of tolerated misclassifications.
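Putting the pieces together, a hypothetical training loop with both stopping conditions (an error-free pass, or a fixed number of epochs) might look like this:

```python
def fit(X, y, eta=0.1, epochs=10):
    """Repeat update passes until no misclassifications remain or epochs run out."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            errors += int(predict(xi, w, b) != target)  # count mistakes before updating
            w, b = update(w, b, xi, target, eta)
        if errors == 0:  # an error-free pass: the data have been separated
            break
    return w, b

# A tiny usage example: learn the logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, -1, -1, 1])
w, b = fit(X, y)
print(predict(X, w, b))  # [-1 -1 -1  1]
```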
We will build on this concept in future posts, but that is the idea of how a basic artificial neuron works. Here’s a diagram (from Sebastian Raschka’s excellent Python Machine Learning pg. 24) that puts everything together.
We can see that the inputs are combined with the weights, passed through an activation function, and the resulting output is then used to update those same weights.