Artificial Neural Networks are computational models inspired by human and animal brains.
Biological Neuron
A fundamental signal-collection and processing unit of the human brain is the neuron (of which there are many distinct types). At a high level, the key parts of a neuron are the dendrites, which act as receivers and relay signals to the nucleus within the cell body (soma); the axon, which conducts an action potential (an impulse) away from the cell body; and a synaptic region of interconnection with other neurons.
At the synaptic junction, the axon terminal of the presynaptic neuron is separated from the postsynaptic neuron (its dendrite, soma, or axon membrane) by a synaptic gap. An action potential arriving at the presynaptic terminal activates voltage-gated channels, releasing excitatory or inhibitory neurotransmitters that bind to the postsynaptic neuron. Each neuron in the human brain is connected via synapses to up to 10,000 other neurons (Zhang, 2019).
McCulloch-Pitts Neuron
McCulloch and Pitts (1943) proposed a logical computational model inspired by the “all or none” behaviour of neural events. The linear McCulloch-Pitts (MP) neuron is represented in two parts, g and f: g(x) sums the inputs, and f(g(x)) acts as a gate on g, whereby the output y is set to 1 only if g(x) meets or exceeds a threshold. Note that y is set to 0 (i.e. no action potential) if any single input is inhibitory, or if g(x), the summation of all x inputs, is below the threshold.
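This gating behaviour is straightforward to express in code. The following is a minimal Python sketch of an MP unit; the function name, the split into excitatory and inhibitory input lists, and the AND example are illustrative choices, not notation from the original paper.

    def mp_neuron(excitatory, inhibitory, threshold):
        # A single active inhibitory input vetoes the output entirely.
        if any(inhibitory):
            return 0
        g = sum(excitatory)                  # g(x): sum the binary inputs
        return 1 if g >= threshold else 0    # f(g(x)): threshold gate

    # With two inputs and threshold = 2, the unit computes logical AND.
    print(mp_neuron([1, 1], [], threshold=2))  # 1
    print(mp_neuron([1, 0], [], threshold=2))  # 0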
Perceptron
Rosenblatt (1957, 1958, 1962) proposed the Perceptron, which extended the McCulloch-Pitts model but was capable of learning the weights applied to its inputs by minimising an error metric. The model can learn a linear decision boundary between two linearly separable classes.
Rosenblatt developed the Mark I Perceptron as a visual pattern classifier, inspired by sensory processing in the retina. The sensory layer was a grid of 20 x 20 semiconductor photodiodes (‘S units’), which sent on or off input signals in response to a photostimulus. An association layer of 512 ‘A units’ (implemented with stepping motors) followed: if the sum of excitatory and inhibitory input impulses to an A-unit was equal to or greater than a threshold, the A-unit fired. A response (‘effector’, i.e. output) layer consisted of 8 binary units (‘R units’).
Rosenblatt (1962) proposed and implemented several methods of error-correction (i.e. training) using variable S-A connections. Back-propagation of error to adjust weights was explored using perceptrons attempting to learn to discriminate horizontal from vertical bars.
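As a modern illustration of the error-correction idea, here is a minimal Python sketch of the perceptron learning rule on a toy, linearly separable problem (logical AND). The data, learning rate, and explicit bias term are illustrative assumptions, not details of Rosenblatt's hardware.

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # toy inputs
    y = np.array([0, 0, 0, 1])                       # logical AND targets

    w, b, lr = np.zeros(2), 0.0, 1.0
    for epoch in range(20):
        errors = 0
        for xi, target in zip(X, y):
            pred = int(w @ xi + b >= 0)   # threshold unit, as in the MP neuron
            err = target - pred           # +1, 0, or -1
            w += lr * err * xi            # error-correction: w <- w + lr*(y - y_hat)*x
            b += lr * err
            errors += int(err != 0)
        if errors == 0:                   # every point classified correctly:
            break                         # a separating line has been found
    print(w, b)                           # weights and bias of the linear boundary

Because the AND data are linearly separable, the loop terminates after a handful of epochs, consistent with the perceptron convergence guarantee.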
Multi-layer Perceptrons
Multi-layer perceptrons were also proposed by Rosenblatt (1962), incorporating more than one A-unit layer.
After demonstrating that “series coupled perceptrons are capable of learning any type of classification, or associating any responses to stimuli or to sequences of stimuli, that might possibly be required”, Rosenblatt (1962) explains that multi-layer perceptrons have greater adaptability; indeed, MLPs were later shown to be universal approximators. Notably, generalization to novel stimuli is ‘strikingly improved’ over three-layer series-coupled perceptrons.
A simple MLP implementation can also be written in a few lines of PyTorch.
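The following is a minimal sketch rather than a definitive implementation: the layer sizes are borrowed from the Mark I for flavour (a flattened 20 x 20 input, 512 hidden units, 8 outputs), and ReLU is a modern, differentiable stand-in for the historical step threshold.

    import torch
    from torch import nn

    class MLP(nn.Module):
        def __init__(self, in_dim=400, hidden_dim=512, out_dim=8):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden_dim),   # S -> A connections
                nn.ReLU(),                       # stand-in for the step threshold
                nn.Linear(hidden_dim, out_dim),  # A -> R connections
            )

        def forward(self, x):
            return self.net(x)

    model = MLP()
    x = torch.randn(1, 400)      # e.g. one flattened 20 x 20 stimulus
    print(model(x).shape)        # torch.Size([1, 8])

The nonlinearity between the two linear layers is what gives the network its approximation power; without it, the stacked layers would collapse into a single linear map.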