Perceptron Layers
A perceptron is the simplest type of neural network, consisting of only one neuron. To handle more complex problems, a model known as a multilayer perceptron (MLP) is used. A multilayer perceptron contains one or more hidden layers that allow the network to learn intricate patterns in data.
The structure of a multilayer perceptron includes:
- Input layer: receives the input data;
- Hidden layers: process the data and extract meaningful patterns;
- Output layer: produces the final prediction or classification.
Each layer is composed of multiple neurons, and the output of one layer serves as the input for the next layer.
Layer Weights and Biases
Before implementing a layer, it is important to understand how to store the weights and biases of each neuron within it. In the previous chapter, you learned how to store the weights of a single neuron as a vector and its bias as a scalar (single number).
Since a layer consists of multiple neurons, it is natural to represent the weights as a matrix, where each row corresponds to the weights of a specific neuron. Consequently, biases can be represented as a vector, whose length is equal to the number of neurons.
Given a layer with 3 inputs and 2 neurons, its weights will be stored in a 2×3 matrix W and its biases will be stored in a 2×1 vector b, which look as follows:
$$
W = \begin{bmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
$$

Here, element $W_{ij}$ represents the weight of the $j$-th input to the $i$-th neuron, so the first row contains the weights of the first neuron and the second row contains the weights of the second neuron. Element $b_i$ represents the bias of the $i$-th neuron (two neurons → two biases).
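To make this storage concrete, here is a minimal NumPy sketch (the numbers are made up purely for illustration) of a weight matrix and bias vector for a layer with 3 inputs and 2 neurons:

import numpy as np

# Weight matrix: 2 neurons x 3 inputs (one row of weights per neuron)
W = np.array([[0.2, -0.5, 0.1],
              [0.7,  0.3, -0.4]])
# Bias vector: one bias per neuron, stored as a 2x1 column vector
b = np.array([[0.1],
              [-0.2]])

print(W.shape)  # (2, 3)
print(b.shape)  # (2, 1)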
Forward Propagation
Performing forward propagation for each layer means activating each of its neurons by computing the weighted sum of the inputs, adding the bias, and applying the activation function.
Previously, for a single neuron, you implemented the weighted sum of the inputs by computing the dot product between the input vector and the weight vector and adding the bias.
Since each row of the weight matrix contains the weight vector of a particular neuron, all you have to do now is compute the dot product between each row of the matrix and the input vector. Luckily, this is exactly what matrix multiplication does: the product $Wx$ is a vector whose $i$-th element is the dot product of the $i$-th row of $W$ with the input vector $x$, i.e., the weighted sum of that neuron's inputs.
To add the biases to the outputs of the respective neurons, the bias vector is added to this product, giving $Wx + b$.
Finally, the activation function (sigmoid or ReLU, in our case) is applied to the result. The resulting formula for forward propagation in the layer is as follows:
$$
a = \text{activation}(Wx + b)
$$

where $a$ is the vector of neuron activations (outputs).
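As a quick sanity check, the whole forward step can be computed in a few lines of NumPy. This sketch reuses the made-up W and b from above, an arbitrary input vector x, and sigmoid as one possible choice of activation:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

W = np.array([[0.2, -0.5, 0.1],
              [0.7,  0.3, -0.4]])   # 2x3 weight matrix
b = np.array([[0.1],
              [-0.2]])              # 2x1 bias vector
x = np.array([[1.0],
              [2.0],
              [3.0]])               # 3x1 input vector

# Each element of W @ x is the dot product of one neuron's weights with x
a = sigmoid(W @ x + b)              # 2x1 vector of activations
print(a)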
Layer Class
The perceptron's fundamental building blocks are its layers, so it makes sense to create a separate Layer class. Its attributes include:
- inputs: a vector of inputs (n_inputs is the number of inputs);
- outputs: a vector of the neurons' raw output values, before the activation function is applied (n_neurons is the number of neurons);
- weights: a weight matrix;
- biases: a bias vector;
- activation_function: the activation function used in the layer.
Like in the single neuron implementation, weights and biases will be initialized with random values between -1 and 1 drawn from a uniform distribution.
import numpy as np

class Layer:
    def __init__(self, n_inputs, n_neurons, activation_function):
        self.inputs = np.zeros((n_inputs, 1))
        self.outputs = np.zeros((n_neurons, 1))
        # One row of weights per neuron, drawn uniformly from [-1, 1]
        self.weights = np.random.uniform(-1, 1, (n_neurons, n_inputs))
        # One bias per neuron, also drawn uniformly from [-1, 1]
        self.biases = np.random.uniform(-1, 1, (n_neurons, 1))
        self.activation = activation_function
The inputs and outputs attributes will be used later in backpropagation, so it makes sense to initialize them as NumPy arrays of zeros.
Initializing inputs and outputs as zero-filled NumPy arrays prevents errors when performing calculations in forward and backward propagation. It also ensures consistency across layers, allowing smooth matrix operations without requiring additional checks.
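As a quick check of the shapes involved (not part of the required implementation), you could create a layer with 3 inputs and 2 neurons, passing any callable as the activation; here a simple ReLU is defined just for the example:

def relu(z):
    return np.maximum(0, z)

layer = Layer(n_inputs=3, n_neurons=2, activation_function=relu)
print(layer.weights.shape)  # (2, 3)
print(layer.biases.shape)   # (2, 1)
print(layer.inputs.shape)   # (3, 1)
print(layer.outputs.shape)  # (2, 1)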
Forward propagation can be implemented in the forward() method, where outputs are computed based on the inputs vector using NumPy, following the formula above:
def forward(self, inputs):
    self.inputs = np.array(inputs).reshape(-1, 1)
    # Raw outputs: weighted sums plus biases (Wx + b)
    self.outputs = np.dot(self.weights, self.inputs) + self.biases
    # Applying the activation function to the raw outputs
    return self.activation(self.outputs)
Reshaping inputs into a column vector ensures correct matrix multiplication with the weight matrix during forward propagation. This prevents shape mismatches and allows seamless computations across all layers.
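Putting it all together, a forward pass through a single layer might look like this; the sigmoid below is defined only for the example, and any activation function can be passed in instead:

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A layer with 3 inputs and 2 neurons
layer = Layer(n_inputs=3, n_neurons=2, activation_function=sigmoid)

# Forward pass with a sample input of length 3
activations = layer.forward([1.0, 2.0, 3.0])
print(activations.shape)  # (2, 1): one activation per neuron
print(activations)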
1. What makes a multilayer perceptron (MLP) more powerful than a simple perceptron?
2. Why is it necessary to reshape the inputs into a column vector before multiplying them by the weight matrix?