Perceptron Layers

The perceptron is the simplest neural network: it consists of a single neuron. For more complex tasks, we use a multilayer perceptron (MLP), which contains one or more hidden layers that allow the network to learn richer patterns.

An MLP consists of:

  1. Input layer: receives the data;
  2. Hidden layers: extract patterns;
  3. Output layer: produces predictions.

Each layer has multiple neurons; the output of one layer becomes the input of the next.

Layer Weights and Biases

Previously, a neuron stored its weights as a vector and bias as a scalar. A layer, however, contains many neurons, so its weights become a matrix, where each row stores the weights of one neuron. Biases for all neurons form a vector.

For a layer with 3 inputs and 2 neurons:

W=\begin{bmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{bmatrix}, \qquad b=\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}

Here, W_{ij} is the weight from the j-th input to the i-th neuron, and b_i is the bias of neuron i.
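
To make these shapes concrete, here is a minimal NumPy sketch for the layer above with 3 inputs and 2 neurons (the uniform initialization shown here is borrowed from the class definition later in this chapter and is only illustrative):

import numpy as np

n_inputs, n_neurons = 3, 2

# One row of weights per neuron, one column per input
W = np.random.uniform(-1, 1, (n_neurons, n_inputs))  # shape (2, 3)
# One bias per neuron, stored as a column vector
b = np.random.uniform(-1, 1, (n_neurons, 1))          # shape (2, 1)

print(W.shape)  # (2, 3)
print(b.shape)  # (2, 1)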

Forward Propagation

Forward propagation activates each neuron by computing a weighted sum, adding the bias, and applying the activation function.

Previously, a single neuron used:

z = w \cdot x + b

Now, since each row of W is one neuron's weight vector, multiplying the weight matrix by the input vector computes all neurons' weighted sums at once.

To give each neuron its own bias, the bias vector is then added to these weighted sums:

z = Wx + b

Finally, the activation function is applied to the result (sigmoid or ReLU, in our case). The resulting formula for forward propagation in the layer is as follows:

a = activation(Wx + b)

where a is the vector of neuron activations (outputs).
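
A minimal NumPy sketch of this formula (the sigmoid helper and all numeric values are illustrative assumptions, not the course's final implementation):

import numpy as np

def sigmoid(z):
    # Element-wise sigmoid activation
    return 1 / (1 + np.exp(-z))

# Layer with 3 inputs and 2 neurons (values chosen only for illustration)
W = np.array([[0.2, -0.5, 0.1],
              [0.7,  0.3, -0.4]])  # shape (2, 3)
b = np.array([[0.1],
              [-0.2]])             # shape (2, 1)
x = np.array([[1.0],
              [2.0],
              [3.0]])              # input column vector, shape (3, 1)

z = W @ x + b    # weighted sums of all neurons at once, shape (2, 1)
a = sigmoid(z)   # layer activations, shape (2, 1)
print(a)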

Layer Class

Since MLPs are built from layers, we define a dedicated Layer class. Its attributes:

  • inputs: input vector (n_inputs elements);
  • outputs: raw neuron outputs (n_neurons elements);
  • weights: weight matrix;
  • biases: bias vector;
  • activation_function: activation used in the layer.

Weights and biases are initialized with random values from a uniform distribution over [-1, 1]. inputs and outputs are initialized as zero-filled NumPy arrays to ensure consistent shapes for later backpropagation.

import numpy as np

class Layer:
    def __init__(self, n_inputs, n_neurons, activation_function):
        # Column vectors holding the layer's inputs and raw outputs
        self.inputs = np.zeros((n_inputs, 1))
        self.outputs = np.zeros((n_neurons, 1))
        # One row of weights per neuron, drawn uniformly from [-1, 1]
        self.weights = np.random.uniform(-1, 1, (n_neurons, n_inputs))
        # One bias per neuron, also drawn uniformly from [-1, 1]
        self.biases = np.random.uniform(-1, 1, (n_neurons, 1))
        self.activation = activation_function
Note

Initializing inputs and outputs with zeros prevents shape errors and ensures layers remain consistent during both forward and backward passes.

Forward Method

Forward propagation for a layer computes raw outputs and applies the activation:

def forward(self, inputs):
    # Store the input as a column vector
    self.inputs = np.array(inputs).reshape(-1, 1)
    # Raw outputs: weighted sum + bias for every neuron at once
    self.outputs = self.weights @ self.inputs + self.biases
    # Apply the activation function element-wise
    return self.activation(self.outputs)

Reshaping the input into a column vector ensures it multiplies correctly with the weight matrix and matches the expected dimensions throughout the network.
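
Putting it together, a short usage sketch (assuming the Layer class defined above and a simple sigmoid helper; both the helper and the sample input are illustrative):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A layer with 3 inputs and 2 sigmoid neurons
layer = Layer(n_inputs=3, n_neurons=2, activation_function=sigmoid)

# One forward pass on a sample input vector
activations = layer.forward([1.0, 2.0, 3.0])
print(activations.shape)  # (2, 1)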

1. What makes a multilayer perceptron (MLP) more powerful than a simple perceptron?

2. Why is it necessary to reshape the input into a column vector before multiplying it by the weight matrix?

