Data Preprocessing | Neural Network from Scratch
Introduction to Neural Networks

Data Preprocessing

Wine Dataset

Now we will train our model on a more realistic task. The scikit-learn library includes a wine dataset, which we will use to predict the wine class from three input parameters.

Here is what the dataset looks like:
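The following is a minimal sketch for inspecting the data yourself, assuming scikit-learn and pandas are installed. It uses `load_wine(as_frame=True)` to get the data as a pandas DataFrame; the variable names `wine` and `df` are only illustrative.

```python
from sklearn.datasets import load_wine

# Load the wine dataset as a pandas DataFrame (features plus a "target" column).
wine = load_wine(as_frame=True)
df = wine.frame

# Number of rows and columns, then a preview of the columns used in this chapter.
print(df.shape)
print(df[["flavanoids", "proline", "total_phenols", "target"]].head())
```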

To train our model, we'll use three input parameters: flavanoids, proline, and total_phenols. We chose these because they are among the features with the highest correlation to the wine class. Using only three features reduces the size of the neural network needed for successful training and shortens the training time.
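One rough way to check such a choice is to rank the features by the absolute value of their correlation with the class label. The sketch below is an assumption about how this could be done, not necessarily how the features were selected here; treating an integer class label as a numeric variable is only a heuristic.

```python
from sklearn.datasets import load_wine

# Features and the "target" column in one DataFrame.
df = load_wine(as_frame=True).frame

# Absolute Pearson correlation of every feature with the class label,
# sorted from strongest to weakest.
corr_with_target = (
    df.corr()["target"]
    .drop("target")
    .abs()
    .sort_values(ascending=False)
)
print(corr_with_target.head(5))
```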

Data Preprocessing

Here's how we'll prepare the data for training:

  1. Data Scaling: Unlike decision trees or random forests, neural networks need scaled input data to perform well. Scaling improves numerical stability, speeds up convergence, and makes the result independent of the units in which each feature is measured. Always scale your data before passing it through a neural network;
  2. One-Hot Encoding: Our target values comprise three classes, represented in a single column by the numbers 0, 1, and 2. A neural network works better when these classes are encoded as three separate columns, one per class;
  3. Train-Test Data Split: Using the same dataset for both training and testing won't give us a realistic measure of the model's performance on new, unseen data.

If you feel you've overlooked aspects of data preparation, consider enrolling in the ML Introduction with scikit-learn course to enhance your understanding of this subject.
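To make the first two steps concrete, here is a small NumPy sketch on made-up numbers (the arrays are illustrative and are not taken from the wine dataset). It standardizes each column to zero mean and unit variance, which is one common form of scaling, and one-hot encodes a label column by indexing into an identity matrix.

```python
import numpy as np

# Toy feature matrix: two columns on very different scales (made-up values).
X = np.array([[1.0, 1000.0],
              [2.0, 1500.0],
              [3.0,  500.0]])

# Standardization: subtract each column's mean, divide by its standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Class labels 0, 1, 2 stored in a single column.
y = np.array([0, 2, 1, 2])

# One-hot encoding: row i of the 3x3 identity matrix is the code for class i.
y_onehot = np.eye(3)[y]

print(X_scaled)
print(y_onehot)
```

After scaling, both columns have comparable magnitudes, so neither dominates the weighted sums inside the network purely because of its units.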

Task

Prepare the wine dataset to work with our neural network:

  1. Extract input values from the dataset.
  2. Scale input values.
  3. Split data into train and test sets (40% of data will be used as test data).
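The three steps could be sketched with scikit-learn as follows. This is one possible approach under stated assumptions, not the course's reference solution: `StandardScaler` is assumed as the scaling method, and `random_state=42` is an arbitrary choice made only for reproducibility.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

wine = load_wine(as_frame=True)

# 1. Extract the three input features and the class labels.
features = ["flavanoids", "proline", "total_phenols"]
X = wine.data[features].to_numpy()
y = wine.target.to_numpy()

# 2. Scale the inputs to zero mean and unit variance (assumed scaling method).
X_scaled = StandardScaler().fit_transform(X)

# 3. Hold out 40% of the samples as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.4, random_state=42
)

print(X_train.shape, X_test.shape)
```

This follows the order of the steps listed above. A common alternative is to split first and fit the scaler on the training set only, so that no statistics from the test data influence preprocessing.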
