Data Preprocessing | Neural Network from Scratch
Introduction to Neural Networks
Data Preprocessing

Wine Dataset

Now we will train our model on a more realistic task. The scikit-learn library includes a wine dataset, which we will use to predict the wine class from 3 input parameters.

Here is what the dataset looks like:

    import pandas as pd  # Used to build DataFrames from the loaded dataset
    from sklearn.datasets import load_wine  # Dataset loading function

    wine_ds = load_wine()  # Load the dataset
    # Extract the three input features from the dataset
    X = pd.DataFrame(wine_ds.data, columns=wine_ds.feature_names)[['flavanoids', 'proline', 'total_phenols']]
    # Extract the target values from the dataset
    y = pd.DataFrame(wine_ds.target, columns=['target'])

    # Show the data (print is used so the snippet also runs outside a notebook)
    print(X.head())  # `X` holds the input values used to predict the target
    print(y.value_counts())  # `y` is the target we want to predict; it has 3 classes

To train our model, we'll use three input parameters: flavanoids, proline, and total_phenols. For now, we have chosen these parameters because they are among the features with the highest correlation with the wine class. Using only a few informative features reduces the size of the neural network required for successful training and shortens the training process.
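As a rough way to check which features correlate with the wine class, the sketch below ranks every feature by its absolute Pearson correlation with the class label. This is an illustration only: the lesson does not say how the correlation was measured, and correlating against an integer-coded class label is a heuristic. The names `features`, `target`, and `correlations` are introduced here for the example.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine_ds = load_wine()
features = pd.DataFrame(wine_ds.data, columns=wine_ds.feature_names)
target = pd.Series(wine_ds.target, name="target")

# Absolute Pearson correlation of each feature with the integer class label
# (assumption: this approximates the "correlation" the lesson refers to)
correlations = features.corrwith(target).abs().sort_values(ascending=False)

# Show the five most strongly correlated features
print(correlations.head(5))
```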

Data Preprocessing

Here's how we'll prepare the data for training:

  1. Data Scaling: Unlike decision trees or random forests, neural networks generally need scaled input data to perform well. Scaling improves numerical stability, speeds up convergence, and makes the model independent of the units in which features are measured. Always scale your data before passing it through a neural network;

  2. One-Hot Encoding: Our target values comprise three classes, stored in a single column as the numbers 0, 1, and 2. These numbers are labels, not quantities, so we encode them as three separate columns, one per class, which generally works better for neural network training;

  3. Train-Test Data Split: Using the same dataset for both training and testing won't give us a realistic measure of the model's performance on new, unseen data.
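To make the three steps above concrete, here is a minimal NumPy sketch on made-up toy data. It assumes standardization (zero mean, unit variance) as the scaling method, which is one common choice; the lesson does not state which scaler it uses. All names and values (`X_toy`, `y_toy`, the 25% test share) are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features on very different scales
X_toy = np.array([[1.0, 1000.0],
                  [2.0, 1500.0],
                  [3.0,  500.0],
                  [4.0, 1000.0]])
y_toy = np.array([0, 2, 1, 2])  # integer class labels

# 1. Scaling (standardization): each column gets zero mean and unit variance
X_scaled = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

# 2. One-hot encoding: row i of the identity matrix encodes class i
y_onehot = np.eye(3)[y_toy]

# 3. Train-test split: shuffle the indices and hold out 25% for testing
rng = np.random.default_rng(0)
idx = rng.permutation(len(X_toy))
n_test = len(X_toy) // 4
test_idx, train_idx = idx[:n_test], idx[n_test:]
X_train, X_test = X_scaled[train_idx], X_scaled[test_idx]
y_train, y_test = y_onehot[train_idx], y_onehot[test_idx]

print(X_scaled)
print(y_onehot)
```

After scaling, both columns live on a comparable range even though the raw values differed by orders of magnitude, and each label becomes a row with a single 1 in the position of its class.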

Task

Prepare the wine dataset to work with our neural network:

  1. Extract input values from the dataset.
  2. Scale input values.
  3. Split data into train and test sets (40% of data will be used as test data).
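One possible way to approach these steps with scikit-learn is sketched below. It is not the course's reference solution: the choice of `StandardScaler`, the `random_state` value, and the one-hot encoding via `np.eye` are assumptions made for illustration. The one-hot step is included because the lesson describes it, even though the task list does not mention it.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

wine_ds = load_wine()

# 1. Extract the three input features used in this lesson
X = pd.DataFrame(wine_ds.data, columns=wine_ds.feature_names)[
    ['flavanoids', 'proline', 'total_phenols']
]

# One-hot encode the 3 target classes (assumed encoding method)
y = np.eye(3)[wine_ds.target]

# 2. Scale the input values (StandardScaler is an assumed choice)
X_scaled = StandardScaler().fit_transform(X)

# 3. Hold out 40% of the data for testing (random_state is arbitrary)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.4, random_state=10
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

This follows the order given in the task (scale, then split). In practice, a scaler is often fitted on the training portion only and then applied to the test portion, so that no statistics from the test data influence preprocessing.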

