Training Set

If we talk about supervised or unsupervised learning, the training set will usually be in a table form.

Consider the diabetes dataset, which has the task of predicting whether a person has diabetes. It holds information about 768 females with parameters like age, body mass index, blood pressure, etc. These parameters are called features.

The dataset also contains information on whether the person has diabetes in an 'Outcome' column, which is what we want to predict. It is called target.

Each row in a table is called instance(or data point or sample). In this case, it is information about one female.

The table (training set) has a target column in it, which means it is labeled.

The task is to train the ML model on this training set, and once it is trained, it can predict for other people (new instances) whether they have diabetes based on features only.

While coding, feature columns are usually assigned to X and target columns assigned as y.

And features of new instances are assigned as X_new.

Все було зрозуміло?

Дякуємо за ваш відгук!

Секція 1. Розділ 3

Запитати АІ

Запитайте про що завгодно або спробуйте одне із запропонованих запитань, щоб почати наш чат

Зміст курсу

ML Introduction with scikit-learn

1. Machine Learning Concepts

What is ML Types of Machine Learning Training Set Types of Data Machine Learning Workflow

2. Preprocessing Data with Scikit-learn

3. Pipelines

What is Pipeline ColumnTransformer Efficient Data Preprocessing with Pipelines Challenge: Creating a Pipeline Final Estimator Challenge: Creating a Complete ML Pipeline

4. Modeling