A company active in Big Data and Data Science offers courses to train potential future employees. Many people sign up but, after finishing the courses, they often leave and look for a job elsewhere. Creating and running these courses costs the company time and money. For that reason, in this project we will predict the probability that a candidate will look for a new employer after completing the course.

The data that will be used can be found at the <a 
href="https://www.kaggle.com/datasets/arashnic/hr-analytics-job-change-of-data-scientists"
target="_blank"
style="color: #ff8a00; font-weight: 600; text-decoration: none; transition: 0.3s ease-in-out;"
onmouseover="this.style.color='#ff0000'"
onmouseout="this.style.color='#d27204'">following link</a>. A couple of remarks that will be useful during our analysis:
- The dataset is imbalanced;
- Most features are categorical (Nominal, Ordinal, Binary), some with high cardinality;
- Imputing missing values can be part of the pipeline as well.
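
To make these three remarks concrete, here is a minimal sketch of how they could be handled together in one scikit-learn pipeline. It is an illustration under assumptions, not the project's actual code: the toy rows, the column names (`city`, `education_level`) and the variable names are made up for the example, every feature is one-hot encoded for simplicity (ordinal features could instead use an ordinal encoding), and class weighting is only one of several ways to deal with imbalance.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the real dataset (values and column names are assumptions).
# np.nan marks missing entries in the categorical columns.
X = pd.DataFrame(
    {
        "city": ["city_1", "city_2", np.nan, "city_1",
                 "city_3", "city_2", "city_1", "city_3"],
        "education_level": ["Graduate", "Masters", "Graduate", np.nan,
                            "Phd", "Graduate", "Masters", "Graduate"],
    },
    dtype=object,
)
# Imbalanced target: 1 = looks for a new job, 0 = does not.
y = [0, 0, 0, 0, 0, 0, 1, 1]

model = Pipeline(
    steps=[
        # Fill missing categories with the most frequent value of each column.
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Turn categories into indicator columns; unseen categories are ignored.
        ("encode", OneHotEncoder(handle_unknown="ignore")),
        # Reweight classes inversely to their frequency to counter the imbalance.
        ("classify", LogisticRegression(class_weight="balanced", max_iter=1000)),
    ]
)

model.fit(X, y)

# Column 1 of predict_proba holds the estimated probability of class 1.
leave_probability = model.predict_proba(X)[:, 1]
```

Keeping imputation and encoding inside the pipeline means they are fitted only on the training data whenever the model is cross-validated, which avoids leaking information from the validation folds.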

In this project, we are going to understand the career tracks of Data Scientists.


In this project, we are going to understand Logistic Regression.