Exploratory Data Analysis

In this chapter, we will explore some features of our data. Specifically, we will see how the values of the target variable are distributed across the categories of each of the following features: "gender", "relevent_experience", "enrolled_university", "education_level", "major_discipline", "experience", "company_size", "company_type".
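To make "distribution of the target within a feature" concrete, here is a minimal sketch that tabulates the same counts a count plot would draw as bars. The toy DataFrame below is a hypothetical stand-in, not the course dataset; only the column names "gender" and "target" come from the text above.

```python
import pandas as pd

# Hypothetical toy data: a stand-in for the course DataFrame, which is not shown here.
data = pd.DataFrame({
    "gender": ["Male", "Female", "Male", "Male", "Female", "Other"],
    "target": [0, 1, 1, 0, 0, 1],
})

# Rows are the categories of the feature, columns are the target values,
# and each cell holds the number of rows with that combination.
counts = pd.crosstab(data["gender"], data["target"])
print(counts)
```

Each row of the table corresponds to one group of bars in a count plot, and each column to one hue colour.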

Methods description

import matplotlib.pyplot as plt: Imports the pyplot module from the matplotlib library under the alias plt. pyplot provides a MATLAB-like interface for creating plots and visualizations in Python;
import seaborn as sns: Imports the seaborn library under the alias sns. Seaborn is a Python visualization library built on matplotlib that provides a high-level interface for drawing statistical graphics;
plt.figure(figsize=[15, 18]): Creates a new figure object with a specified figure size of 15 inches in width and 18 inches in height;
features = [...]: Defines a list of feature names;
plt.subplot(5, 2, n): Divides the figure into a grid of 5 rows and 2 columns, then selects the subplot at position n;
sns.countplot(...): Draws a count plot for the specified feature (x=f): one bar per category showing the number of rows in it, split into separate bars by the hue variable (here, "target"), using the DataFrame named data;
plt.title(...): Sets the title for the subplot with the name of the feature;
plt.tight_layout(): Adjusts the subplot layout to make sure the plot elements fit within the figure area properly;
plt.show(): Displays the plot.
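The calls described above can be combined as in the following sketch. It is an illustration under stated assumptions, not the course's reference solution: the toy DataFrame and the helper name plot_feature_counts are hypothetical, and only two of the eight features are included to keep the example short. The 5-by-2 grid and the figure size follow the description above.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the course DataFrame (assumption).
data = pd.DataFrame({
    "gender": ["Male", "Female", "Male", "Other", "Female", "Male"],
    "relevent_experience": [
        "Has relevent experience", "No relevent experience",
        "Has relevent experience", "No relevent experience",
        "Has relevent experience", "No relevent experience",
    ],
    "target": [0, 1, 0, 1, 0, 1],
})

# Shortened list; the chapter uses eight feature names.
features = ["gender", "relevent_experience"]


def plot_feature_counts(data, features, hue="target"):
    """Draw one count plot per feature on a 5x2 grid (hypothetical helper)."""
    fig = plt.figure(figsize=[15, 18])
    # Subplot positions start at 1, so enumerate from 1.
    for n, f in enumerate(features, start=1):
        plt.subplot(5, 2, n)
        sns.countplot(x=f, hue=hue, data=data)
        plt.title(f)
    plt.tight_layout()
    return fig


fig = plot_feature_counts(data, features)
plt.show()
```

A 5-by-2 grid has ten slots, so all eight features from the chapter fit with two slots left empty.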

Task

Import matplotlib.pyplot (as plt) and seaborn (as sns).
Draw a count plot, split by "target", for each of the following features: "gender", "relevent_experience", "enrolled_university", "education_level", "major_discipline", "experience", "company_size", "company_type".
