Data Visualization & EDA

Bivariate Analysis


Bivariate analysis is an essential step in exploratory data analysis (EDA) that focuses on examining the relationship between two variables. This process helps you uncover patterns, trends, or associations that may not be visible when looking at variables individually. By analyzing two variables together, you can identify whether changes in one variable are associated with changes in another, which is crucial for hypothesis generation, feature selection, and deeper understanding of your dataset.

import pandas as pd

# Sample DataFrame
data = {
    "age": [22, 25, 47, 52, 46, 56, 55, 60, 62, 61],
    "salary": [25000, 32000, 47000, 52000, 48000, 60000, 58000, 62000, 63000, 64000],
    "department": ["HR", "Finance", "HR", "Engineering", "Engineering", "Finance", "HR", "Engineering", "Finance", "HR"]
}
df = pd.DataFrame(data)

# Select two relevant columns for analysis
age = df["age"]
salary = df["salary"]

# Compute the correlation coefficient between age and salary (Pearson by default)
correlation = age.corr(salary)
print("Correlation between age and salary:", correlation)
import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plot using seaborn
sns.scatterplot(x="age", y="salary", data=df)
plt.title("Seaborn Scatter Plot of Age vs Salary")
plt.show()

The correlation coefficient ranges from -1 to 1. Values close to 1 indicate a strong positive linear relationship: as one variable increases, the other tends to increase as well. Values close to -1 indicate a strong negative linear relationship, where one variable tends to decrease as the other increases. Values near 0 suggest little or no linear relationship, although a non-linear relationship may still be present. The scatter plot supports this interpretation visually: points that follow a clear upward or downward trend reflect a strong correlation, while a cloud of points with no discernible pattern indicates a weak relationship or none at all.
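To make these three cases concrete, here is a minimal sketch on synthetic data. The series `linear_up`, `linear_down`, and `curved` are illustrative assumptions, not part of the lesson's dataset. It uses the same `Series.corr` call as above and shows that a coefficient near 0 only rules out a linear relationship.

```python
import numpy as np
import pandas as pd

# Hypothetical data: x is symmetric around 0, so the falling and rising
# halves of the parabola below cancel out in a linear measure
x = pd.Series(np.arange(-5, 6), dtype=float)

linear_up = 2 * x + 1      # perfectly linear, increasing
linear_down = -3 * x + 4   # perfectly linear, decreasing
curved = x ** 2            # fully determined by x, but not linear

# Series.corr computes the Pearson correlation coefficient by default
r_up = x.corr(linear_up)
r_down = x.corr(linear_down)
r_curved = x.corr(curved)

print(f"increasing line: r = {r_up:.2f}")
print(f"decreasing line: r = {r_down:.2f}")
print(f"parabola:        r = {r_curved:.2f}")
```

The parabola is completely determined by `x`, yet its coefficient is essentially zero, which is why it helps to look at a scatter plot alongside the number.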

# Boxplot to compare salary distribution across departments
plt.figure(figsize=(6, 4))
sns.boxplot(x="department", y="salary", data=df)
plt.title("Salary Distribution by Department")
plt.xlabel("Department")
plt.ylabel("Salary")
plt.show()
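A box plot can be easier to read next to the numbers it summarizes. As a minimal sketch, assuming the same sample data as above, grouping by department and calling `describe()` gives the quartiles and range that each box represents. The variable name `summary` is only illustrative.

```python
import pandas as pd

# Same sample salaries and departments as in the lesson's DataFrame
df = pd.DataFrame({
    "salary": [25000, 32000, 47000, 52000, 48000, 60000, 58000, 62000, 63000, 64000],
    "department": ["HR", "Finance", "HR", "Engineering", "Engineering",
                   "Finance", "HR", "Engineering", "Finance", "HR"],
})

# Per-department summary statistics for salary
summary = df.groupby("department")["salary"].describe()

# Keep the columns that correspond to the parts of a box plot
print(summary[["count", "min", "25%", "50%", "75%", "max"]])
```

With only three or four rows per department, both the summary and the box plot are illustrative rather than statistically meaningful.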

What is the main purpose of bivariate analysis in exploratory data analysis (EDA)?
