Data Visualization & EDA

Bivariate Analysis

Bivariate analysis is an essential step in exploratory data analysis (EDA) that examines the relationship between two variables. It helps you uncover patterns, trends, or associations that are not visible when you look at each variable on its own. By analyzing two variables together, you can see whether changes in one are associated with changes in the other, which is crucial for hypothesis generation, feature selection, and a deeper understanding of your dataset.

    import pandas as pd

    # Sample DataFrame
    data = {
        "age": [22, 25, 47, 52, 46, 56, 55, 60, 62, 61],
        "salary": [25000, 32000, 47000, 52000, 48000, 60000, 58000, 62000, 63000, 64000],
        "department": ["HR", "Finance", "HR", "Engineering", "Engineering", "Finance", "HR", "Engineering", "Finance", "HR"]
    }
    df = pd.DataFrame(data)

    # Select two relevant columns for analysis
    age = df["age"]
    salary = df["salary"]

    # Compute the correlation coefficient between age and salary
    correlation = age.corr(salary)
    print("Correlation between age and salary:", correlation)

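
To make the correlation computation less of a black box, here is a minimal sketch that recomputes the coefficient by hand and compares it with the pandas result. It assumes the default behaviour of `Series.corr`, which is the Pearson coefficient; the names `age_dev`, `salary_dev`, `manual_r`, and `pandas_r` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same sample values as in the lesson
age = pd.Series([22, 25, 47, 52, 46, 56, 55, 60, 62, 61])
salary = pd.Series([25000, 32000, 47000, 52000, 48000, 60000, 58000, 62000, 63000, 64000])

# Deviations of each value from its column mean
age_dev = age - age.mean()
salary_dev = salary - salary.mean()

# Pearson's r: sum of the products of deviations, divided by the
# square root of the product of the summed squared deviations
manual_r = (age_dev * salary_dev).sum() / np.sqrt(
    (age_dev ** 2).sum() * (salary_dev ** 2).sum()
)

# pandas result for comparison (Pearson is the default method)
pandas_r = age.corr(salary)

print("Manual:", manual_r)
print("pandas:", pandas_r)
```

Both values should agree up to floating-point rounding, which shows that the coefficient measures how closely the paired deviations from the mean move together.
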
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Scatter plot using seaborn
    sns.scatterplot(x="age", y="salary", data=df)
    plt.title("Seaborn Scatter Plot of Age vs Salary")
    plt.show()

The correlation coefficient ranges from -1 to 1. Values close to 1 indicate a strong positive relationship: as one variable increases, the other tends to increase as well. Values close to -1 indicate a strong negative relationship, where one variable tends to decrease as the other increases. Values near 0 suggest little or no linear relationship. The scatter plot supports this interpretation visually: points that follow a clear upward or downward trend reflect a strong correlation, while a cloud of points with no discernible pattern indicates a weak relationship or none at all.
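
The three cases can be illustrated with a small sketch on synthetic data. All values and variable names below are invented for illustration and are not part of the lesson's dataset. The third series is deliberately U-shaped to show that a coefficient near 0 rules out a linear relationship, not every kind of relationship.

```python
import pandas as pd

# Synthetic data, invented for illustration only
x = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)

increasing = 2 * x + 1        # perfectly linear, rises with x
decreasing = -3 * x + 50      # perfectly linear, falls as x rises
u_shaped = (x - 5.5) ** 2     # depends on x, but symmetrically around its mean

r_pos = x.corr(increasing)    # close to 1
r_neg = x.corr(decreasing)    # close to -1
r_zero = x.corr(u_shaped)     # close to 0 despite the clear pattern

print("Increasing:", r_pos)
print("Decreasing:", r_neg)
print("U-shaped:  ", r_zero)
```

This is why the scatter plot is worth checking alongside the coefficient: the U-shaped pattern would be obvious in a plot even though the coefficient is close to 0.
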

    # Boxplot to compare salary distribution across departments
    plt.figure(figsize=(6, 4))
    sns.boxplot(x="department", y="salary", data=df)
    plt.title("Salary Distribution by Department")
    plt.xlabel("Department")
    plt.ylabel("Salary")
    plt.show()
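
A boxplot compares a numeric variable across the groups of a categorical one. As a numeric counterpart, the sketch below tabulates a few per-department salary statistics with `groupby`. It is only one possible way to summarize the same sample data, and the name `summary` is introduced here for illustration.

```python
import pandas as pd

# Same sample values as in the lesson (only the two columns needed here)
df = pd.DataFrame({
    "salary": [25000, 32000, 47000, 52000, 48000, 60000, 58000, 62000, 63000, 64000],
    "department": ["HR", "Finance", "HR", "Engineering", "Engineering",
                   "Finance", "HR", "Engineering", "Finance", "HR"],
})

# Per-department row count, median, minimum, and maximum salary
summary = df.groupby("department")["salary"].agg(["count", "median", "min", "max"])
print(summary)
```

The median column corresponds to the line inside each box. With only three or four rows per department in this sample, the differences between groups should be read as illustrative rather than conclusive.
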
What is the main purpose of bivariate analysis in exploratory data analysis (EDA)?
