Advanced Techniques in pandas
Finding the Correlation
Finally, let's move to the last method of this section: .corr(). It helps you find relationships between numerical columns. Imagine that you have a dataset on houses:
Let's examine the output of data.corr() in our case:
Let's go step by step. The row labels and the column labels are the same columns of the dataset, and each cell where a row and a column meet holds the correlation for that pair. Every value lies between -1 and 1.
- 1 means a perfect positive linear relationship (when one value increases, the other increases too);
- -1 means a perfect negative linear relationship (when one value increases, the other decreases);
- 0 means there is no linear relationship between the two values.
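The three cases above can be sketched on a tiny DataFrame. The column names and numbers here are invented purely for illustration and are not the lesson's houses dataset; .corr() is called with its default settings, which compute the Pearson correlation.

```python
import pandas as pd

# Hypothetical data, made up for illustration only
data = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "double_x": [2, 4, 6, 8, 10],     # rises whenever x rises
    "minus_x": [-1, -2, -3, -4, -5],  # falls whenever x rises
    "u_shape": [4, 1, 0, 1, 4],       # depends on x, but not linearly
})

# Pairwise correlation of every column with every other column
corr_matrix = data.corr()
print(corr_matrix.round(2))
```

In the "x" row, "double_x" comes out as 1, "minus_x" as -1, and "u_shape" as 0. The last column is fully determined by "x", yet its correlation is 0, so a value of 0 rules out only a linear relationship, not every kind of dependence.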
Note
If the dataset contains non-numeric columns, as the cars.csv dataset used in the task does, set the argument numeric_only=True to compute the correlation using only the numeric columns.
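As a rough sketch of the note above, the small DataFrame below stands in for a dataset with a text column. Its column names and values are assumptions made for the example and do not come from cars.csv. The numeric_only parameter is assumed to be available, which holds for recent pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for a dataset that mixes text and numbers
cars = pd.DataFrame({
    "model": ["A", "B", "C", "D"],          # non-numeric column
    "engine_size": [1.0, 1.6, 2.0, 3.0],
    "price": [10000, 15000, 21000, 30000],
})

# Skip the text column and correlate only the numeric ones
numeric_corr = cars.corr(numeric_only=True)
print(numeric_corr)
```

The resulting matrix contains only "engine_size" and "price"; the "model" column is left out of the calculation.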
You'll end this section with an effortless task: apply the .corr() method to the dataset. Then, try to analyze the numbers you get.