Advanced Techniques in pandas

Find the Correlation

Finally, let's look at the last method of this section, .corr(). It measures how strongly the numerical columns of a DataFrame are related to each other. Imagine that you have a dataset on houses:

Price USD   Number of Rooms   Distance from the City Center in km
329000      4                 25
8739000     6                 3
1268000     6                 2
987000      4                 10
103000      2                 30

Let's examine the output of data.corr() for this dataset:

                                      Price USD   Number of Rooms   Distance from the City Center in km
Price USD                              1.000000          0.625651                             -0.589396
Number of Rooms                        0.625651          1.000000                             -0.908600
Distance from the City Center in km   -0.589396         -0.908600                              1.000000
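The output above can be reproduced with a short sketch. It assumes the table is loaded into a DataFrame named data (the name used in the text); the variable corr_matrix is only an illustrative name, and the values are typed in by hand from the example table.

```python
import pandas as pd

# House data copied from the example table above
data = pd.DataFrame({
    "Price USD": [329000, 8739000, 1268000, 987000, 103000],
    "Number of Rooms": [4, 6, 6, 4, 2],
    "Distance from the City Center in km": [25, 3, 2, 10, 30],
})

# .corr() compares every numerical column with every other one
# (Pearson correlation is the default method)
corr_matrix = data.corr()
print(corr_matrix)

# A single pair can be read by row label and column label
rooms_vs_distance = corr_matrix.loc[
    "Number of Rooms", "Distance from the City Center in km"
]
print(rooms_vs_distance)
```

The value for the number of rooms and the distance from the center is close to -0.91, which suggests that in this small sample the houses farther from the center tend to have fewer rooms.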

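Let's go through it step by step. The column names appear both as row labels and as column headers, and the cell where a row and a column meet holds the correlation coefficient for that pair of columns. Each coefficient lies between -1 and 1, and the diagonal is always 1 because every column is perfectly correlated with itself.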

  • 1 means a perfect positive linear relationship (if one value increases, the other increases too);
  • -1 means a perfect negative linear relationship (if one value increases, the other decreases);
  • 0 means there is no linear relationship between the two values.
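The three boundary cases can be illustrated with a small sketch. The series below (x, rising, falling, unrelated) are made-up values chosen for illustration and are not part of the houses dataset; Series.corr() is used here to compare one pair at a time.

```python
import pandas as pd

# Made-up values, chosen only to illustrate the three cases
x = pd.Series([1, 2, 3, 4, 5])
rising = 2 * x + 1                        # grows in step with x
falling = -3 * x                          # shrinks as x grows
unrelated = pd.Series([1, -1, 0, -1, 1])  # no linear trend against x

# Series.corr() returns the coefficient for a single pair
r_rising = x.corr(rising)        # approximately 1
r_falling = x.corr(falling)      # approximately -1
r_unrelated = x.corr(unrelated)  # approximately 0

print(r_rising, r_falling, r_unrelated)
```

Because of floating-point rounding, the printed values may differ from 1, -1 and 0 in the last decimal places. Real data rarely hits these extremes; values in between indicate a weaker or stronger tendency in the same direction.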

Task

You'll end this section with a simple task: apply the .corr() method to the dataset. Then, try to interpret the numbers you get.

Section 3. Chapter 7