Correlation Analysis | Basic Statistical Analysis
Data Analysis with R

Correlation Analysis

Correlation analysis is a statistical technique used to measure the strength and direction of a relationship between two numeric variables. It helps us understand how changes in one variable are associated with changes in another.

What Is Correlation?

A correlation coefficient (usually denoted r) ranges from -1 to 1. The key reference values are:

  • 1: perfect positive correlation;
  • 0: no linear correlation;
  • -1: perfect negative correlation.
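
To make these three reference values concrete, here is a minimal sketch using small invented vectors. The names x, y_pos, y_neg, and y_sym are hypothetical and are not part of the course dataset.

```r
# Illustrative vectors only; these values are invented for the example
x <- c(-2, -1, 0, 1, 2)

y_pos <- 2 * x + 3   # increases exactly linearly with x
y_neg <- -x          # decreases exactly linearly with x
y_sym <- x^2         # depends on x, but not linearly

r_pos <- cor(x, y_pos)  # perfect positive correlation
r_neg <- cor(x, y_neg)  # perfect negative correlation
r_sym <- cor(x, y_sym)  # no linear correlation

print(c(r_pos, r_neg, r_sym))
```

The last case suggests a useful caution: a coefficient near 0 rules out a linear relationship, not every kind of relationship.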

There are several correlation methods, but Pearson correlation is the most commonly used for continuous numeric data, and it is the default method of the cor() function in R.
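
As a rough illustration of how the methods can differ, the sketch below compares Pearson with the two rank-based alternatives that cor() also accepts through its method argument. The data are invented for the example.

```r
# Made-up data: y grows monotonically with x, but along a curve
x <- c(1, 2, 3, 4, 5)
y <- x^2

pearson  <- cor(x, y)                       # default: method = "pearson"
spearman <- cor(x, y, method = "spearman")  # based on ranks
kendall  <- cor(x, y, method = "kendall")   # based on concordant pairs

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Here Pearson comes out slightly below 1 (about 0.98) because the points do not lie on a straight line, while the rank-based methods return 1 because the ordering of x and y matches exactly.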

Correlation Between Two Variables

Use the cor() function to compute the correlation coefficient between two variables. Pass the two numeric columns as its first two arguments.

cor(df$selling_price, df$km_driven)

The function returns a single number between -1 and 1.
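
The call above assumes df is already loaded. The sketch below builds a small hypothetical stand-in for it (all values are invented) and also shows what happens when a column contains a missing value, which is a common reason for an unexpected NA result.

```r
# Hypothetical stand-in for the course's df; all values are invented
df <- data.frame(
  selling_price = c(450000, 370000, 158000, 225000, 130000, 440000),
  km_driven     = c(145500, 120000, 140000, 127000, 120000, NA)
)

# With the default use = "everything", a missing value makes the result NA
r_default <- cor(df$selling_price, df$km_driven)

# Keep only the rows where both columns are present
r_complete <- cor(df$selling_price, df$km_driven, use = "complete.obs")

print(r_default)
print(r_complete)
```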

Correlation Matrix (Multiple Variables)

The same function can be used to examine relationships between multiple variables.

# Select only numeric columns
numeric_df <- df[, c("selling_price", "km_driven", "max_power", "mileage", "engine", "seats")]
# Compute correlation matrix
cor_matrix <- cor(numeric_df, use = "complete.obs")  # Ignores any rows with missing data

The result is stored as a matrix that shows pairwise correlation values between all selected numeric variables.
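
To see what such a matrix looks like without the course data, here is a sketch that uses mtcars, a dataset that ships with R, as a stand-in for df. Selecting numeric columns with sapply() and is.numeric is one possible alternative to listing them by name; it is an assumption of this example, not the course's approach.

```r
# mtcars ships with R; it stands in here for the course's df
numeric_cols <- sapply(mtcars, is.numeric)   # TRUE for each numeric column
cor_matrix <- cor(mtcars[, numeric_cols], use = "complete.obs")

# Round for readability and look at a few variables
vars <- c("mpg", "hp", "wt")
print(round(cor_matrix[vars, vars], 2))

# Look up a single pair by row and column name
mpg_wt <- cor_matrix["mpg", "wt"]
print(mpg_wt)
```

The matrix is symmetric, and every diagonal entry is 1 because each variable is perfectly correlated with itself.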

Check your understanding: what does a correlation coefficient of -0.9 indicate?
