Correlation Analysis
Correlation analysis is a statistical technique used to measure the strength and direction of a relationship between two numeric variables. It helps us understand how changes in one variable are associated with changes in another.
What Is Correlation?
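A correlation coefficient (usually denoted r) ranges from −1 to 1. Its value is interpreted as follows:
- r = 1: perfect positive correlation;
- r = 0: no linear correlation;
- r = −1: perfect negative correlation.
Values in between indicate weaker relationships: the closer r is to 0, the weaker the linear association.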
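To make these three reference values concrete, here is a minimal sketch using small made-up vectors (x, y_pos, y_neg, and y_curve are illustrative names, not part of the car dataset used in this chapter). Note that the last case gives a coefficient of 0 even though y_curve is fully determined by x, because the coefficient only captures linear association.

```r
# Hypothetical toy data, chosen only for illustration
x <- c(-2, -1, 0, 1, 2)

y_pos   <- 2 * x + 3   # increases linearly with x
y_neg   <- -x          # decreases linearly with x
y_curve <- x^2         # symmetric curve with no linear trend

r_pos   <- cor(x, y_pos)    # 1 (up to floating-point rounding)
r_neg   <- cor(x, y_neg)    # -1 (up to floating-point rounding)
r_curve <- cor(x, y_curve)  # 0: no linear association

print(c(r_pos, r_neg, r_curve))
```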
There are several correlation methods (Pearson, Spearman, and Kendall are the ones built into R). Pearson correlation is the most commonly used for continuous numeric data, and it is the default method of R's cor() function.
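As a rough illustration of what Pearson correlation computes, the sketch below calculates r by hand (covariation of the two variables divided by the product of their spreads) and compares it with cor(). It also shows the method argument for switching to Spearman or Kendall. The vectors x and y are made-up values, not taken from the car dataset.

```r
# Hypothetical example values
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Pearson r computed from its definition
dx <- x - mean(x)
dy <- y - mean(y)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalents; "pearson" is the default method
r_pearson  <- cor(x, y)
r_spearman <- cor(x, y, method = "spearman")  # rank-based
r_kendall  <- cor(x, y, method = "kendall")   # rank-based

print(c(manual = r_manual, pearson = r_pearson,
        spearman = r_spearman, kendall = r_kendall))
```

Spearman and Kendall work on ranks rather than raw values, so they can be a reasonable choice when the relationship is monotonic but not linear, or when outliers are a concern.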
Correlation Between Two Variables
You can use the cor() function to compute the correlation coefficient between two variables. Pass the two numeric columns as its first two arguments.
cor(df$selling_price, df$km_driven)
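The function returns a single value between -1 and 1. Here, a negative value would mean that cars with more kilometers driven tend to have lower selling prices.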
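One practical detail is missing values: with its default settings, cor() returns NA if either column contains an NA. The sketch below shows this on a small made-up data frame (df_demo and its values are hypothetical; only the column names mirror the ones used above) and how the use argument works around it.

```r
# Hypothetical data frame with one missing selling price
df_demo <- data.frame(
  selling_price = c(450000, 370000, 158000, 225000, 130000, NA),
  km_driven     = c(60000, 80000, 140000, 127000, 150000, 90000)
)

# Default behaviour: any NA in the inputs makes the result NA
r_default <- cor(df_demo$selling_price, df_demo$km_driven)

# Drop incomplete rows before computing the coefficient
r_complete <- cor(df_demo$selling_price, df_demo$km_driven,
                  use = "complete.obs")

print(r_default)
print(r_complete)
```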
Correlation Matrix (Multiple Variables)
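The same function can examine relationships between multiple variables: when given a data frame or matrix of numeric columns, cor() computes the correlation for every pair of columns.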
# Select only numeric columns
numeric_df <- df[, c("selling_price", "km_driven", "max_power", "mileage", "engine", "seats")]
# Compute correlation matrix
cor_matrix <- cor(numeric_df, use = "complete.obs") # Ignores any rows with missing data
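The result is stored as a square matrix of pairwise correlation values between all selected numeric variables. The matrix is symmetric, and every diagonal entry is 1 because each variable is perfectly correlated with itself.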
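The following sketch shows how such a matrix can be built and read. Because the car dataset is not reproduced here, it uses R's built-in iris dataset instead, which is an assumption made purely for illustration. It also selects the numeric columns programmatically rather than by name.

```r
# Built-in dataset used as a stand-in for the car data
numeric_cols <- sapply(iris, is.numeric)   # TRUE for numeric columns only
iris_numeric <- iris[, numeric_cols]       # drops the Species factor

# Pairwise correlations between all numeric columns
cor_iris <- cor(iris_numeric, use = "complete.obs")

# Rounding makes the matrix easier to scan
print(round(cor_iris, 2))

# Look up a single pair by row and column name
sepal_petal <- cor_iris["Sepal.Length", "Petal.Length"]
print(sepal_petal)
```

Indexing by row and column name, as in the last step, is usually less error-prone than indexing by position when the set of selected columns may change.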