Data Analysis with R | Data Visualization

Creating Scatter Plots

Why Use Scatter Plots?

A scatter plot is ideal for visualizing relationships between variables. It can be used to:

  • Show relationships between two numerical variables;
  • Detect patterns, clusters, or outliers;
  • Explore correlation (positive/negative/none).
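
To make the three correlation patterns concrete, here is a minimal sketch on simulated data. The data frame `demo`, its columns, and the noise levels are invented for illustration and do not come from the lesson's dataset; `facet_wrap()` is used only to place the three patterns side by side.

```r
library(ggplot2)

set.seed(1)  # make the simulated data reproducible
n <- 100
x <- rnorm(n)

# Hypothetical data: the same x paired with three different y patterns
demo <- data.frame(
  x = rep(x, times = 3),
  y = c(
    x + rnorm(n, sd = 0.5),   # y rises with x: positive correlation
    -x + rnorm(n, sd = 0.5),  # y falls as x rises: negative correlation
    rnorm(n)                  # y ignores x: no correlation
  ),
  pattern = rep(c("positive", "negative", "none"), each = n)
)

# Pearson correlation coefficient for each pattern
cors <- sapply(split(demo, demo$pattern), function(d) cor(d$x, d$y))

# One scatter plot panel per pattern
p_demo <- ggplot(demo, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ pattern)

p_demo
```

The point cloud slopes upward in the positive panel, downward in the negative panel, and shows no direction in the third; the values in `cors` should agree in sign with what the panels show.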

Scatter Plot Syntax in ggplot2

You can create a scatter plot with geom_point(). To do this, map one variable to the x aesthetic and another to the y aesthetic inside aes().

library(ggplot2)  # load the package once per session
ggplot(data = df, aes(x = variable_x, y = variable_y)) +
  geom_point()
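
For a version of this template that runs as-is, the sketch below swaps in the `mtcars` dataset that ships with R. The dataset and the choice of `wt` and `mpg` are assumptions made for illustration, not the lesson's data.

```r
library(ggplot2)

# mtcars is built into R: wt is car weight (in 1000 lbs), mpg is miles per gallon
p_basic <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_basic  # printing the object draws one point per car
```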

To distinguish groups within the data, map a grouping variable to the color aesthetic. ggplot2 then assigns a different color to each group and adds a legend, making patterns easier to spot.

ggplot(data = df, aes(x = variable_x, y = variable_y, color = group_var)) +
  geom_point()
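
As a runnable illustration of the color mapping, the sketch below again uses the built-in `mtcars` dataset (an assumption, not the lesson's data). Because `cyl` is stored as a number, it is wrapped in `factor()` so that ggplot2 treats it as discrete groups rather than a continuous color gradient.

```r
library(ggplot2)

# cyl (number of cylinders) is numeric in mtcars; factor() turns it into
# a grouping variable with one level per distinct value (4, 6, 8)
p_grouped <- ggplot(data = mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(color = "Cylinders")  # readable legend title instead of "factor(cyl)"

p_grouped
```

If the grouping column in your own data is already a character or factor column, the `factor()` call is unnecessary.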

Example: Selling Price vs. Kilometers Driven

A scatter plot can be used to examine how a car's usage relates to its selling price. In this example, the x-axis shows the number of kilometers driven, while the y-axis shows the selling price.

ggplot(df, aes(x = km_driven, y = selling_price)) +
  geom_point() +
  labs(title = "Scatter Plot of Selling Price vs. Kilometers Driven",
       x = "Kilometers Driven",
       y = "Selling Price")

This visualization often highlights depreciation trends: as mileage increases, selling price typically decreases. It can also reveal outliers, such as cars with unusually high prices despite high mileage.
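
The lesson's `df` is not included here, so the sketch below simulates a stand-in to show what the pattern can look like. Every number in it (price level, depreciation rate, noise, and the single planted outlier) is invented for illustration; only the column names `km_driven` and `selling_price` come from the example above.

```r
library(ggplot2)

set.seed(42)  # make the simulated data reproducible

# Hypothetical stand-in for the lesson's df: price falls as mileage rises
n <- 200
km_driven <- runif(n, min = 5000, max = 200000)
selling_price <- 900000 - 3 * km_driven + rnorm(n, sd = 80000)
df <- data.frame(km_driven = km_driven, selling_price = selling_price)

# Plant one artificial outlier: very high mileage but an unusually high price
df <- rbind(df, data.frame(km_driven = 190000, selling_price = 1200000))

p_cars <- ggplot(df, aes(x = km_driven, y = selling_price)) +
  geom_point(alpha = 0.6) +  # slight transparency helps where points overlap
  labs(title = "Scatter Plot of Selling Price vs. Kilometers Driven",
       x = "Kilometers Driven",
       y = "Selling Price")

p_cars

# A negative coefficient confirms the downward trend seen in the plot
price_km_cor <- cor(df$km_driven, df$selling_price)
```

In the resulting plot the planted outlier sits well above the downward-sloping cloud at the right edge, which is the kind of point worth checking in real data.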


Section 2. Chapter 5
