Data Analysis with R

Creating Histograms


Why Use Histograms?

Histograms are used to visualize the distribution of continuous (numerical) data. They show how data is spread across ranges (bins) and help to:

  • Detect skewness, outliers, or gaps;
  • Understand frequency distribution;
  • Quickly assess whether the data looks approximately normally distributed.

They are best used for variables like price, mileage, or age.

Histogram Syntax in ggplot2

You can create a histogram using geom_histogram(), where the x variable must be numeric.

library(ggplot2)
# df is a data frame; variable is a numeric column in df
ggplot(data = df, aes(x = variable)) +
  geom_histogram()

The appearance of the histogram can be customized with geom_histogram() arguments such as bins (number of bins), fill (bar color), and color (border color). Overall styling is controlled separately by adding a theme layer such as theme_minimal().
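To see what bins changes in practice, here is a minimal sketch. It assumes the diamonds dataset that ships with ggplot2 as a stand-in for df, and the object names (p_coarse, p_fine, n_coarse, n_fine) are illustrative only. The same data is drawn with 10 and with 50 bins, and layer_data() is used to inspect the bars ggplot2 computed.

```r
library(ggplot2)

# Assumption: diamonds (bundled with ggplot2) stands in for df;
# price is a numeric column.
p_coarse <- ggplot(data = diamonds, aes(x = price)) +
  geom_histogram(bins = 10, fill = "steelblue", color = "black")

p_fine <- ggplot(data = diamonds, aes(x = price)) +
  geom_histogram(bins = 50, fill = "steelblue", color = "black")

# layer_data() returns the computed histogram data, one row per bar.
bars_coarse <- layer_data(p_coarse)
bars_fine <- layer_data(p_fine)

n_coarse <- nrow(bars_coarse)  # number of bars drawn with bins = 10
n_fine <- nrow(bars_fine)      # number of bars drawn with bins = 50

# Print the plots to compare them visually.
print(p_coarse)
print(p_fine)
```

Fewer bins give a smoother but coarser picture, while more bins show finer detail at the cost of noisier bars. If a fixed bar width is easier to reason about than a bar count, geom_histogram() also accepts binwidth in place of bins.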

Example: Distribution of Selling Prices

A histogram can be used to examine how car prices are distributed across the dataset. In this example, the bars are filled with steel blue and outlined in black, while labels and a minimal theme are added for clarity.

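# df is assumed to contain a numeric selling_price column
ggplot(data = df, aes(x = selling_price)) +
  # bins = 30 is the default; stating it avoids the stat_bin() message
  geom_histogram(bins = 30, fill = "steelblue", color = "black") +
  labs(title = "Distribution of Selling Prices",
       x = "Selling Price (in PKR)",
       y = "Count") +
  theme_minimal()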

This plot reveals the overall shape of the selling price distribution, making it easy to see whether most cars fall within a particular price range or if there are outliers at the high or low end.
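Since the lesson's df is not included here, the following sketch uses simulated prices to illustrate how a skewed distribution shows up in a histogram. The data frame sim and its log-normal parameters are assumptions chosen only to produce a right-skewed shape; they are not taken from the course dataset.

```r
library(ggplot2)

set.seed(42)  # make the simulated data reproducible

# Hypothetical data: 1000 log-normally distributed prices (right-skewed).
sim <- data.frame(selling_price = rlnorm(1000, meanlog = 13, sdlog = 0.8))

p <- ggplot(data = sim, aes(x = selling_price)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "black") +
  labs(title = "Distribution of Simulated Selling Prices",
       x = "Selling Price",
       y = "Count") +
  theme_minimal()

# Numeric summaries to read alongside the plot.
mean_price <- mean(sim$selling_price)
median_price <- median(sim$selling_price)

# The computed bars, one row per bin.
bars <- layer_data(p)

print(p)
```

In a right-skewed histogram like this one, most bars cluster at the low end and a long tail stretches toward high prices, which is also why the mean ends up above the median.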


What does the bins argument in geom_histogram() control?
