Applying Statistical Methods

Descriptive Statistics Fundamentals


Descriptive statistics are essential tools for summarizing and understanding the main features of a dataset. They help you quickly grasp the central tendency, variability, and overall distribution of your data. The most common measures include mean, median, mode, variance, and standard deviation.

  • The mean is the arithmetic average of a dataset. It is sensitive to extreme values (outliers) and gives you a sense of the "center" of your data;
  • The median is the middle value when the data are sorted. It is robust to outliers and provides a better measure of central tendency when your data are skewed;
  • The mode is the most frequently occurring value in your dataset. It is useful for categorical data or when you want to identify the most common value;
  • Variance measures how spread out the numbers in your dataset are. It is the average squared deviation from the mean; for a sample, the sum of squared deviations is divided by n - 1 rather than n;
  • Standard deviation is the square root of the variance and gives you a sense of how much the data typically deviate from the mean, using the same units as the data itself.
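To make the point about outliers concrete, here is a minimal sketch using the standard-library statistics module. The values are hypothetical and chosen only for illustration; they are not the lesson's dataset.

```python
from statistics import mean, median

# Hypothetical values chosen for illustration only.
values = [10, 12, 13, 15, 16]
with_outlier = values + [200]  # add one extreme value

mean_before = mean(values)
median_before = median(values)
mean_after = mean(with_outlier)
median_after = median(with_outlier)

# The mean shifts by a large amount, while the median barely moves.
print("Mean:  ", mean_before, "->", mean_after)
print("Median:", median_before, "->", median_after)
```

A single extreme value pulls the mean far from the bulk of the data, while the median moves by only one unit, which is why the median is usually preferred for skewed data.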

In Python, you can calculate these statistics using built-in functions, the standard-library statistics module, and libraries like numpy. The built-in sum() and len() functions are enough to compute the mean, the statistics module covers all five measures, and numpy provides efficient array-based functions for the mean, median, variance, and standard deviation (it has no dedicated mode function, so a workaround is needed). Understanding these statistics helps you interpret your data, spot unusual values, and choose the right approach for further analysis.
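As a rough sketch of the built-in approach, the mean and the sample variance can be computed with nothing more than sum() and len(). The variable names below are illustrative, and the data values are the lesson's sample dataset.

```python
import math

# The lesson's sample dataset.
data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

n = len(data)
mean_value = sum(data) / n  # arithmetic average

# Sample variance: sum of squared deviations divided by n - 1.
squared_deviations = [(x - mean_value) ** 2 for x in data]
sample_variance = sum(squared_deviations) / (n - 1)

# Standard deviation: square root of the variance, in the data's own units.
sample_stdev = math.sqrt(sample_variance)

print("Mean:", mean_value)
print("Sample variance:", sample_variance)
print("Sample standard deviation:", sample_stdev)
```

Writing the formulas out once makes it easier to see what the library functions do internally; for real work, the library versions are less error-prone.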

    import numpy as np
    from statistics import mean, median, mode, variance, stdev

    # Sample dataset
    data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

    # Mean
    mean_value = mean(data)
    print("Mean:", mean_value)

    # Median
    median_value = median(data)
    print("Median:", median_value)

    # Mode
    mode_value = mode(data)
    print("Mode:", mode_value)

    # Variance
    variance_value = variance(data)
    print("Variance:", variance_value)

    # Standard deviation
    stdev_value = stdev(data)
    print("Standard Deviation:", stdev_value)

    # Alternatively, using numpy
    np_mean = np.mean(data)
    np_median = np.median(data)
    # np.bincount only accepts non-negative integers
    np_mode = np.argmax(np.bincount(data))
    np_variance = np.var(data, ddof=1)  # ddof=1 for sample variance
    np_stdev = np.std(data, ddof=1)  # ddof=1 for sample std

    print("\nUsing numpy:")
    print("Mean:", np_mean)
    print("Median:", np_median)
    print("Mode:", np_mode)
    print("Variance:", np_variance)
    print("Standard Deviation:", np_stdev)
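The ddof=1 argument in the numpy calls selects the sample variance (dividing by n - 1) instead of the population variance (dividing by n). The sketch below shows the same distinction using only the standard library, where variance() is the sample version and pvariance() is the population version. It also uses multimode() (available in Python 3.8 and later) to handle ties; the tied list is a hypothetical example, not part of the lesson's data.

```python
from statistics import multimode, pvariance, variance

# The lesson's sample dataset.
data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

sample_var = variance(data)       # divides by n - 1 (like ddof=1)
population_var = pvariance(data)  # divides by n (like ddof=0)

print("Sample variance:", sample_var)
print("Population variance:", population_var)

# Hypothetical data with two equally common values.
tied = [1, 1, 2, 2, 3]
modes = multimode(tied)  # returns every most-common value, in first-seen order
print("Modes:", modes)
```

The sample variance is always slightly larger than the population variance for the same data, and the gap shrinks as the dataset grows.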

Which of the following statements best describes the difference between mean, median, and mode?
