
Descriptive Statistics Fundamentals


Descriptive statistics are essential tools for summarizing and understanding the main features of a dataset. They help you quickly grasp the central tendency, variability, and overall distribution of your data. The most common measures include mean, median, mode, variance, and standard deviation.

  • The mean is the arithmetic average of a dataset. It is sensitive to extreme values (outliers) and gives you a sense of the "center" of your data;
  • The median is the middle value when the data are sorted. It is robust to outliers and provides a better measure of central tendency when your data are skewed;
  • The mode is the most frequently occurring value in your dataset. It is useful for categorical data or when you want to identify the most common value;
  • Variance measures how spread out the numbers in your dataset are. It is the average squared deviation from the mean; the sample version divides the sum of squared deviations by n - 1 rather than n;
  • Standard deviation is the square root of the variance and gives you a sense of how much the data typically deviate from the mean, using the same units as the data itself.
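The contrast between the mean's sensitivity to outliers and the median's robustness can be seen in a small sketch. The values below are hypothetical and chosen only for illustration; they are not part of the lesson's dataset.

```python
from statistics import mean, median

# Hypothetical values, chosen only to illustrate the effect of an outlier
typical = [12, 15, 17, 18, 19]
with_outlier = typical + [200]  # same data plus one extreme value

# Without the outlier, mean and median are close to each other
print("Mean:", mean(typical), "Median:", median(typical))

# With the outlier, the mean is pulled far upward while the median barely moves
print("Mean:", mean(with_outlier), "Median:", median(with_outlier))
```

Adding the single value 200 moves the median only from 17 to 17.5, while the mean jumps from about 16 to above 46, which is why the median is often preferred for skewed data.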

In Python, you can calculate these statistics with built-in functions, the standard-library statistics module, and libraries such as numpy. The built-in sum() and len() functions are enough to compute the mean, while statistics and numpy provide ready-made functions for most of these measures. Note that numpy has no dedicated mode function, so the mode is usually obtained indirectly, for example with np.bincount on non-negative integers. Understanding these statistics helps you interpret your data, spot unusual values, and choose the right approach for further analysis.
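As a minimal sketch of the built-in approach, the mean, sample variance, and sample standard deviation can be computed with only sum(), len(), and math.sqrt. The variable names here are illustrative, and the n - 1 divisor assumes the data are a sample rather than a whole population.

```python
import math

data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

n = len(data)
mean_value = sum(data) / n  # arithmetic average

# Squared deviation of each value from the mean
squared_deviations = [(x - mean_value) ** 2 for x in data]

# Sample variance: divide by n - 1 (assumes the data are a sample)
sample_variance = sum(squared_deviations) / (n - 1)

# Standard deviation: square root of the variance, in the data's own units
sample_stdev = math.sqrt(sample_variance)

print("Mean:", mean_value)
print("Sample variance:", sample_variance)
print("Sample standard deviation:", sample_stdev)
```

Writing the calculation out once makes it easier to see what the library functions do before relying on them.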

```python
import numpy as np
from statistics import mean, median, mode, variance, stdev

# Sample dataset
data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

# Mean
mean_value = mean(data)
print("Mean:", mean_value)

# Median
median_value = median(data)
print("Median:", median_value)

# Mode
mode_value = mode(data)
print("Mode:", mode_value)

# Variance
variance_value = variance(data)
print("Variance:", variance_value)

# Standard deviation
stdev_value = stdev(data)
print("Standard Deviation:", stdev_value)

# Alternatively, using numpy
np_mean = np.mean(data)
np_median = np.median(data)
np_mode = np.argmax(np.bincount(data))
np_variance = np.var(data, ddof=1)  # ddof=1 for sample variance
np_stdev = np.std(data, ddof=1)  # ddof=1 for sample std

print("\nUsing numpy:")
print("Mean:", np_mean)
print("Median:", np_median)
print("Mode:", np_mode)
print("Variance:", np_variance)
print("Standard Deviation:", np_stdev)
```
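Two details of the example are worth a closer look: the sample versus population divisor, and what happens when several values tie for the mode. The sketch below uses only the standard-library statistics module; the `tied` list is a hypothetical dataset added for illustration, and multimode requires Python 3.8 or later.

```python
from statistics import mode, multimode, pvariance, variance

data = [12, 15, 12, 18, 19, 12, 17, 21, 22, 15]

# Sample variance divides by n - 1; population variance divides by n
sample_var = variance(data)
population_var = pvariance(data)
print("Sample variance:", sample_var)
print("Population variance:", population_var)

# Hypothetical dataset in which two values tie for most frequent
tied = [1, 1, 2, 2, 3]

first_mode = mode(tied)   # Python 3.8+: returns the first mode encountered
modes = multimode(tied)   # returns every mode, in order of first occurrence
print("mode():", first_mode)
print("multimode():", modes)  # [1, 2]
```

The population variance corresponds to numpy's default of ddof=0, which is why the numpy calls pass ddof=1 to match statistics.variance and statistics.stdev. The np.bincount approach to the mode also assumes non-negative integers, so multimode is the more general choice for other kinds of data.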

