Mathematics for Data Science with Python

Understanding Probability Distributions


Probability distributions

A probability distribution tells you how likely different outcomes are. For discrete outcomes (like "how many defective rods"), we list a probability for each possible count. For continuous measurements (like length or weight), we describe a probability density across a range. The general discrete and continuous formulas are:

$$P(X \in A) = \sum_{x \in A} p(x) \quad (\text{discrete})$$

$$P(a \le X \le b) = \int_a^b f(x)\,dx \quad (\text{continuous})$$

Example (quick check): If a process guarantees all lengths between 49.5 and 50.5 cm are equally likely, the probability that a rod lies in a 0.4 cm sub-range is the sub-range width divided by the 1.0 cm total width, which is 0.4. This is the uniform distribution, covered in detail below.
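As a rough sketch of the two formulas above, the snippet below sums a probability mass function over a set of outcomes and approximates the integral of a density with the midpoint rule. The `pmf` values and the helper names `prob_discrete` and `prob_continuous` are illustrative assumptions, not part of the lesson; only the 49.5 to 50.5 cm range comes from the quick check.

```python
# Hypothetical pmf for the number of defective rods (values are made up).
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def prob_discrete(pmf, event):
    """Discrete case: sum p(x) over every outcome x in the event set."""
    return sum(p for x, p in pmf.items() if x in event)

def prob_continuous(density, a, b, steps=10_000):
    """Continuous case: approximate the integral of density over [a, b]
    with the midpoint rule."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width

def uniform_density(x):
    """Density of the quick-check example: constant on [49.5, 50.5]."""
    return 1.0 if 49.5 <= x <= 50.5 else 0.0

p_at_most_one = prob_discrete(pmf, {0, 1})
p_sub_range = prob_continuous(uniform_density, 49.8, 50.2)

print(p_at_most_one)
print(round(p_sub_range, 6))
```

The numerical integral is only an approximation; for the uniform case it should agree closely with the width ratio from the quick check.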

Binomial distribution

The binomial models the number of successes (e.g., defective rods) in a fixed number of independent trials (e.g., 100 rods), when each trial has the same probability of success.

Formula:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

Example:

In a batch of $n=100$ rods where each rod independently has probability $p=0.02$ of being defective, what is the probability of exactly $k=3$ defective rods?

Step 1 — compute the combination:

$$\binom{100}{3} = \frac{100!}{3!\,97!} = 161700$$

Step 2 — compute powers:

$$p^3 = 0.02^3 = 0.000008$$

$$(1-p)^{97} = 0.98^{97} \approx 0.1409059532$$

Step 3 — multiply all parts:

$$P(X = 3) = 161700 \times 0.000008 \times 0.1409059532 \approx 0.182275941$$

What this means: There is about an 18.23% chance of exactly 3 defective rods in a 100-rod sample, so seeing 3 defects is a plausible outcome.
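The three steps of this example can be checked with a few lines of Python. This is a minimal sketch that assumes Python 3.8+ for `math.comb`; the function name `binomial_pmf` is chosen here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, k = 100, 0.02, 3

combination = comb(n, k)        # Step 1: number of ways to pick 3 rods
power_p = p**k                  # Step 2: probability of 3 defects
power_q = (1 - p)**(n - k)      # Step 2: probability of 97 good rods
prob = binomial_pmf(k, n, p)    # Step 3: multiply all parts

print(combination)              # 161700
print(round(prob, 4))
```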

Note

If your computed probability is larger than 1 or negative, re-check the combination and the power calculations. For "at most" or "at least" questions, a single PMF value is not enough: use the CDF, which sums the PMF over all counts up to a given value.
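To illustrate the difference between a single PMF value and a cumulative answer, here is a small sketch that builds the CDF by summing PMF terms. The helper names are assumptions made for this example, and the parameters reuse the batch of 100 rods with a 2% defect rate.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """P(X <= k): sum of the pmf over 0, 1, ..., k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p = 100, 0.02

p_exactly_3 = binomial_pmf(3, n, p)
p_at_most_3 = binomial_cdf(3, n, p)          # "at most 3 defects"
p_at_least_3 = 1 - binomial_cdf(2, n, p)     # "at least 3 defects"

# Sanity check suggested by the note: every probability lies in [0, 1].
for value in (p_exactly_3, p_at_most_3, p_at_least_3):
    assert 0.0 <= value <= 1.0

print(p_exactly_3, p_at_most_3, p_at_least_3)
```

Note that "at least 3" uses the complement of "at most 2", since the count is an integer.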

Uniform distribution

The uniform distribution models a continuous measurement where every value within a range [a,b] is equally likely (e.g., a tolerance range for rod length).

Formula:

$$f(x) = \frac{1}{b-a}, \quad a \le x \le b$$

Probability between two points:

$$P(l \le X \le u) = \frac{u - l}{b - a}$$

Example:

Parameters: $a=49.5$, $b=50.5$. What is the probability that a rod length $X$ lies between 49.8 and 50.2? Compute the range width:

$$b - a = 50.5 - 49.5 = 1.0$$

Compute sub-interval:

$$u - l = 50.2 - 49.8 = 0.4$$

Probability:

$$P(49.8 \le X \le 50.2) = \frac{0.4}{1.0} = 0.4$$

Interpretation: There is a 40% chance a randomly measured rod will fall in this tighter tolerance.

Note

Make sure $a<b$ and that your sub-range lies inside $[a,b]$; otherwise, clip the endpoints to $[a,b]$ and treat any part outside the range as having probability 0.
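The clipping rule in this note can be sketched as a small function. The name `uniform_prob` and the out-of-range test values are assumptions for illustration; the first call reproduces the rod-length example.

```python
def uniform_prob(l, u, a, b):
    """P(l <= X <= u) for X uniform on [a, b], clipping [l, u] to [a, b]."""
    if not a < b:
        raise ValueError("require a < b")
    lo = max(l, a)   # clip the lower endpoint
    hi = min(u, b)   # clip the upper endpoint
    # If the clipped interval is empty, the probability is 0.
    return max(hi - lo, 0.0) / (b - a)

inside = uniform_prob(49.8, 50.2, 49.5, 50.5)    # sub-range fully inside
partial = uniform_prob(50.2, 51.0, 49.5, 50.5)   # upper endpoint clipped to 50.5
outside = uniform_prob(51.0, 52.0, 49.5, 50.5)   # entirely outside the range

print(round(inside, 6), round(partial, 6), outside)
```

Results are rounded for display because subtracting decimal values in floating point can leave tiny rounding errors.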

Normal distribution

The normal distribution describes continuous measurements that cluster around a mean $\mu$ with spread measured by the standard deviation $\sigma$. Many measurement errors and natural variations follow this bell-shaped curve.

Formula:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Standardize with z-score:

$$z = \frac{x-\mu}{\sigma}$$

The probability between two values is computed from the cumulative distribution function (CDF), or from symmetry in standard cases:

$$P(a \le X \le b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$

Here $\Phi$ is the standard normal CDF.
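One way to evaluate this formula without a table is to express $\Phi$ through the error function, using the identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below uses only the standard library; the function names and the bounds of $\pm 1.96$ on a standard normal are illustrative choices, not values from the lesson.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """Phi(z), the standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X normal with mean mu and standard deviation sigma."""
    z_a = (a - mu) / sigma   # standardize the lower bound
    z_b = (b - mu) / sigma   # standardize the upper bound
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)

# Illustrative bounds: roughly the central 95% of a standard normal.
central = normal_interval_prob(-1.96, 1.96, mu=0.0, sigma=1.0)
print(round(central, 3))
```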

Example A:

Parameters: $\mu=200$, $\sigma=5$. Find $P(195 \le X \le 205)$.

Z-scores:

$$z_1 = \frac{195 - 200}{5} = -1$$

$$z_2 = \frac{205 - 200}{5} = 1$$

Using the symmetry of the normal distribution, the probability between $-1$ and $+1$ standard deviations is the well-known value:

$$P(195 \le X \le 205) \approx 0.6826894921$$

Interpretation: About 68.27% of rod weights fall within ±1 standard deviation of the mean. This is the classic "68% rule".
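Example A can also be verified with `statistics.NormalDist` from the Python standard library (available in Python 3.8+). This is just one convenient way to check the numbers, not the only approach.

```python
from statistics import NormalDist

# Parameters from Example A: mean 200, standard deviation 5.
dist = NormalDist(mu=200, sigma=5)

# Z-scores of the two bounds.
z1 = (195 - dist.mean) / dist.stdev   # -1.0
z2 = (205 - dist.mean) / dist.stdev   # 1.0

# Probability between the bounds as a difference of CDF values.
prob = dist.cdf(205) - dist.cdf(195)

print(z1, z2)
print(round(prob, 4))   # 0.6827
```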

Note

When the bounds are symmetric around $\mu$, use the known empirical rule (68–95–99.7). For other bounds, compute the z-scores, then use a table or calculator.
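The empirical rule can be reproduced numerically. For bounds symmetric around the mean, $\Phi(k) - \Phi(-k)$ simplifies to $\operatorname{erf}(k/\sqrt{2})$, so a short sketch (with the helper name `prob_within_k_sigma` assumed here) is:

```python
from math import erf, sqrt

def prob_within_k_sigma(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal distribution."""
    return erf(k / sqrt(2.0))

# Print the share of values within 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(prob_within_k_sigma(k), 4))
```

The printed values should round to roughly 68%, 95% and 99.7%, which is where the rule gets its name.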


Z-score for $X=195$, $\mu=200$, $\sigma=5$?
