Joint Distributions and Factorization | Foundations of Probabilistic Graphical Models

Joint Distributions and Factorization

When you model a collection of random variables, the joint probability distribution describes the likelihood of every possible combination of their values. For three binary variables, say A, B, and C, the joint table must include a probability for each of the eight possible outcomes. As you increase the number of variables, the size of the joint table grows exponentially (doubling with each new binary variable). This quickly becomes infeasible, both to store and to estimate from data.

Probabilistic graphical models solve this problem by allowing you to factorize the joint distribution according to the dependencies shown in a graph. Instead of specifying every entry in the joint table, you break it into smaller conditional distributions that are much easier to handle. For example, if the graph structure says A influences B, and B influences C, you can write the joint as P(A)P(B|A)P(C|B). This factorization dramatically reduces the number of parameters you need: for these three binary variables, five conditional probabilities replace the seven free entries of the full joint table, and the savings grow rapidly as more variables are added.
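To make the savings concrete, here is a small sketch (not part of the original lesson) comparing the number of free parameters for n binary variables in a chain X1 → X2 → … → Xn against the full joint table; the helper function names are illustrative, not from any library:

```python
def full_joint_params(n):
    # A full joint table over n binary variables has 2**n entries;
    # they must sum to 1, so 2**n - 1 of them are free parameters.
    return 2 ** n - 1

def chain_params(n):
    # In a chain, the root needs 1 parameter (P(X1=1)); every later
    # node needs 2 (one conditional probability per parent value).
    return 1 + 2 * (n - 1)

for n in [3, 10, 20]:
    print(n, full_joint_params(n), chain_params(n))
```

For n = 3 this gives 7 versus 5; for n = 20 it is over a million versus 39, which is why factorization makes large models tractable.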

```python
import pandas as pd

# Define binary variables: 0 = False, 1 = True

# Probabilities for P(A)
P_A = {0: 0.6, 1: 0.4}

# Probabilities for P(B|A)
P_B_given_A = {
    0: {0: 0.7, 1: 0.3},  # P(B|A=0)
    1: {0: 0.2, 1: 0.8}   # P(B|A=1)
}

# Probabilities for P(C|B)
P_C_given_B = {
    0: {0: 0.9, 1: 0.1},  # P(C|B=0)
    1: {0: 0.4, 1: 0.6}   # P(C|B=1)
}

# Compute the full joint table using the factorization P(A)P(B|A)P(C|B)
rows = []
for a in [0, 1]:
    for b in [0, 1]:
        for c in [0, 1]:
            prob = P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]
            rows.append({'A': a, 'B': b, 'C': c, 'P(A,B,C)': prob})

joint_table = pd.DataFrame(rows)
print(joint_table)
```

The code above constructs a joint probability table for three binary variables, but instead of listing all eight probabilities directly, it calculates each one using the factorization implied by the graph structure. The graph here has A as a parent of B, and B as a parent of C. This means:

  • The probability of A is given by P(A);
  • The probability of B depends only on A, so you use P(B|A);
  • The probability of C depends only on B, so you use P(C|B).

By multiplying these factors for each possible value of A, B, and C, you efficiently fill out the joint table. This approach only requires the probabilities for each factor (one for A, two for B|A, and two for C|B), instead of specifying all eight joint probabilities individually. This is the power of factorization: guided by the structure of the graphical model, you reduce complexity and make modeling large systems feasible.
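As a quick sanity check, a sketch like the following (reusing the same factor dictionaries as the example above) verifies that the factorized entries form a valid distribution and recovers a marginal such as P(B) by summing out A and C:

```python
import pandas as pd

# Same factors as in the lesson's example
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P_C_given_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}

# Rebuild the joint table from the factorization P(A)P(B|A)P(C|B)
rows = [
    {'A': a, 'B': b, 'C': c,
     'P(A,B,C)': P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]}
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
]
joint = pd.DataFrame(rows)

# A valid joint distribution must sum to 1 (up to floating-point error)
total = joint['P(A,B,C)'].sum()

# Marginal P(B): sum the joint probabilities over A and C
P_B = joint.groupby('B')['P(A,B,C)'].sum()

print(total)          # approximately 1.0
print(P_B.loc[1])     # P(B=1) = 0.6*0.3 + 0.4*0.8, approximately 0.5
```

Because every conditional factor is itself normalized, the product automatically yields a table that sums to 1; marginals then come from simple sums over the joint.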


Suppose you have a graph where A points to B, and B points to C (A → B → C), as in the example above. Which factorization correctly represents the joint probability for this graph?



Section 1. Chapter 3

