MCMC Algorithms: Intuition and Steps
When you need to generate samples from a probability distribution, but direct sampling is impractical or impossible, Markov Chain Monte Carlo (MCMC) methods offer a powerful solution. The central goal of MCMC is to construct a Markov chain whose stationary distribution matches the target distribution you want to sample from. Over time, samples drawn from this chain approximate samples from the true distribution, even if the distribution is high-dimensional or has a complex shape.
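To make "stationary distribution" precise: writing $\pi$ for the target distribution and $T$ for the chain's transition kernel, Metropolis-Hastings is constructed so that the detailed balance condition holds:

$$
\pi(x)\, T(x \to x') = \pi(x')\, T(x' \to x) \quad \text{for all } x, x'.
$$

Integrating both sides over $x$ shows that a chain started in $\pi$ stays in $\pi$, so $\pi$ is stationary for the chain.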
The Metropolis-Hastings algorithm is one of the most widely used MCMC methods. Here’s how it works, step by step:
1. Start with an initial point, often chosen randomly or based on prior knowledge.
2. Propose a new sample by perturbing the current point, using a proposal distribution (such as a normal distribution centered on the current point).
3. Calculate the acceptance ratio, which compares the probability of the proposed point to that of the current point, adjusted for the proposal distribution (the exact formula is given after this list).
4. Accept the proposed point with probability equal to the acceptance ratio, capped at 1. If not accepted, stay at the current point.
5. Repeat steps 2–4 for many iterations, building a sequence (chain) of samples.
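Concretely, writing $p$ for the (possibly unnormalized) target density and $q(x' \mid x)$ for the proposal density, the acceptance probability used in steps 3 and 4 is:

$$
\alpha(x, x') = \min\left(1,\ \frac{p(x')\, q(x \mid x')}{p(x)\, q(x' \mid x)}\right)
$$

For a symmetric proposal such as the normal distribution mentioned in step 2, $q(x \mid x') = q(x' \mid x)$, so the ratio simplifies to $p(x') / p(x)$. Because only a ratio of densities is needed, any normalizing constant of $p$ cancels, which is why MCMC only requires the target up to a constant.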
This iterative process allows you to explore the target distribution, even when it is not possible to sample from it directly. The quality and efficiency of the sampling depend on the choice of proposal distribution and how well the Markov chain explores the space.
```python
import math
import random

random.seed(42)


def target_density(x: float) -> float:
    """Unnormalized target density: mixture of two Gaussians."""
    a = math.exp(-0.5 * ((x + 2) / 1.0) ** 2)  # N(-2, 1)
    b = 0.7 * math.exp(-0.5 * ((x - 2) / 0.6) ** 2)  # 0.7 * N(2, 0.6)
    return a + b


def propose_new_state(current: float, step_scale: float = 1.0) -> float:
    """Random-walk proposal: x' = x + Normal(0, step_scale)."""
    return current + random.gauss(0.0, step_scale)


def metropolis_hastings(
    initial_state: float,
    num_steps: int,
    step_scale: float = 1.0,
    print_every: int = 1
):
    current = initial_state
    samples = []

    # Symmetric proposal -> q cancels out
    for step in range(1, num_steps + 1):
        proposal = propose_new_state(current, step_scale)

        p_current = target_density(current)
        p_proposal = target_density(proposal)

        # For symmetric proposal: acceptance_ratio = p_proposal / p_current
        acceptance_ratio = p_proposal / p_current if p_current > 0 else float("inf")
        acceptance_prob = min(1.0, acceptance_ratio)

        u = random.random()
        accepted = u < acceptance_prob
        if accepted:
            current = proposal

        samples.append(current)

        if step % print_every == 0:
            status = "ACCEPT" if accepted else "REJECT"
            bar_len = 20
            filled = int(round(acceptance_prob * bar_len))
            bar = "█" * filled + "·" * (bar_len - filled)
            print(
                f"step {step:03d} | {status:6} | "
                f"current={current:7.3f} | proposal={proposal:7.3f} | "
                f"p_cur={p_current:8.5f} | p_prop={p_proposal:8.5f} | "
                f"α={acceptance_prob:6.3f} [{bar}] | u={u:6.3f}"
            )

    return samples


samples = metropolis_hastings(
    initial_state=0.0,
    num_steps=30,
    step_scale=1.0,
    print_every=1
)

print("\nFirst 10 samples:", [round(s, 3) for s in samples[:10]])
print("Last 10 samples:", [round(s, 3) for s in samples[-10:]])
```
When working with MCMC, you need to consider three important concepts: mixing, burn-in, and convergence. Mixing refers to how quickly the Markov chain explores all relevant regions of the target distribution. Good mixing means the chain moves efficiently and does not get stuck in one area. Burn-in is the initial set of samples you discard because the chain has not yet reached the stationary distribution — these early samples may be biased by the starting point. Convergence is the point at which the distribution of your samples closely matches the true target distribution; only after convergence do your samples become reliable for inference.
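As a minimal illustrative sketch (not part of the lesson's core code), the snippet below post-processes the `samples` list returned by `metropolis_hastings` above: it discards a burn-in prefix and computes a lag-1 autocorrelation as a crude mixing check. The 20% burn-in fraction is an arbitrary choice for demonstration, not a universal rule.

```python
# Minimal post-processing sketch: burn-in removal plus a crude mixing check.
# Assumes `samples` from the metropolis_hastings run above; the 20% burn-in
# fraction is an arbitrary illustrative choice, not a universal rule.
burn_in = len(samples) // 5          # discard the first 20% of the chain
kept = samples[burn_in:]

mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / (len(kept) - 1)

# Lag-1 autocorrelation: values near 1 indicate a slowly mixing chain,
# values near 0 indicate nearly independent draws.
lag1 = sum(
    (kept[i] - mean) * (kept[i + 1] - mean) for i in range(len(kept) - 1)
) / ((len(kept) - 1) * var)

print(f"kept {len(kept)} of {len(samples)} samples after burn-in")
print(f"mean={mean:.3f}  variance={var:.3f}  lag-1 autocorrelation={lag1:.3f}")
```

In practice you would use longer chains and established diagnostics, but the structure is the same: discard the early samples, then check how correlated the remainder is.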
Despite its power, MCMC comes with some common pitfalls and assumptions. The chain must be able to reach every part of the distribution (ergodicity), and the proposal distribution must be chosen carefully to ensure good mixing. If the chain mixes poorly or you underestimate the burn-in period, your samples may not represent the true distribution. Additionally, MCMC assumes you can evaluate the target distribution up to a normalizing constant, but not always sample from it directly. Always check diagnostics to ensure your MCMC run is valid and your assumptions hold.
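To make the diagnostics advice concrete, here is a hedged sketch of one simple check: run several chains from deliberately dispersed starting points and compare their post-burn-in means, a simplified version of the idea behind the Gelman-Rubin diagnostic. It reuses `metropolis_hastings` from above; `chain_summary` is a hypothetical helper introduced only for this illustration.

```python
# Illustrative convergence check (simplified Gelman-Rubin idea): chains
# started far apart should roughly agree after burn-in if the sampler
# has converged. `chain_summary` is a hypothetical helper for this sketch.
def chain_summary(initial_state: float, num_steps: int = 500) -> float:
    chain = metropolis_hastings(
        initial_state=initial_state,
        num_steps=num_steps,
        step_scale=1.0,
        print_every=num_steps + 1,  # never triggers, so per-step output is suppressed
    )
    kept = chain[num_steps // 5:]   # drop the first 20% as burn-in
    return sum(kept) / len(kept)

for start in (-10.0, 0.0, 10.0):    # deliberately dispersed starting points
    print(f"start={start:6.1f} -> post-burn-in mean={chain_summary(start):7.3f}")
```

If the per-chain means disagree substantially (with this bimodal target, a chain can get stuck near a single mode), that signals poor mixing or insufficient burn-in, and the samples should not yet be trusted for inference.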