Learn Multiple Regression and Economic Controls


When you build regression models to understand economic relationships, you often want to know how one variable affects another while holding other factors constant. However, if you leave out a variable that influences the dependent variable and is also correlated with an included predictor, your coefficient estimates will be biased. This problem is called omitted variable bias. For example, suppose you regress inflation only on unemployment, ignoring GDP growth. If GDP growth affects both inflation and unemployment, the unemployment coefficient will absorb part of GDP growth's effect and misstate the true relationship. To avoid this, economists include control variables: additional predictors that help isolate the effect of interest by accounting for other relevant influences.
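A small simulation can make the bias concrete. The sketch below is separate from the lesson's dataset: the true effects (0.6 and -0.8), the assumed -0.5 link from GDP growth to unemployment, the sample size, and the names `sim`, `short_fit`, and `long_fit` are all illustrative assumptions. It fits one regression that omits GDP growth and one that includes it, then compares the unemployment coefficient.

```r
# Illustrative simulation of omitted variable bias (all numbers are assumptions)
set.seed(123)
n <- 5000

# Assumed: GDP growth influences unemployment (slope -0.5)
gdp_growth   <- rnorm(n, mean = 2.5, sd = 1)
unemployment <- 7 - 0.5 * gdp_growth + rnorm(n, mean = 0, sd = 1)

# Assumed: both variables influence inflation (true effects 0.6 and -0.8)
inflation <- 1.5 + 0.6 * unemployment - 0.8 * gdp_growth +
  rnorm(n, mean = 0, sd = 0.7)

sim <- data.frame(inflation, unemployment, gdp_growth)

# Short regression: GDP growth is omitted
short_fit <- lm(inflation ~ unemployment, data = sim)

# Long regression: GDP growth is included as a control
long_fit <- lm(inflation ~ unemployment + gdp_growth, data = sim)

# Compare the unemployment coefficient across the two models
b_short <- coef(short_fit)[["unemployment"]]
b_long  <- coef(long_fit)[["unemployment"]]
round(c(short = b_short, long = b_long), 3)
```

Under these assumed parameters, the long regression should recover a coefficient near 0.6, while the short regression should land noticeably higher, because the unemployment coefficient picks up part of the omitted GDP growth effect.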


# Disable colored output for clarity
options(crayon.enabled = FALSE)

# Load the tidyverse package for data manipulation
library(tidyverse)

# Set random seed for reproducibility
set.seed(42)

# Create a simulated data frame with economic variables
# unemployment: random normal values (mean 6, sd 1.2)
# gdp_growth: random normal values (mean 2.5, sd 1)
# inflation: depends on unemployment and gdp_growth plus random noise
econ_data <- tibble(
  unemployment = rnorm(200, 6, 1.2),
  gdp_growth   = rnorm(200, 2.5, 1),
  inflation    = 1.5 +
    0.6 * unemployment -
    0.8 * gdp_growth +
    rnorm(200, 0, 0.7)
)

# Fit a multiple regression model predicting inflation
# using unemployment and gdp_growth as predictors
model <- lm(inflation ~ unemployment + gdp_growth, data = econ_data)

# Calculate Variance Inflation Factor (VIF) for each predictor
# VIF helps detect multicollinearity between predictors
X <- model.matrix(model)[, -1]
vif <- sapply(seq_len(ncol(X)), function(j) {
  r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared
  1 / (1 - r2)
})
names(vif) <- colnames(X)

# Display regression summary and VIF values
summary(model)
vif
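
The loop above computes each VIF by hand: it regresses predictor j on the remaining predictors and plugs the resulting R-squared into the standard definition, written out here for reference.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

Here R_j^2 is the R-squared from regressing predictor j on all other predictors. A VIF of 1 means the predictor is uncorrelated with the others, and a common rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity. Because unemployment and gdp_growth are drawn independently in this simulation, both VIFs should come out close to 1.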

In this regression, each coefficient tells you the estimated effect of that variable on inflation, holding the other variables constant. For instance, the coefficient on unemployment shows how much inflation is expected to change for a one-unit increase in unemployment, assuming GDP growth does not change. The coefficient on GDP growth similarly reflects its unique contribution. Including controls like GDP growth helps you interpret the effect of unemployment more accurately, by accounting for other economic forces at play.

Controls are chosen based on economic theory and prior evidence that they influence both the dependent variable and the main predictor of interest. The reliability of your interpretation depends on the identification assumptions: you assume that, after controlling for included variables, there are no omitted confounders that bias your results. If this assumption fails, your estimates may still be biased.
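
To make the "holding the other variables constant" reading concrete, the sketch below compares two hypothetical scenarios that differ by one unit of unemployment while GDP growth stays fixed. It rebuilds a dataset with the same structure as the lesson's simulation so that it runs on its own. The names `demo_data`, `demo_model`, and `scenarios`, and the scenario values, are assumptions made for illustration.

```r
# Illustrative data with the same structure as the lesson's simulation
set.seed(42)
n <- 200
unemployment <- rnorm(n, mean = 6, sd = 1.2)
gdp_growth   <- rnorm(n, mean = 2.5, sd = 1)
inflation    <- 1.5 + 0.6 * unemployment - 0.8 * gdp_growth +
  rnorm(n, mean = 0, sd = 0.7)
demo_data <- data.frame(inflation, unemployment, gdp_growth)

demo_model <- lm(inflation ~ unemployment + gdp_growth, data = demo_data)

# Two hypothetical scenarios: unemployment differs by one unit,
# GDP growth is held fixed at 2.5
scenarios <- data.frame(
  unemployment = c(6, 7),
  gdp_growth   = c(2.5, 2.5)
)
predicted <- predict(demo_model, newdata = scenarios)

# The difference in predictions equals the unemployment coefficient
pred_diff <- unname(predicted[2] - predicted[1])
b_unemp   <- coef(demo_model)[["unemployment"]]
all.equal(pred_diff, b_unemp)

# 95% confidence intervals show the uncertainty around each estimate
confint(demo_model, level = 0.95)
```

The comparison only describes the fitted model. Whether the coefficient can be read as a causal effect still depends on the identification assumptions discussed above.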

Section 2. Chapter 2
