Learn ARIMA Model Structure | Mathematical Foundations of ARIMA

To understand how ARIMA models work, you need to see how three key components—autoregressive (AR), integrated (I), and moving average (MA)—combine to model a wide range of time series data. The ARIMA model is denoted as ARIMA(p, d, q), where p, d, and q are integers that describe the structure of the model. Each parameter has a specific role: p controls the AR part, d determines the number of times the data is differenced to achieve stationarity, and q sets the order of the MA part.

The AR (autoregressive) component models the relationship between an observation and a number of lagged observations. The MA (moving average) component models the current observation as a function of past forecast errors (residuals), rather than of the past observations themselves. The integrated (I) part involves differencing the data, which means subtracting the previous observation from the current one; this stabilizes the mean of the series by removing changes in its level, which eliminates trend. Removing a seasonal pattern generally requires seasonal differencing (subtracting the observation from one season earlier), which belongs to the seasonal extension of ARIMA rather than to the d parameter.
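To make the effect of differencing concrete, here is a minimal sketch. It assumes a synthetic series with a linear trend plus noise; the slope, length, and seed are illustrative choices, not values from the lesson.

```python
import numpy as np

rng = np.random.default_rng(1)

t = np.arange(200)
slope = 2.0                                  # assumed trend, chosen for illustration
y = slope * t + rng.normal(0, 1, t.size)     # linear trend plus white noise

dy = np.diff(y)                              # first difference: y[t] - y[t-1]

# The level of y drifts upward, so the two halves have very different means.
print("mean of y,  first vs second half:", y[:100].mean(), y[100:].mean())
# After differencing, both halves have a mean close to the slope.
print("mean of dy, first vs second half:", dy[:100].mean(), dy[100:].mean())
```

The differenced series is one observation shorter than the original, and its mean settles around the slope of the removed trend.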

You should use differencing (d > 0) when your time series is non-stationary, meaning its statistical properties, such as the mean or variance, change over time. Applying the correct degree of differencing can make the series stationary, which matters because the AR and MA parts of the model assume stationary input. If you over-difference, you inflate the variance and introduce artificial negative autocorrelation; if you under-difference, the remaining trend leaves the series non-stationary and the AR and MA estimates become unreliable.
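The trade-off can be illustrated with a rough sketch, assuming a pure random walk as the test series (a process that needs exactly d = 1). The helper `lag1_autocorr` is a hypothetical name introduced here, and lag-1 autocorrelation is used only as a simple diagnostic, not as a formal stationarity test.

```python
import numpy as np

rng = np.random.default_rng(0)

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D array."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

noise = rng.normal(0, 1, 5000)
walk = np.cumsum(noise)                        # random walk: needs d = 1

acf_under = lag1_autocorr(walk)                # d = 0: under-differenced
acf_ok = lag1_autocorr(np.diff(walk))          # d = 1: recovers white noise
acf_over = lag1_autocorr(np.diff(walk, n=2))   # d = 2: over-differenced

print("d=0:", acf_under)   # close to 1: strong persistence, non-stationary
print("d=1:", acf_ok)      # close to 0: no structure left
print("d=2:", acf_over)    # close to -0.5: artificial negative autocorrelation
print("variance d=1 vs d=2:", np.diff(walk).var(), np.diff(walk, n=2).var())
```

Differencing white noise once more yields an MA(1) process with a theoretical lag-1 autocorrelation of -0.5 and twice the variance, which is the "unnecessary complexity and noise" that over-differencing adds.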

ARIMA Parameter Definitions:

p (AR order): The number of lag observations included in the model; controls how many past values are used to predict the current value.
d (degree of differencing): The number of times the raw observations are differenced to achieve stationarity; removes trends in the level of the series.
q (MA order): The number of lagged forecast errors in the prediction equation; controls how many past errors are used to predict the current value.
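The three parameters can be tied together in a single equation. The following is the standard backshift form of ARIMA(p, d, q), where B is the backshift operator (B y_t = y_{t-1}). The symbols φ_i (AR coefficients), θ_j (MA coefficients), c (constant), and ε_t (white-noise error) are notation introduced here for illustration and do not appear in the lesson text.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

For ARIMA(1, 1, 1), writing w_t = y_t - y_{t-1} for the differenced series, this reduces to:

```latex
w_t = c + \phi_1 w_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
```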


import numpy as np
import pandas as pd

np.random.seed(42)

# Simulate an ARIMA(1,1,1) process
n = 100
ar_coef = 0.7
ma_coef = 0.5

e = np.random.normal(0, 1, n+1)
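w_prev = 0.0  # previous value of the differenced series w[t] = y[t] - y[t-1]
y = [0.0]     # levels of the integrated series

for t in range(1, n+1):
    # ARMA(1,1) on the differences: the AR term uses the previous difference
    w = ar_coef * w_prev + e[t] + ma_coef * e[t-1]
    # Integrate once (d = 1): add the new difference to the previous level
    y.append(y[-1] + w)
    w_prev = w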

series = pd.Series(y[1:])
print(series.head())
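One way to sanity-check a simulated ARIMA(1,1,1) series is to difference it once and compare the lag-1 autocorrelation of the result with the theoretical value for an ARMA(1,1) process. The sketch below is self-contained and rests on stated assumptions: the function name `simulate_arima_111` is hypothetical, the differences are generated as ARMA(1,1) with unit-variance Gaussian noise, and a longer series than in the lesson is used so the sample estimate is stable.

```python
import numpy as np

def simulate_arima_111(n, ar_coef, ma_coef, seed=0):
    """Simulate n observations of ARIMA(1,1,1) by integrating an ARMA(1,1) series."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0, 1, n + 1)
    w = np.zeros(n + 1)                 # differenced series, w[0] is the start value
    for t in range(1, n + 1):
        w[t] = ar_coef * w[t - 1] + e[t] + ma_coef * e[t - 1]
    return np.cumsum(w[1:])             # integrate once (d = 1)

phi, theta = 0.7, 0.5
y = simulate_arima_111(20000, phi, theta)

w = np.diff(y)                          # undo the integration
rho1_sample = np.corrcoef(w[:-1], w[1:])[0, 1]

# Theoretical lag-1 autocorrelation of a stationary ARMA(1,1) process
rho1_theory = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta**2)

print("sample lag-1 autocorrelation:", rho1_sample)
print("theoretical value:           ", rho1_theory)
```

If the two values disagree badly, the AR term is probably not being applied to the differenced series.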

1. Fill in the blanks to identify the ARIMA parameters from the following model description: A time series model uses the last 2 observations, takes the first difference of the series, and incorporates the last 3 error terms.

2. Which of the following best describes the impact of differencing in an ARIMA model?
