Learn: Comparing ARIMA-Based Models | Advanced ARIMA Techniques and Model Selection
Time Series Forecasting with ARIMA

Comparing ARIMA-Based Models

When you work with time series data, the forecasting model you choose largely determines how accurate your predictions are. As you have learned, ARIMA models non-seasonal time series, while SARIMA extends ARIMA with seasonal terms so that it can handle seasonality. Auto ARIMA automates parameter selection by searching over candidate model orders for you. To determine which model works best for your dataset, you need to compare them systematically.

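A common approach is cross-validation, where you repeatedly split your time series into training and testing sets, fit models to the training data, and evaluate their performance on the test data. Because time series data is ordered, you cannot shuffle observations as in ordinary k-fold cross-validation: the model would then be trained on values that come after the ones it is tested on. Instead, use rolling-origin (expanding window) validation, in which every training set ends before its test set begins, so the temporal order is respected.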
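The splitting scheme described above can be sketched as follows. This is a minimal illustration, not part of the lesson's own code: the helper names (`rolling_origin_splits`, `rolling_origin_mae`) and the two stand-in forecasters are hypothetical, and the simple baselines are used in place of ARIMA only so that the example runs with NumPy alone.

```python
import numpy as np


def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_idx, test_idx) pairs for expanding-window validation.

    Every training window starts at index 0 and ends right before its
    test window, so no future observation leaks into training.
    """
    origin = initial_train
    while origin + horizon <= n_obs:
        yield np.arange(0, origin), np.arange(origin, origin + horizon)
        origin += step


def rolling_origin_mae(y, forecaster, initial_train, horizon, step=1):
    """Average the MAE of forecaster(train, horizon) over all folds."""
    y = np.asarray(y, dtype=float)
    fold_errors = []
    for train_idx, test_idx in rolling_origin_splits(len(y), initial_train, horizon, step):
        forecast = np.asarray(forecaster(y[train_idx], horizon), dtype=float)
        fold_errors.append(np.mean(np.abs(y[test_idx] - forecast)))
    return float(np.mean(fold_errors))


# Hypothetical stand-in forecasters (assumptions, used instead of ARIMA/SARIMA)
def last_value_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return np.repeat(train[-1], horizon)


def seasonal_naive_forecast(train, horizon, season=12):
    """Repeat the last observed season for every future step."""
    repeats = int(np.ceil(horizon / season))
    return np.tile(train[-season:], repeats)[:horizon]


# Synthetic seasonal series (illustrative data only)
rng = np.random.default_rng(0)
t = np.arange(72)
y = 10 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=t.size)

splits = list(rolling_origin_splits(len(y), initial_train=48, horizon=6, step=6))
naive_mae = rolling_origin_mae(y, last_value_forecast, 48, 6, step=6)
seasonal_mae = rolling_origin_mae(y, seasonal_naive_forecast, 48, 6, step=6)

print(f"Folds: {len(splits)}")
print(f"Last-value MAE:     {naive_mae:.2f}")
print(f"Seasonal-naive MAE: {seasonal_mae:.2f}")
```

To evaluate an actual ARIMA model the same way, you could pass a callable such as `lambda train, h: ARIMA(train, order=(2, 1, 2)).fit().forecast(steps=h)` as the `forecaster`, at the cost of refitting the model once per fold.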

Another essential strategy is metric-based selection. Here, you fit candidate models (such as ARIMA, SARIMA, and Auto ARIMA) to your training data, generate forecasts for the held-out period, and then compare their accuracy using quantitative metrics. The most widely used metrics include Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). Both are expressed in the units of the series, and lower values indicate better forecasting performance; RMSE penalizes large individual errors more heavily than MAE. Comparing these values across models helps you select the one that generalizes best to unseen data.
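To make the two metrics concrete, here is a small NumPy sketch that computes them by hand. The helper names and the numbers are made up for illustration; the two forecasts are constructed to have the same MAE, so the difference in RMSE comes only from how the errors are distributed.

```python
import numpy as np


def mae(actual, forecast):
    """Mean Absolute Error: the average size of the errors."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(actual - forecast)))


def rmse(actual, forecast):
    """Root Mean Squared Error: square root of the average squared error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


# Illustrative values only
actual = [10, 12, 14, 16]
forecast_one_big_miss = [11, 12, 13, 20]      # absolute errors: 1, 0, 1, 4
forecast_even_misses = [11.5, 10.5, 15.5, 14.5]  # absolute errors: 1.5 each

print(mae(actual, forecast_one_big_miss))   # 1.5
print(mae(actual, forecast_even_misses))    # 1.5
print(rmse(actual, forecast_one_big_miss))  # about 2.12 (sqrt of 4.5)
print(rmse(actual, forecast_even_misses))   # 1.5
```

Both forecasts are equally good by MAE, but RMSE ranks the one with a single large miss as worse. Which behavior you want depends on how costly large errors are in your application.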

```python
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Suppress optimization logs and warnings
warnings.filterwarnings("ignore")

# Generate synthetic monthly data with seasonality
np.random.seed(42)
periods = 60
time = np.arange(periods)
seasonal = 10 + 3 * np.sin(2 * np.pi * time / 12)
trend = 0.3 * time
noise = np.random.normal(scale=2, size=periods)
data = seasonal + trend + noise
# "MS" (month start) is used because the "M" alias is deprecated in recent pandas
ts = pd.Series(data, index=pd.date_range("2020-01-01", periods=periods, freq="MS"))

# Split into train and test sets (first 48 months for training, last 12 for testing)
train = ts.iloc[:48]
test = ts.iloc[48:]

# Fit ARIMA model (no seasonal order)
arima_model = ARIMA(train, order=(2, 1, 2)).fit()
arima_forecast = arima_model.forecast(steps=len(test))

# Fit SARIMA model (with seasonal order, period 12)
sarima_model = SARIMAX(train, order=(2, 1, 2), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
sarima_forecast = sarima_model.forecast(steps=len(test))

# Evaluate forecasts on the held-out test set
arima_mae = mean_absolute_error(test, arima_forecast)
arima_rmse = np.sqrt(mean_squared_error(test, arima_forecast))
sarima_mae = mean_absolute_error(test, sarima_forecast)
sarima_rmse = np.sqrt(mean_squared_error(test, sarima_forecast))

# Compare visually
plt.figure(figsize=(10, 5))
plt.plot(ts, label="Actual")
plt.plot(test.index, arima_forecast,
         label=f"ARIMA Forecast (MAE: {arima_mae:.2f} RMSE: {arima_rmse:.2f})", color="orange")
plt.plot(test.index, sarima_forecast,
         label=f"SARIMA Forecast (MAE: {sarima_mae:.2f} RMSE: {sarima_rmse:.2f})", color="green")
plt.legend()
plt.title("ARIMA vs SARIMA Forecast Comparison")
plt.show()
```

After running the comparison, interpret the results by looking at the MAE and RMSE values for each model. The model with the lowest error metrics is generally preferred, especially if the difference is substantial. A small difference on a single train/test split can be due to chance, so treat it with caution. You should also consider the complexity of the model and whether it captures the underlying structure of the data, such as seasonality. If SARIMA achieves lower errors than ARIMA on data with seasonal patterns, it suggests that modeling seasonality improved forecast accuracy.
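One way to turn this reasoning into a rule is sketched below: keep every model whose RMSE is close to the best one, then prefer the simplest among them. This is only an illustration under stated assumptions. The function name `select_model`, the 5% tolerance, and all RMSE values are hypothetical placeholders, not results from the code above; `n_params` counts the AR, MA, seasonal, and variance parameters of each model.

```python
def select_model(results, tolerance=0.05):
    """Pick the simplest model whose RMSE is within `tolerance` of the best.

    `results` maps a model name to a dict with "rmse" and "n_params".
    The tolerance is an assumed threshold; tune it to your application.
    """
    best_rmse = min(r["rmse"] for r in results.values())
    close_enough = {
        name: r for name, r in results.items()
        if r["rmse"] <= best_rmse * (1 + tolerance)
    }
    # Fewest parameters first; break ties by lower RMSE
    return min(close_enough, key=lambda name: (close_enough[name]["n_params"],
                                               close_enough[name]["rmse"]))


# Placeholder numbers for illustration only (not measured results)
results = {
    "ARIMA(2,1,2)": {"rmse": 3.10, "n_params": 5},
    "SARIMA(2,1,2)(1,1,1,12)": {"rmse": 2.00, "n_params": 7},
    "SARIMA(1,1,1)(1,1,1,12)": {"rmse": 2.05, "n_params": 5},
}

chosen = select_model(results)
strict_choice = select_model(results, tolerance=0.0)
print(chosen)         # SARIMA(1,1,1)(1,1,1,12)
print(strict_choice)  # SARIMA(2,1,2)(1,1,1,12)
```

With the tolerance set to zero, the rule reduces to picking the lowest RMSE. With a small tolerance, the smaller seasonal model wins because its error is nearly as low and it has fewer parameters to estimate.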

Which criterion should you primarily use to select the best ARIMA-based forecasting model when comparing ARIMA, SARIMA, and Auto ARIMA on your time series data?
