Learn Autoregression | Stationary Models
Time Series Analysis

Autoregression

Let's move on to the autoregressive model.
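
For reference, an autoregressive model of order p is commonly written as shown below. The symbols c (a constant), \varphi_i (the coefficients) and \varepsilon_t (a noise term) follow standard textbook notation and are assumptions here, not notation taken from this page:

```latex
x_t = c + \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \dots + \varphi_p x_{t-p} + \varepsilon_t
```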

The formula is similar to the linear regression formula, which is where the name comes from. Instead of separate predictor variables, the past values of x itself are used as the inputs, and each past value gets its own coefficient.
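
To make the idea concrete, a one-step forecast can be sketched as a weighted sum of the most recent values. The function name ar_predict_next, the intercept and coefficient values, and the toy series below are all hypothetical and chosen only for illustration; a fitted model would estimate the parameters from data.

```python
def ar_predict_next(history, intercept, coefs):
    """One-step AR(p) forecast: intercept plus a weighted sum of the last p values."""
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # coefs[0] multiplies the most recent value, coefs[1] the one before it, etc.
    recent_first = history[::-1][:p]
    return intercept + sum(c * x for c, x in zip(coefs, recent_first))


# Hypothetical AR(2) parameters and toy data, for illustration only
series = [10.0, 12.0, 11.0, 13.0]
forecast = ar_predict_next(series, intercept=1.0, coefs=[0.5, 0.25])
print(forecast)  # 1.0 + 0.5 * 13.0 + 0.25 * 11.0 = 10.25
```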

With statsmodels, we can fit an autoregressive model using the AutoReg class:

python
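import matplotlib.pyplot as plt
from statsmodels.tsa.ar_model import AutoReg

# Train an autoregression model that uses the 3 previous values (lags=3)
model = AutoReg(df["value"], lags=3)
model_fit = model.fit()

# Make in-sample, one-step-ahead predictions (dynamic=False);
# the first 3 values are NaN because they lack a full set of lags
predictions = model_fit.predict(start=0, end=len(df)-1, dynamic=False)

# Plot the first 50 observations (default color) and predictions (red)
plt.plot(df["value"][:50])
plt.plot(predictions[:50], color='red')
plt.show()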

In the plot, the predictions of the autoregressive model follow the actual values more closely than those of the simple moving average.

Let's learn how to evaluate the results of the trained models. The error is measured with the root-mean-squared error (RMSE), which is the square root of the mean squared error. It can be computed with the functions sqrt() and mean_squared_error():

python
from sklearn.metrics import mean_squared_error
from math import sqrt

# Skip the first 3 rows: the model uses 3 lags, so they have no prediction
test_score = sqrt(mean_squared_error(df["value"][3:], predictions[3:]))
print("Test RMSE: %.3f" % test_score)

The error value for the previous model is calculated in the same way.

The smaller this value, the smaller the error, that is, the closer the predictions are to the observed data.
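
As a rough illustration of such a comparison, the sketch below computes the error of a simple moving-average baseline in plain Python. The helper names moving_average_predictions and rmse, the window size, and the toy values are hypothetical and not taken from the course; the same rmse helper could be applied to the autoregressive predictions to compare the two models.

```python
from math import sqrt


def moving_average_predictions(values, window):
    """Predict each point as the mean of the `window` values before it."""
    preds = [None] * len(values)  # the first `window` points have no prediction
    for t in range(window, len(values)):
        preds[t] = sum(values[t - window:t]) / window
    return preds


def rmse(actual, predicted):
    """Root-mean-squared error over the points that have a prediction."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    return sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


# Toy values for illustration only, not taken from the course dataset
values = [2.0, 4.0, 6.0, 8.0, 10.0]
preds = moving_average_predictions(values, window=2)
baseline_rmse = rmse(values, preds)
print("Baseline RMSE: %.3f" % baseline_rmse)
```

On a steadily rising series like this one, the moving average always lags behind the actual values, which is why its error stays well above zero.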

Task


Create an autoregressive model and train it on the dataset shampoo.csv.

  1. Create an autoregression model (AutoReg) with 6 lags for the "Sales" column of the df DataFrame.
  2. Fit the model to the data.
  3. Make predictions using the model. Start forecasting at the first row (the start parameter), and set the dynamic parameter to False.
  4. Visualize the results: plot the first 150 observations of the "Sales" column of the df DataFrame in the first call to the .plot() function and the first 150 predicted values in the second call.

Solution

# Importing libraries
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.ar_model import AutoReg
from datetime import datetime

# Parser for the datetime
def parser(x):
    return datetime.strptime("190"+x, "%Y-%m")

# Reading dataset (note: date_parser is deprecated since pandas 2.0)
df = pd.read_csv("https://codefinity-content-media.s3.eu-west-1.amazonaws.com/943e906e-4de6-4694-a1df-313ceed7cfe7/shampoo.csv", parse_dates=[0], index_col=0, date_parser=parser)
df.index = pd.DatetimeIndex(df.index.values, freq=df.index.inferred_freq)

# Train autoregression with 6 lags
model = AutoReg(df["Sales"], lags=6)
model_fit = model.fit()

# Make predictions
predictions = model_fit.predict(start=0, end=len(df)-1, dynamic=False)

# Visualize the results
plt.plot(df["Sales"][:150])
plt.plot(predictions[:150], color="red")
plt.show()


Section 4. Chapter 3
# Importing libraries
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.ar_model import AutoReg
from datetime import datetime

# Parser for the datetime
def parser(x):
    return datetime.strptime("190"+x, "%Y-%m")

# Reading dataset
df = pd.read_csv("https://codefinity-content-media.s3.eu-west-1.amazonaws.com/943e906e-4de6-4694-a1df-313ceed7cfe7/shampoo.csv", parse_dates=[0], index_col=0, date_parser=parser)
df.index = pd.DatetimeIndex(df.index.values, freq=df.index.inferred_freq)

# Train autoregression with 6 lags
model = ___(df["___"], lags=___)
model_fit = model.___()

# Make predictions
predictions = model_fit.___(start=___, end=len(df)-1, dynamic=___)

# Visualize the results
plt.plot(___[:150])
plt.plot(___[:150], color="red")
plt.show()