Mastering Stationarity in Time Series

The Key to Reliable Time Series Analysis

by Ruslan Shudra

Data Scientist

Jul 2024・6 min read

Introduction to Stationarity

Stationarity is a fundamental concept in time series analysis, referring to a statistical property where the mean, variance, and autocorrelation structure of the series remain constant over time. In simpler terms, a stationary time series is one whose statistical properties do not change when shifted in time.

Understanding stationarity is crucial because many time series methods assume it. ARIMA models, for example, difference the series until what remains can be treated as a stationary ARMA process. Fitting such models to data that is still non-stationary can produce unreliable and misleading results, so checking for stationarity, and transforming the data when it is absent, is usually one of the first steps in time series analysis.

Types of Stationarity

Strong (Strict) Stationarity

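A time series is strictly stationary if the joint probability distribution of any set of observations is unchanged when all of them are shifted in time by the same amount. This is a stronger requirement than a constant mean, variance, and autocorrelation: every aspect of the distribution, including higher moments, must be invariant under time shifts. Because this is hard to verify on real data, most practical work relies on the weaker definition below.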

Weak (Second-Order) Stationarity

Weak stationarity, also known as second-order or covariance stationarity, requires that the mean and variance of the series are constant over time, and the covariance between two time points depends only on the time lag between them, not on the actual time at which the covariance is computed.
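
In symbols, the two definitions can be written as follows. The notation is introduced here for illustration and is not part of the original text: X_t is the series, and mu, sigma^2, and gamma(h) stand for the mean, the variance, and the autocovariance at lag h.

```latex
% Strict stationarity: every finite-dimensional joint distribution is shift-invariant
(X_{t_1}, \dots, X_{t_k}) \overset{d}{=} (X_{t_1+h}, \dots, X_{t_k+h})
\quad \text{for all } k,\ t_1, \dots, t_k,\ h

% Weak stationarity: only the first two moments are constrained
\mathbb{E}[X_t] = \mu, \qquad
\operatorname{Var}(X_t) = \sigma^2 < \infty, \qquad
\operatorname{Cov}(X_t, X_{t+h}) = \gamma(h) \quad \text{for all } t,\ h
```

Strict stationarity implies weak stationarity whenever the variance is finite. The converse does not hold in general, although it does for Gaussian processes.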

Trend and Seasonal Stationarity

Trend stationarity describes a time series that fluctuates around a deterministic trend: once the trend is estimated and removed, what remains is stationary. Similarly, a series with a stable seasonal pattern can be made stationary by modeling and removing the seasonal component. Neither kind of series is stationary as observed; each needs a specific transformation first.
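
A minimal sketch of the trend-stationary case, using synthetic data: a straight line plus noise looks stationary once the fitted line is subtracted. The slope, intercept, noise level, and seed are made up for illustration, and np.polyfit is only one of several ways to estimate a linear trend.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(300)

# Trend-stationary series: deterministic linear trend plus stationary noise
noise = rng.normal(scale=1.0, size=t.size)
trend_stationary = 0.5 * t + 10.0 + noise

# Estimate the deterministic trend by least squares and subtract it
slope, intercept = np.polyfit(t, trend_stationary, deg=1)
residuals = trend_stationary - (slope * t + intercept)

# With the trend removed, the two halves should have similar means
first_half_mean = residuals[:150].mean()
second_half_mean = residuals[150:].mean()

print(f"estimated slope: {slope:.3f}")
print(f"residual mean, first half:  {first_half_mean:.3f}")
print(f"residual mean, second half: {second_half_mean:.3f}")
```

Comparing the halves is only a rough sanity check, not a substitute for the formal tests described in the next section.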

Testing for Stationarity

Common Statistical Tests

Several statistical tests can help determine if a time series is stationary.

The Augmented Dickey-Fuller (ADF) test checks for a unit root in the data, where the null hypothesis is that the series has a unit root (is non-stationary).

The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test reverses the roles: its null hypothesis is that the series is stationary, either around a constant level (regression='c' in statsmodels) or around a deterministic trend (regression='ct').

Interpreting the results of these tests involves comparing the test statistic to critical values or, equivalently, comparing the p-value to a chosen significance level. For the ADF test, if the test statistic is less than (more negative than) the critical value, the null hypothesis of a unit root is rejected, suggesting the series is stationary. For the KPSS test, if the test statistic is greater than the critical value, the null hypothesis of stationarity is rejected, indicating the series is non-stationary. Because the two tests have opposite null hypotheses, they are often run together.

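import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss

# `time_series` is assumed to be a pandas Series of observations.
# Placeholder data: a synthetic, strictly positive random walk. Replace it with your own series.
time_series = pd.Series(100 + np.random.default_rng(0).normal(size=200).cumsum())
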
# Augmented Dickey-Fuller (ADF) Test
adf_result = adfuller(time_series)
print('ADF Statistic:', adf_result[0])
print('p-value:', adf_result[1])
print('Critical Values:')
for key, value in adf_result[4].items():
    print(f'{key}: {value}')

# KPSS Test
kpss_result = kpss(time_series, regression='c')
print('\nKPSS Statistic:', kpss_result[0])
print('p-value:', kpss_result[1])
print('Critical Values:')
for key, value in kpss_result[3].items():
    print(f'{key}: {value}')
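
Since the two tests have opposite null hypotheses, their outcomes can be read together. The helper below is a hypothetical sketch of one common rule of thumb, not part of statsmodels; the function name, the labels it returns, and the 0.05 threshold are all assumptions made for illustration.

```python
def interpret_stationarity(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values into a rough verdict.

    ADF null hypothesis:  the series has a unit root (non-stationary).
    KPSS null hypothesis: the series is stationary.
    """
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        # The tests disagree: differencing is the usual next step
        return "possibly difference-stationary"
    # Neither null is rejected: detrending is the usual next step
    return "possibly trend-stationary"


# Example calls with made-up p-values
print(interpret_stationarity(0.01, 0.10))  # both tests agree on stationarity
print(interpret_stationarity(0.60, 0.01))  # both tests agree on non-stationarity
```

With the results computed above, the call would be interpret_stationarity(adf_result[1], kpss_result[1]). Note that statsmodels interpolates the KPSS p-value from a table, so it is reported only within the range 0.01 to 0.1.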

Visual Inspection Methods

Visual inspection is a helpful preliminary step in assessing stationarity. Plotting the time series data and checking for consistent mean and variance over time can give a quick indication of stationarity. Autocorrelation plots (ACF) can also be used; a stationary series will have autocorrelations that die off quickly, while non-stationary series will show slow decay in autocorrelation.


import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf


# Plotting the time series and autocorrelation
plt.figure(figsize=(12, 6))
plt.subplot(121)
plt.plot(time_series)
plt.title('Time Series')
plt.subplot(122)
plot_acf(time_series, ax=plt.gca())
plt.title('Autocorrelation')
plt.show()
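
The difference between fast and slow autocorrelation decay can also be checked numerically. The sketch below computes the sample autocorrelation by hand with NumPy for two synthetic series, white noise and a random walk; sample_acf is a hypothetical helper written for this example, and the seed and series length are arbitrary.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelation of a 1-D array for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[: x.size - lag], centered[lag:]) / denom
        for lag in range(max_lag + 1)
    ])


rng = np.random.default_rng(0)
white_noise = rng.normal(size=500)      # stationary
random_walk = np.cumsum(white_noise)    # non-stationary (unit root)

acf_noise = sample_acf(white_noise, max_lag=10)
acf_walk = sample_acf(random_walk, max_lag=10)

print("white noise ACF, lags 1-3:", np.round(acf_noise[1:4], 2))
print("random walk ACF, lags 1-3:", np.round(acf_walk[1:4], 2))
```

For the white noise series the autocorrelations beyond lag 0 should hover near zero, while for the random walk they should stay close to 1 across many lags.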

Transforming Non-Stationary Data

Differencing

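Differencing is a common technique for removing trends from a time series. It replaces each observation with its difference from the previous one, which can be done with the diff() method of a pandas Series. The first value has no predecessor, so it becomes missing and is dropped. If one round of differencing is not enough, it can be applied again.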

# Differencing example
differenced_series = time_series.diff().dropna()
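
To see what differencing does, it helps to apply it to toy series whose structure is known exactly. The sketch below uses made-up data: a linear trend, a quadratic trend, and a repeating pattern with an assumed period of 4.

```python
import numpy as np
import pandas as pd

t = np.arange(10)

# Linear trend: one round of differencing leaves a constant (the slope)
linear = pd.Series(3.0 * t + 5.0)
first_diff = linear.diff().dropna()

# Quadratic trend: differencing twice leaves a constant
quadratic = pd.Series(t.astype(float) ** 2)
second_diff = quadratic.diff().diff().dropna()

# Seasonal differencing: subtract the value from one full cycle earlier
seasonal = pd.Series(np.tile([1.0, 4.0, 2.0, 3.0], 3))
seasonal_diff = seasonal.diff(periods=4).dropna()

print("first difference of linear trend:     ", first_diff.tolist())
print("second difference of quadratic trend: ", second_diff.tolist())
print("seasonal difference of periodic series:", seasonal_diff.tolist())
```

Each round of differencing shortens the series by the number of periods differenced, which is why dropna() is applied afterwards.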

Detrending and Deseasonalizing

Detrending removes a deterministic trend from the data, while deseasonalizing removes seasonal components. These transformations can be achieved using methods such as linear regression or seasonal decomposition.

# Detrending example (linear regression)
from sklearn.linear_model import LinearRegression

X = np.arange(len(time_series)).reshape(-1, 1)
model = LinearRegression()
model.fit(X, time_series)
trend = model.predict(X)
detrended_series = time_series - trend

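# Deseasonalizing example (seasonal decomposition)
from statsmodels.tsa.seasonal import seasonal_decompose

# period=12 assumes monthly data with a yearly cycle; it is required unless the index carries a frequency
decomposition = seasonal_decompose(time_series, model='additive', period=12)
deseasonalized_series = time_series - decomposition.seasonal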

Log Transformations and Other Techniques

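Log transformations are useful for stabilizing variance in series that grow exponentially or whose fluctuations scale with the level of the series. Other power transformations, such as the square root or Box-Cox, serve the same purpose. These transformations stabilize variance but do not remove a trend on their own, so they are usually combined with differencing or detrending. Both the log and Box-Cox transformations require strictly positive values.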

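# Log transformation example (requires strictly positive values)
log_transformed_series = np.log(time_series)

# Box-Cox transformation example (requires strictly positive values)
from scipy.stats import boxcox

# boxcox returns the transformed data and the fitted lambda parameter
boxcox_transformed_series, fitted_lambda = boxcox(time_series)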
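
The effect of the log transformation is easiest to see on a noise-free exponential curve. The sketch below uses made-up values for the starting level and growth rate: the raw step sizes keep growing, while the steps of the logged series are constant.

```python
import numpy as np

t = np.arange(50)
growth_rate = 0.08  # hypothetical growth rate per step

# Pure exponential growth starting at 100
exponential = 100.0 * np.exp(growth_rate * t)

# Step sizes grow with the level in the raw series...
raw_steps = np.diff(exponential)

# ...but are constant after taking logs, because log turns growth into a straight line
log_steps = np.diff(np.log(exponential))

# The transformation is reversible with np.exp
recovered = np.exp(np.log(exponential))

print(f"raw steps: first {raw_steps[0]:.2f}, last {raw_steps[-1]:.2f}")
print(f"log steps: first {log_steps[0]:.4f}, last {log_steps[-1]:.4f}")
```

The log transformation corresponds to the Box-Cox transformation with lambda equal to 0. Forecasts made on a transformed scale need to be mapped back to the original scale, for example with np.exp for logs or scipy.special.inv_boxcox for Box-Cox.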

These techniques convert a non-stationary time series into a stationary one, enabling more reliable analysis and forecasting. Each snippet above illustrates a different transformation commonly used in practice; after applying any of them, rerun the stationarity tests to confirm the result.

FAQs

Q: How does stationarity impact time series analysis?
A: Stationarity means the statistical properties of the series stay consistent over time, so patterns estimated from past data remain valid for the future. Models built on ARMA components, including ARIMA after differencing, depend on this assumption.

Q: What are the consequences of analyzing non-stationary time series?
A: Analyzing non-stationary data can lead to unreliable statistical inferences and forecasts. Shared trends can produce spurious correlations between unrelated series, and trends or seasonal patterns can mask the underlying dynamics and lead to erroneous conclusions.

Q: How can I visually check for stationarity in a time series?
A: Plotting the time series data and observing trends, changes in variance, and autocorrelation patterns can provide initial insights into stationarity before applying formal statistical tests.

Q: Are there cases where stationarity is not necessary for time series analysis?
A: Yes. Some approaches, such as deep learning-based models, do not formally assume stationarity. In practice they still tend to benefit from preprocessing steps such as differencing or normalization, but a strictly stationary input is not required.
