Convert Non-Stationary Data to Stationary

Let's move on to processing non-stationary data. You have already seen predictive models that work with stationary data, but since most real-world data is non-stationary, you need ways to convert it.

There are many types of transformations, such as differencing, logarithmic transformation, proportional change, etc. The main idea behind all of them is to apply a function to each value of the time series to remove its dependence on time (that is, trends and seasonality).
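To make the three transformations named above concrete, here is a minimal sketch that applies each of them to a small made-up series. The series values and the column names are hypothetical and chosen only for illustration; they do not come from the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series that grows by 10% at each step.
series = pd.Series([100.0, 110.0, 121.0, 133.1], name="value")

transformed = pd.DataFrame({
    "value": series,
    # Difference: y_t - y_{t-1}
    "diff": series.diff(),
    # Natural logarithm; only defined for positive values.
    "log": np.log(series),
    # Proportional change: (y_t - y_{t-1}) / y_{t-1}
    "pct_change": series.pct_change(),
})

print(transformed)
```

On this series the differences keep growing, while the proportional change stays at about 0.1 for every step, which is why the choice of transformation depends on how the series grows.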

We will start with differencing, the transformation used in the ARIMA model. The principle is simple: the previous value is subtracted from the current one:

y'(t) = y(t) - y(t-1)

This stabilizes the mean of the time series by removing changes in its level, such as a trend. Let's implement the difference transformation using Python:

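# The first row has no previous value, so diff_1 is NaN there.
# (Chaining .dropna() before assigning has no effect: pandas realigns on the index.)
dataset["diff_1"] = dataset["Open"].diff(periods=1)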
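To see what differencing does on concrete numbers, here is a minimal, self-contained sketch. The five-row `dataset` is a hypothetical stand-in for the lesson's data, and `manual` and `diff_clean` are names introduced only for this example.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's `dataset` with an "Open" column.
dataset = pd.DataFrame({"Open": [10.0, 12.0, 15.0, 19.0, 24.0]})

# First-order difference: current value minus the previous one.
dataset["diff_1"] = dataset["Open"].diff(periods=1)

# Equivalent manual form: shift the column down by one row and subtract.
manual = dataset["Open"] - dataset["Open"].shift(1)

# The first row has no predecessor, so its difference is NaN.
# Keep the cleaned result as a separate Series instead of a column,
# because assigning it back to the DataFrame would realign it to the
# full index and bring the NaN back.
diff_clean = dataset["diff_1"].dropna()

print(dataset)
print(diff_clean.tolist())
```

Keeping the cleaned differences in their own Series is one possible design choice; another is to call `dataset.dropna()` on the whole frame once all derived columns have been added.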

Let's move on to the logarithmic transformation. While differencing stabilizes the mean, the logarithmic transformation stabilizes the variance of the time series. Its only limitation is that it works only with strictly positive values.

Below is the code for logarithmic transformation (log transformation):

import numpy as np

dataset["log"] = np.log(dataset["Open"])
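The sketch below illustrates both points about the log transformation: the positive-values restriction and the stabilizing effect. It assumes a hypothetical `dataset` whose "Open" values double at every step; the guard clause and the names `raw_steps`, `log_steps`, and `restored` are additions for this example only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: values that double at each step.
dataset = pd.DataFrame({"Open": [1.0, 2.0, 4.0, 8.0, 16.0]})

# The logarithm is undefined for zero and negative inputs, so check first.
if (dataset["Open"] <= 0).any():
    raise ValueError("log transformation requires strictly positive values")

dataset["log"] = np.log(dataset["Open"])

# Step sizes on the raw scale keep growing (1, 2, 4, 8) ...
raw_steps = dataset["Open"].diff().dropna()
# ... while on the log scale every step has the same size, log(2).
log_steps = dataset["log"].diff().dropna()

# np.exp reverses the transformation when original units are needed.
restored = np.exp(dataset["log"])

print(raw_steps.tolist())
print(log_steps.tolist())
```

If a series contains zeros, a shifted variant such as `np.log1p` is sometimes used instead, but that changes the transformation and is outside what this chapter covers.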

Task

Implement a difference transformation on the AirPassengers.csv dataset and output the mean of the "#Passengers" column before and after the transformation.

Read the AirPassengers.csv file.
Drop the "Month" column from the df DataFrame.
Calculate the mean value of the "#Passengers" column before changes.
Calculate the difference between each value of the "#Passengers" column and the previous one (periods=1), drop NA values, and calculate the mean of the resulting column.
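One way to approach these steps is sketched below. So that the example runs without the file, it reads a few rows laid out like AirPassengers.csv from an in-memory string; this `csv_text` sample is an illustrative stand-in, and with the real file you would call `pd.read_csv("AirPassengers.csv")` instead. The variable names are assumptions, not part of the original task.

```python
import io

import pandas as pd

# Illustrative stand-in for AirPassengers.csv (same two-column layout).
# With the real file: df = pd.read_csv("AirPassengers.csv")
csv_text = """Month,#Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
1949-05,121
"""
df = pd.read_csv(io.StringIO(csv_text))

# Drop the "Month" column.
df = df.drop(columns=["Month"])

# Mean of the column before the transformation.
mean_before = df["#Passengers"].mean()

# Difference with the previous value, then drop the leading NaN.
diff = df["#Passengers"].diff(periods=1).dropna()

# Mean of the differenced values.
mean_after = diff.mean()

print("Mean before:", mean_before)
print("Mean after:", mean_after)
```

On this five-row sample the mean drops from 122.4 to 2.25; the numbers for the full file will differ.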
