
Handling Missing Values in Temporal Features


When you create lag or rolling window features from time series data, missing values appear at the beginning of the new columns. A lag feature shifts the original series forward in time, so each row holds the value from an earlier row; a lag of k leaves the first k rows with no earlier value to draw on. Similarly, rolling statistics such as moving averages need a full window of observations, so with the pandas defaults a window of size w leaves the first w - 1 rows missing. These gaps are a natural result of the way temporal features are constructed.

```python
import pandas as pd

# Example time series data
data = {
    "date": pd.date_range(start="2023-01-01", periods=6, freq="D"),
    "value": [10, 12, 13, 15, 14, 16]
}
df = pd.DataFrame(data)
df.set_index("date", inplace=True)

# Create a lag feature and a 3-day rolling mean
df["lag_1"] = df["value"].shift(1)
df["rolling_3"] = df["value"].rolling(window=3).mean()

# Handling missing values: Option 1 - Drop rows with missing values
dropped_df = df.dropna()

# Handling missing values: Option 2 - Fill missing values with a constant (e.g., 0)
filled_df = df.fillna(0)

print("Original DataFrame with missing values:")
print(df)
print("\nAfter dropping missing values:")
print(dropped_df)
print("\nAfter filling missing values with 0:")
print(filled_df)
```
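To make the location of the gaps explicit, the following is a minimal sketch that counts the missing values each feature introduces. It reuses the toy series from the example above; the lag of 2 and the names `lag`, `window`, `features`, `nan_counts`, and `first_complete` are assumptions chosen for illustration, not part of the lesson.

```python
import pandas as pd

# Same toy series as in the example above
values = pd.Series(
    [10, 12, 13, 15, 14, 16],
    index=pd.date_range(start="2023-01-01", periods=6, freq="D"),
    name="value",
)

lag = 2     # hypothetical lag, chosen for illustration
window = 3  # same window size as the example above

features = pd.DataFrame({
    "value": values,
    f"lag_{lag}": values.shift(lag),
    f"rolling_{window}": values.rolling(window=window).mean(),
})

# Number of missing values per column:
# a lag of k yields k leading NaNs, a window of w yields w - 1
nan_counts = features.isna().sum()
print(nan_counts)

# First date on which every feature is available
first_complete = features.dropna().index[0]
print(first_complete)
```

Checking these counts before training is a cheap way to see how many rows `dropna()` would discard for a given choice of lag and window.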

How does your choice between dropping rows with missing values and filling them with a constant (like 0 or the mean) affect the quality and interpretation of your temporal features?
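One way to explore this question is to compare the strategies side by side. The sketch below is illustrative only: it uses the same toy data with a single lag feature, and the `summary` table and its row labels are hypothetical names introduced here.

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [10, 12, 13, 15, 14, 16]},
    index=pd.date_range(start="2023-01-01", periods=6, freq="D"),
)
df["lag_1"] = df["value"].shift(1)

# Three ways of handling the leading NaN in lag_1
dropped = df.dropna()              # lose the first row
zero_filled = df.fillna(0)         # insert an artificial 0
mean_filled = df.fillna(df.mean()) # insert the column mean

# Compare row counts and basic statistics of the lag feature
strategies = {"drop": dropped, "fill_0": zero_filled, "fill_mean": mean_filled}
summary = pd.DataFrame({
    "rows": {name: len(frame) for name, frame in strategies.items()},
    "lag_1_mean": {name: frame["lag_1"].mean() for name, frame in strategies.items()},
    "lag_1_min": {name: frame["lag_1"].min() for name, frame in strategies.items()},
})
print(summary)
```

On this data, dropping keeps 5 of the 6 rows and leaves the feature's statistics untouched. Filling with 0 keeps every row but introduces a value that never occurred in the series, pulling the mean of `lag_1` from 12.8 down to about 10.67. Filling with the mean preserves the average, but the column mean is computed from later observations, which may leak future information in a forecasting setting.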
