Handling Missing Values in Temporal Features
When you create lag or rolling window features from time series data, missing values appear at the beginning of the new columns. A lag feature moves each value to a later row, so a lag of k steps leaves the first k rows with no earlier observation to fill them. Rolling statistics such as moving averages need a full window of prior data points before they can produce a value, so with pandas defaults a window of size w leaves the first w - 1 rows empty. These gaps are a natural result of how temporal features are constructed, not a sign of bad data.
    import pandas as pd

    # Example time series data
    data = {
        "date": pd.date_range(start="2023-01-01", periods=6, freq="D"),
        "value": [10, 12, 13, 15, 14, 16]
    }
    df = pd.DataFrame(data)
    df.set_index("date", inplace=True)

    # Create a lag feature and a 3-day rolling mean
    df["lag_1"] = df["value"].shift(1)
    df["rolling_3"] = df["value"].rolling(window=3).mean()

    # Handling missing values: Option 1 - Drop rows with missing values
    dropped_df = df.dropna()

    # Handling missing values: Option 2 - Fill missing values with a constant (e.g., 0)
    filled_df = df.fillna(0)

    print("Original DataFrame with missing values:")
    print(df)
    print("\nAfter dropping missing values:")
    print(dropped_df)
    print("\nAfter filling missing values with 0:")
    print(filled_df)
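Dropping rows and filling with a constant are not the only choices. The sketch below is an illustration rather than part of the original lesson: it rebuilds the same six-day series, counts the gaps each feature introduces, and then tries two alternatives that are assumptions on our part: `min_periods=1`, which lets the rolling mean use a partial window, and `bfill()`, which copies the next available value upward. The column names `rolling_3_partial` and `lag_1_bfill` are hypothetical.

```python
import pandas as pd

# Same six-day series as in the example above
df = pd.DataFrame(
    {"value": [10, 12, 13, 15, 14, 16]},
    index=pd.date_range(start="2023-01-01", periods=6, freq="D"),
)

# Build the features and count the gaps each one introduces
df["lag_1"] = df["value"].shift(1)
df["rolling_3"] = df["value"].rolling(window=3).mean()
nan_counts = df.isna().sum()

# Alternative 1 (assumption): allow partial windows so early rows get a value
df["rolling_3_partial"] = df["value"].rolling(window=3, min_periods=1).mean()

# Alternative 2 (assumption): back-fill the lag column with the next known value
df["lag_1_bfill"] = df["lag_1"].bfill()

print("Missing values per column:")
print(nan_counts)
print("\nFeatures with alternative gap handling:")
print(df)
```

Partial windows keep every row but make the earliest averages noisier, since they rest on fewer points. Back-filling takes a value from a later row, so it is probably best reserved for cases where that cannot leak future information into a model.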
Section 1. Chapter 7