Feature Engineering for Financial Data
Feature engineering is the process of transforming raw data into meaningful inputs that enhance the performance of machine learning models. In financial applications, effective feature engineering can greatly improve the accuracy of predictions, such as forecasting stock prices, assessing credit risk, or detecting fraud. By extracting relevant information from financial data—like price histories or transaction records—you can provide your models with the context they need to make better decisions. This often involves creating features that capture trends, volatility, or relationships between different assets.
import pandas as pd

# Example price data for a financial asset
prices = pd.Series([100, 102, 101, 105, 107, 110, 108, 112, 115, 117])

# Calculate a 3-day moving average
moving_average_3 = prices.rolling(window=3).mean()

# Calculate volatility as the rolling standard deviation over 3 days
volatility_3 = prices.rolling(window=3).std()

# Combine into a DataFrame for clarity
features = pd.DataFrame({
    "Price": prices,
    "MA_3": moving_average_3,
    "Volatility_3": volatility_3
})

print(features)
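Trend and lag features can be built from the same price series with a few more pandas calls. The sketch below is illustrative: the column names `Return`, `Lag_1`, and `Momentum_3` are hypothetical labels, not part of the example above.

```python
import pandas as pd

# Same example price series as above
prices = pd.Series([100, 102, 101, 105, 107, 110, 108, 112, 115, 117])

# Daily return: percentage change from the previous day
returns = prices.pct_change()

# Lag feature: yesterday's price, shifted forward by one step
lag_1 = prices.shift(1)

# Momentum: price change over the last 3 days
momentum_3 = prices.diff(3)

lag_features = pd.DataFrame({
    "Return": returns,
    "Lag_1": lag_1,
    "Momentum_3": momentum_3
})
print(lag_features)
```

Note that `pct_change`, `shift`, and `diff` all produce NaN values at the start of the series, just like `rolling` does; in practice you would drop or impute those rows before training.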
When working with financial data, not all features are equally useful. Including irrelevant or redundant features can confuse your model, leading to overfitting or poor generalization. Feature selection is the process of identifying which features actually help your model make accurate predictions. In financial modeling, this might mean choosing between technical indicators, price lags, or external variables. Ignoring this step can result in models that are unnecessarily complex or even less accurate than simpler alternatives.
from sklearn.feature_selection import SelectKBest, f_regression
import numpy as np

# Hardcoded example features: [moving average, volatility, price momentum, random noise]
X = np.array([
    [101, 1.2, 0.5, 0.9],
    [103, 1.5, 1.0, 0.2],
    [102, 1.1, -0.3, 1.1],
    [106, 1.8, 0.8, 0.5],
    [108, 2.0, 1.2, 0.7]
])

# Target variable: next day price
y = np.array([103, 104, 105, 107, 110])

# Select top 2 features most correlated with target
selector = SelectKBest(score_func=f_regression, k=2)
X_selected = selector.fit_transform(X, y)

print("Selected features shape:", X_selected.shape)
print("Feature scores:", selector.scores_)
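The shape alone does not tell you which columns survived. `SelectKBest` exposes `get_support()`, a boolean mask over the original columns, which you can zip with your own labels. As a sketch using the same hardcoded data, with illustrative names for the four columns:

```python
from sklearn.feature_selection import SelectKBest, f_regression
import numpy as np

# Illustrative labels for the four hardcoded columns
feature_names = ["moving_average", "volatility", "momentum", "noise"]

X = np.array([
    [101, 1.2, 0.5, 0.9],
    [103, 1.5, 1.0, 0.2],
    [102, 1.1, -0.3, 1.1],
    [106, 1.8, 0.8, 0.5],
    [108, 2.0, 1.2, 0.7]
])
y = np.array([103, 104, 105, 107, 110])

selector = SelectKBest(score_func=f_regression, k=2)
selector.fit(X, y)

# get_support() returns a boolean mask marking the kept columns
mask = selector.get_support()
selected = [name for name, keep in zip(feature_names, mask) if keep]
print("Selected features:", selected)
```

On this toy data the price-level and volatility columns track the target most closely, so they score highest; the noise column, as expected, contributes little.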
1. What is feature engineering?
2. How can moving averages serve as features in financial models?
3. Why is it important to avoid irrelevant features in a model?