Backtesting Guide

Backtesting validates forecast accuracy by testing models on historical data. This guide covers rolling-origin backtests, performance metrics, and parameter optimization.


Key Concepts

What is Backtesting?

Backtesting answers: "How well would this forecast method have performed on past data?"

Evaluating a model on the same data it was fitted to hides overfitting. Instead, backtesting:

  1. Picks historical "anchor" points
  2. At each anchor, generates a forecast using only data available at that time
  3. Compares the forecast to what actually happened
  4. Aggregates error metrics across all test points
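
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not mtdata's implementation: `naive_forecast` and `backtest` are hypothetical names, and the naive method stands in for any forecaster.

```python
from statistics import mean

def naive_forecast(history, horizon):
    """Repeat the last observed value (a stand-in for any forecast method)."""
    return [history[-1]] * horizon

def backtest(series, anchors, horizon, forecaster=naive_forecast):
    """Average MAE over all anchors; each forecast sees only series[:anchor]."""
    errors = []
    for anchor in anchors:
        history = series[:anchor]                 # data available at the anchor
        actual = series[anchor:anchor + horizon]  # what actually happened
        forecast = forecaster(history, horizon)
        errors.append(mean(abs(f - a) for f, a in zip(forecast, actual)))
    return mean(errors)                           # aggregate across test points

series = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
print(backtest(series, anchors=[3, 5], horizon=2))
```

The key property is the slice `series[:anchor]`: nothing at or after the anchor reaches the forecaster.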

Rolling-Origin Backtest

The standard backtesting approach in mtdata:

Timeline: [----history----][forecast horizon]
                          ^
                       anchor

Parameters:

  • steps: Number of anchor points to test
  • spacing: Bars between anchor points
  • horizon: How far ahead each forecast predicts

Example: steps=20, spacing=10, horizon=12 creates 20 test points, each 10 bars apart, each forecasting 12 bars ahead.
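
To make the example concrete, the sketch below lays out anchor positions for a series of 1000 bars. The placement rule is an assumption (the newest anchor sits `horizon` bars before the end, and earlier anchors step back by `spacing`); mtdata's exact placement may differ, and `anchor_indices` is a hypothetical helper.

```python
def anchor_indices(n_bars, steps, spacing, horizon):
    """Return anchor positions, oldest first.

    Assumption: the newest anchor leaves exactly `horizon` bars after it so
    the forecast can be fully scored; earlier anchors step back by `spacing`.
    """
    last = n_bars - horizon
    anchors = [last - i * spacing for i in range(steps)]
    return sorted(a for a in anchors if a > 0)  # drop anchors with no history

anchors = anchor_indices(n_bars=1000, steps=20, spacing=10, horizon=12)
print(len(anchors), anchors[0], anchors[-1])
```

Under this rule the backtest needs at least (steps - 1) * spacing + horizon bars, plus whatever history the method requires before the oldest anchor.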


Quick Start

Compare Forecasting Methods

mtdata-cli forecast_backtest_run EURUSD --timeframe H1 --horizon 12 \
  --methods "theta sf_autoarima analog" --steps 20 --spacing 10

Single Method with Custom Parameters

mtdata-cli forecast_backtest_run EURUSD --timeframe H1 --horizon 12 \
  --methods theta --params "alpha=0.3" --steps 30

Volatility Backtest

mtdata-cli forecast_backtest_run EURUSD --timeframe H1 --horizon 12 \
  --quantity volatility --methods "ewma parkinson garch" --steps 20

Command Reference

mtdata-cli forecast_backtest_run <SYMBOL> [OPTIONS]

Core Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| symbol | (required) | Trading symbol (e.g., EURUSD) |
| --timeframe | H1 | Candle timeframe |
| --horizon | 12 | Bars to forecast at each anchor |
| --steps | 5 | Number of test anchors |
| --spacing | 20 | Bars between anchors |
| --methods | auto | Space or comma-separated method names |

Method Parameters

| Parameter | Description |
| --- | --- |
| --params | Parameters applied to all methods (JSON or k=v) |
| --params-per-method | Per-method parameters: {"theta": {"seasonality": 24}} |

Example with per-method params:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 \
  --methods "theta arima" \
  --params-per-method '{"theta": {"alpha": 0.3}, "arima": {"p": 2, "d": 1, "q": 2}}'

Quantity

| Parameter | Options | Description |
| --- | --- | --- |
| --quantity | price, return, volatility | What to forecast |

Notes:

  • return uses log returns (ln(close_t / close_{t-1})), which is often more stationary than prices.
  • volatility backtests compare predicted volatility vs realized volatility; use volatility methods like ewma, garch, har_rv.
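
As a small illustration of the log-return formula in the first note, the sketch below computes ln(close_t / close_{t-1}) for a short series. The prices are made up and `log_returns` is a hypothetical helper, not an mtdata function.

```python
import math

def log_returns(closes):
    """ln(close_t / close_{t-1}) for each consecutive pair of closes."""
    return [math.log(c1 / c0) for c0, c1 in zip(closes, closes[1:])]

closes = [1.0500, 1.0510, 1.0495, 1.0520]  # illustrative prices
rets = log_returns(closes)

# Log returns are additive: their sum equals ln(last / first).
total = sum(rets)
print(rets, total)
```

Additivity is one reason log returns are convenient for multi-bar horizons: the return over the whole horizon is the sum of the per-bar returns.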

Examples:

# Forecast returns instead of prices
mtdata-cli forecast_backtest_run EURUSD --quantity return

# Backtest volatility methods
mtdata-cli forecast_backtest_run EURUSD --quantity volatility --methods "ewma garch"

Trade Simulation

| Parameter | Default | Description |
| --- | --- | --- |
| --slippage-bps | 0.0 | Transaction cost in basis points (1 bp = 0.01%) |
| --trade-threshold | 0.0 | Minimum expected return to trigger a trade |

Example with trading costs:

# Simulate 2 bps slippage per side (4 bps round-trip)
mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods theta \
  --slippage-bps 2 --trade-threshold 0.0005
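
One plausible reading of how these two options interact is sketched below. This is assumed logic for illustration, not mtdata's code: the position follows the sign of the expected return once it clears the threshold, and slippage is charged once on entry and once on exit. `simulate_trade` is a hypothetical name.

```python
def simulate_trade(entry, exit_, expected_return,
                   slippage_bps=0.0, trade_threshold=0.0):
    """Return (position, net_return) for one anchor under the assumed rules."""
    if expected_return > trade_threshold:
        position, direction = "long", 1
    elif expected_return < -trade_threshold:
        position, direction = "short", -1
    else:
        return "flat", 0.0                      # signal too weak, no trade
    gross = direction * (exit_ / entry - 1.0)   # simple return of the move
    cost = 2 * slippage_bps / 10_000            # per side, entry plus exit
    return position, gross - cost

position, net = simulate_trade(1.0538, 1.0552, expected_return=0.00094,
                               slippage_bps=2, trade_threshold=0.0005)
print(position, net)
```

With 2 bps per side, a trade must move more than 4 bps in the forecast direction before it counts as profitable, which is why a non-zero threshold usually accompanies non-zero slippage.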

Preprocessing Options

| Parameter | Description |
| --- | --- |
| --denoise | Denoising method (e.g., ema, kalman) |
| --denoise-params | Denoising parameters |
| --features | Feature engineering spec |
| --dimred-method | Dimensionality reduction (e.g., pca) |
| --dimred-params | Dim reduction parameters |

Dimred methods supported by the forecasting pipeline: pca, tsne, selectkbest (requires scikit-learn).

Tip: for forecast_backtest_run, pass dimred params as JSON:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods mlf_lightgbm \
  --features '{"include":["close","volume"]}' \
  --dimred-method pca --dimred-params '{"n_components":5}'

Example with denoising:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods theta \
  --denoise ema --denoise-params "alpha=0.2"
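
For intuition about what an EMA denoiser with alpha=0.2 does, here is a textbook exponential moving average. Whether mtdata's `ema` option uses exactly this recursion and this meaning of `alpha` is an assumption.

```python
def ema(values, alpha):
    """Exponential moving average: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [values[0]]                 # seed with the first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

print(ema([1.0, 2.0, 3.0], alpha=0.2))
```

The recursion is causal: each smoothed value depends only on current and past bars, so applying it before a backtest does not leak future data into the history. A smaller alpha smooths more but lags more.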

Understanding Output

Aggregate Metrics

{
  "results": {
    "theta": {
      "success": true,
      "avg_mae": 0.00142,
      "avg_rmse": 0.00186,
      "avg_directional_accuracy": 0.583,
      "win_rate": 0.625,
      "successful_tests": 20,
      "num_tests": 20
    }
  }
}

| Metric | Description | Good Value |
| --- | --- | --- |
| avg_mae | Mean Absolute Error (average) | Lower is better |
| avg_rmse | Root Mean Squared Error (average) | Lower is better |
| avg_directional_accuracy | Fraction of correct direction predictions | > 0.55 |
| win_rate | Fraction of profitable trades | > 0.50 |
| successful_tests | Tests that completed without error | = num_tests |
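
The error metrics can be reproduced by hand for a single anchor. The sketch below uses the standard MAE and RMSE definitions; the directional-accuracy rule (matching the sign of bar-to-bar moves) is an assumption about how mtdata scores direction.

```python
import math

def mae(forecast, actual):
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

def rmse(forecast, actual):
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))

def directional_accuracy(forecast, actual):
    """Share of bar-to-bar moves where forecast and actual share a sign (assumed rule)."""
    hits = sum(
        (forecast[i] - forecast[i - 1]) * (actual[i] - actual[i - 1]) > 0
        for i in range(1, len(actual))
    )
    return hits / (len(actual) - 1)

forecast = [1.0, 2.0, 3.0, 4.0]   # toy values, not market data
actual = [1.0, 3.0, 2.0, 6.0]
print(mae(forecast, actual), rmse(forecast, actual),
      directional_accuracy(forecast, actual))
```

The avg_ prefix in the output then corresponds to averaging these per-anchor values across all successful tests.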

Trading Performance Metrics

When --slippage-bps or --trade-threshold is set:

{
  "metrics": {
    "avg_return_per_trade": 0.00082,
    "win_rate": 0.625,
    "sharpe_ratio": 1.45,
    "max_drawdown": 0.034,
    "calmar_ratio": 2.12,
    "cumulative_return": 0.0164,
    "annual_return": 0.087,
    "num_trades": 20,
    "trades_per_year": 365
  }
}

| Metric | Description | Good Value |
| --- | --- | --- |
| sharpe_ratio | Risk-adjusted return | > 1.0 |
| max_drawdown | Largest peak-to-trough decline | < 0.10 (10%) |
| calmar_ratio | Annual return / max drawdown | > 1.0 |
| cumulative_return | Total return over test period | > 0 |
| win_rate | Fraction of profitable trades | > 0.50 |
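
The sketch below shows one conventional way to derive these figures from a list of per-trade returns. The compounding and annualization choices are assumptions and may not match mtdata's exact formulas; `trading_metrics` is a hypothetical helper.

```python
import math
from statistics import mean, stdev

def trading_metrics(trade_returns, trades_per_year):
    """Conventional definitions; mtdata's conventions may differ."""
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r                           # compound each trade
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)
    n = len(trade_returns)
    cumulative = equity - 1.0
    annual = (1.0 + cumulative) ** (trades_per_year / n) - 1.0
    sd = stdev(trade_returns)
    return {
        "win_rate": sum(r > 0 for r in trade_returns) / n,
        "cumulative_return": cumulative,
        "max_drawdown": max_dd,
        "annual_return": annual,
        "sharpe_ratio": mean(trade_returns) / sd * math.sqrt(trades_per_year) if sd > 0 else float("nan"),
        "calmar_ratio": annual / max_dd if max_dd > 0 else float("inf"),
    }

m = trading_metrics([0.10, -0.05, 0.02], trades_per_year=12)  # toy returns
print(m)
```

With only a handful of trades, the annualized figures (annual_return, sharpe_ratio, calmar_ratio) are extrapolations and should be read with caution.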

Per-Anchor Details

Add --json to see individual test results:

{
  "details": [
    {
      "anchor": "2025-12-15 14:00",
      "success": true,
      "mae": 0.00128,
      "rmse": 0.00165,
      "directional_accuracy": 0.636,
      "forecast": [1.0542, 1.0545, ...],
      "actual": [1.0540, 1.0548, ...],
      "entry_price": 1.0538,
      "exit_price": 1.0552,
      "expected_return": 0.00094,
      "position": "long",
      "trade_return": 0.00133
    }
  ]
}

Method Comparison

Default Methods

If --methods is not specified, the backtest uses available classical methods:

  • naive, drift, seasonal_naive, theta, fourier_ols
  • Plus sf_autoarima, sf_theta if statsforecast is installed

Comparing Categories

Fast baselines:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 \
  --methods "naive drift theta seasonal_naive" --steps 30

Statistical models:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 \
  --methods "sf_autoarima sf_autoets sf_theta" --steps 30

ML models:

mtdata-cli forecast_backtest_run EURUSD --horizon 12 \
  --methods "mlf_lightgbm mlf_rf" --steps 20

Foundation models:

mtdata-cli forecast_backtest_run EURUSD --horizon 24 \
  --methods "chronos2 chronos_bolt" --steps 15
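
When comparing many methods, it can help to rank the output programmatically. The sketch below assumes the JSON output has the "results" shape shown under Aggregate Metrics; the naive and arima entries are made-up illustration values, and `rank_methods` is a hypothetical helper.

```python
import json

def rank_methods(payload, metric="avg_rmse", lower_is_better=True):
    """Sort successful methods by one metric, best first."""
    rows = [
        (name, result[metric])
        for name, result in payload["results"].items()
        if result.get("success") and metric in result
    ]
    return sorted(rows, key=lambda row: row[1], reverse=not lower_is_better)

# Illustrative payload; only the theta numbers come from the sample output above.
sample = json.loads("""
{"results": {
  "theta": {"success": true, "avg_rmse": 0.00186},
  "naive": {"success": true, "avg_rmse": 0.00210},
  "arima": {"success": false}
}}
""")

for name, value in rank_methods(sample):
    print(f"{name}: {value}")
```

Failed methods are skipped rather than ranked last, so a method missing from the ranking is worth checking against successful_tests.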

Parameter Optimization

Genetic Search (forecast_tune_genetic)

Automatically find optimal parameters for a forecasting method:

mtdata-cli forecast_tune_genetic EURUSD --timeframe H1 --method theta \
  --horizon 12 --steps 20 --spacing 10 \
  --metric avg_rmse --mode min \
  --population 20 --generations 10

Genetic Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| --method | (required) | Method to optimize |
| --metric | avg_rmse | Metric to optimize |
| --mode | min | min to minimize, max to maximize |
| --population | 12 | Population size per generation |
| --generations | 10 | Number of generations |
| --crossover-rate | 0.6 | Probability of crossover |
| --mutation-rate | 0.3 | Probability of mutation |
| --seed | None | Random seed for reproducibility |
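
To show how these knobs fit together, here is a deliberately small genetic search over a single float parameter. It is a generic sketch, not mtdata's algorithm: the selection, crossover, and mutation rules are assumptions, and the toy objective stands in for a full backtest metric such as avg_rmse.

```python
import random

def genetic_search(objective, low, high, population=12, generations=10,
                   crossover_rate=0.6, mutation_rate=0.3, mode="min", seed=None):
    """Evolve one float parameter within [low, high]; returns the best value seen."""
    rng = random.Random(seed)                      # seeded for reproducibility
    sign = 1 if mode == "min" else -1
    score = lambda x: sign * objective(x)
    pop = [rng.uniform(low, high) for _ in range(population)]
    best = min(pop, key=score)
    for _ in range(generations):
        ranked = sorted(pop, key=score)
        parents = ranked[: max(2, population // 2)]  # keep the better half
        children = [ranked[0]]                       # elitism: keep the best
        while len(children) < population:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2 if rng.random() < crossover_rate else a
            if rng.random() < mutation_rate:
                child += rng.gauss(0, 0.1 * (high - low))
            children.append(min(high, max(low, child)))  # clamp to bounds
        pop = children
        best = min(pop + [best], key=score)
    return best

toy_objective = lambda alpha: (alpha - 0.25) ** 2  # stand-in for a backtest metric
print(genetic_search(toy_objective, 0.05, 0.5, seed=1))
```

Each objective call corresponds to one full backtest in practice, so the cost grows roughly with population × generations × steps.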

Available Metrics

| Metric | Mode | Description |
| --- | --- | --- |
| avg_mae | min | Minimize mean absolute error |
| avg_rmse | min | Minimize root mean squared error |
| avg_directional_accuracy | max | Maximize direction accuracy |
| win_rate | max | Maximize profitable trades |
| sharpe_ratio | max | Maximize risk-adjusted return |
| calmar_ratio | max | Maximize return/drawdown ratio |

Custom Search Space

Define which parameters to search:

mtdata-cli forecast_tune_genetic EURUSD --method theta \
  --search-space '{"seasonality": {"type": "int", "min": 12, "max": 48}}'

Search space format:

{
  "param_name": {
    "type": "int" | "float" | "categorical",
    "min": 0,
    "max": 100,
    "log": false,          // For float: use log scale
    "choices": [...]       // For categorical
  }
}
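
The sketch below shows how a search space in this format could be turned into one random candidate. It follows the fields listed above (type, min, max, log, choices), but the sampling rules themselves are assumptions and `sample_params` is a hypothetical helper.

```python
import math
import random

def sample_params(space, rng):
    """Draw one candidate parameter set from a search-space dict."""
    params = {}
    for name, spec in space.items():
        kind = spec["type"]
        if kind == "int":
            params[name] = rng.randint(spec["min"], spec["max"])  # inclusive bounds
        elif kind == "float":
            if spec.get("log"):
                # Uniform in log space, so each order of magnitude is equally likely.
                lo, hi = math.log(spec["min"]), math.log(spec["max"])
                params[name] = math.exp(rng.uniform(lo, hi))
            else:
                params[name] = rng.uniform(spec["min"], spec["max"])
        elif kind == "categorical":
            params[name] = rng.choice(spec["choices"])
        else:
            raise ValueError(f"unknown parameter type: {kind}")
    return params

space = {
    "seasonality": {"type": "int", "min": 12, "max": 48},
    "learning_rate": {"type": "float", "min": 0.001, "max": 0.1, "log": True},
    "trend": {"type": "categorical", "choices": [True, False]},
}
print(sample_params(space, random.Random(0)))
```

A log scale suits parameters such as learning rates, where the difference between 0.001 and 0.01 matters as much as the difference between 0.01 and 0.1.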

Default Search Spaces

Each method has sensible defaults. Examples:

| Method | Parameters Searched |
| --- | --- |
| theta | alpha (0.05-0.5) |
| arima | p (0-3), d (0-2), q (0-3) |
| fourier_ols | m (8-96), K (1-6), trend (true/false) |
| sf_autoarima | seasonality, stepwise, d, D |
| mlf_lightgbm | n_estimators, learning_rate, num_leaves, max_depth |

Practical Examples

Example 1: Find Best Method for Scalping

# Short horizon, tight spacing
mtdata-cli forecast_backtest_run EURUSD --timeframe M5 --horizon 6 \
  --methods "naive theta fourier_ols sf_autoarima" \
  --steps 50 --spacing 12 \
  --slippage-bps 1 --trade-threshold 0.0003

What to look for:

  • Highest win_rate with positive avg_return_per_trade
  • Low max_drawdown
  • sharpe_ratio > 1.0

Example 2: Optimize Theta for Swing Trading

# Step 1: Find optimal alpha
mtdata-cli forecast_tune_genetic EURUSD --timeframe H4 --method theta \
  --horizon 48 --steps 30 --spacing 24 \
  --metric sharpe_ratio --mode max \
  --population 20 --generations 15

# Step 2: Backtest with the tuned params (alpha=0.25 is illustrative; use the value found in Step 1)
mtdata-cli forecast_backtest_run EURUSD --timeframe H4 --horizon 48 \
  --methods theta --params "alpha=0.25" \
  --steps 50 --slippage-bps 2

Example 3: Compare Volatility Methods

mtdata-cli forecast_backtest_run EURUSD --timeframe H1 --horizon 12 \
  --quantity volatility \
  --methods "ewma parkinson garch har_rv" \
  --steps 30 --spacing 24

Output interpretation:

  • forecast_sigma: Predicted volatility
  • realized_sigma: Actual volatility that occurred
  • mae: Error between forecast and realized

Example 4: Robust Testing with Denoising

# Test if denoising improves accuracy
mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods theta \
  --steps 30 --denoise ema --denoise-params "alpha=0.3"

# Compare to non-denoised
mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods theta \
  --steps 30

Example 5: Walk-Forward Optimization

Simulate real-world model updates:

# Period 1: Optimize on first 6 months
mtdata-cli forecast_tune_genetic EURUSD --method theta --horizon 12 \
  --steps 50 --spacing 24 --metric avg_rmse

# Record best params, then test on next 3 months with those params
mtdata-cli forecast_backtest_run EURUSD --horizon 12 --methods theta \
  --params "seasonality=24" --steps 30 --spacing 24

# Repeat: re-optimize, test out-of-sample
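
The optimize-then-test cycle can be planned as a sequence of index windows. The sketch below is a generic walk-forward splitter, not an mtdata feature: `walk_forward_windows` is a hypothetical helper, and how you restrict each CLI run to a window depends on options not covered in this guide.

```python
def walk_forward_windows(n_bars, train_bars, test_bars):
    """Yield (train, test) index ranges; each test window follows its train window."""
    windows = []
    start = 0
    while start + train_bars + test_bars <= n_bars:
        train = (start, start + train_bars)
        test = (train[1], train[1] + test_bars)   # strictly out-of-sample
        windows.append((train, test))
        start += test_bars                        # roll forward by one test window
    return windows

for train, test in walk_forward_windows(n_bars=1000, train_bars=600, test_bars=200):
    print("optimize on", train, "then test on", test)
```

Because every test window lies entirely after its training window, the parameters are always evaluated on data the tuner never saw.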

Interpreting Results

Good Results Checklist

✅ avg_rmse is small relative to price volatility
✅ avg_directional_accuracy > 0.55 (better than random)
✅ win_rate > 0.50 with positive avg_return_per_trade
✅ sharpe_ratio > 1.0
✅ max_drawdown < 10-15%
✅ Results consistent across different spacing values

Warning Signs

⚠️ Very high accuracy on backtests but poor live results → overfitting
⚠️ successful_tests << num_tests → method fails frequently
⚠️ avg_rmse much larger than avg_mae → outlier errors
⚠️ max_drawdown > 20% → high risk
⚠️ Results vary wildly with small parameter changes → unstable
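
The RMSE-versus-MAE warning sign is easy to verify numerically. In the sketch below, the error values are made up: ten equal errors give a ratio of 1, while a single large miss pushes the ratio well above 2.

```python
import math

def rmse_to_mae_ratio(abs_errors):
    """RMSE / MAE for a list of absolute errors; equals 1 when all errors match."""
    mae = sum(abs_errors) / len(abs_errors)
    rmse = math.sqrt(sum(e * e for e in abs_errors) / len(abs_errors))
    return rmse / mae

uniform = [0.001] * 10                # every forecast misses by the same amount
with_outlier = [0.001] * 9 + [0.02]   # one large miss among small ones

print(rmse_to_mae_ratio(uniform))
print(rmse_to_mae_ratio(with_outlier))
```

Squaring weights large errors more heavily, so a high ratio points to a few anchors where the method failed badly rather than to uniformly mediocre forecasts.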

Avoiding Overfitting

  1. Use enough test points: steps ≥ 20, so that averaged metrics are not dominated by a handful of anchors
  2. Test across timeframes: Method should work on H1, H4, D1
  3. Test across symbols: Don't optimize for a single pair
  4. Out-of-sample validation: Reserve recent data for final test
  5. Realistic costs: Include --slippage-bps and --trade-threshold

Performance Tips

Speed Optimization

  1. Reduce steps for initial screening:

    --steps 10 --spacing 30  # Quick check
    --steps 50 --spacing 10  # Full validation
  2. Use fast methods first:

    • naive, theta, seasonal_naive are instant
    • sf_autoarima, chronos2 are slower
  3. Limit genetic search:

    --population 15 --generations 8  # Quick
    --population 30 --generations 20 # Thorough

Parallelization

Run multiple backtests in parallel (different terminals):

# Terminal 1
mtdata-cli forecast_backtest_run EURUSD --methods theta --steps 30

# Terminal 2
mtdata-cli forecast_backtest_run GBPUSD --methods theta --steps 30
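
Instead of separate terminals, the same runs can be launched from one script. The sketch below is one possible wrapper using only the flags shown above; it assumes mtdata-cli is on PATH, and `backtest_command` and `run_all` are hypothetical helper names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def backtest_command(symbol, methods="theta", steps=30):
    """Build the argument list for one backtest run."""
    return ["mtdata-cli", "forecast_backtest_run", symbol,
            "--methods", methods, "--steps", str(steps)]

def run_all(symbols, max_workers=2):
    """Run one CLI process per symbol concurrently; requires mtdata-cli on PATH."""
    commands = [backtest_command(s) for s in symbols]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda cmd: subprocess.run(cmd, capture_output=True, text=True),
            commands,
        )
        return dict(zip(symbols, results))  # symbol -> CompletedProcess

# Usage (not executed here): outputs = run_all(["EURUSD", "GBPUSD"])
print(backtest_command("EURUSD"))
```

Threads are sufficient here because each worker only waits on an external process; keep max_workers modest if the runs share one data connection.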

Quick Reference

| Task | Command |
| --- | --- |
| Compare methods | mtdata-cli forecast_backtest_run EURUSD --methods "theta arima analog" --steps 20 |
| With trading costs | --slippage-bps 2 --trade-threshold 0.0005 |
| Volatility backtest | --quantity volatility --methods "ewma garch" |
| With denoising | --denoise ema --denoise-params "alpha=0.2" |
| Optimize params | mtdata-cli forecast_tune_genetic EURUSD --method theta --metric avg_rmse |
| JSON output | --json |

See Also