A New SPY/VIX Long-or-Cash Model (Powered by Synthetic Regimes)

Post 21 is part of my ongoing attempt to use synthetic data to improve model performance.

Nov 21, 2025

This is part 21 of my series — Building & Scaling Algorithmic Trading Strategies

I’ve been experimenting with a simple idea: Can a machine-learning model decide when to be long SPY vs. sitting in cash using nothing but SPY/VIX dynamics — and can synthetic data make it better?

To make sure I could test the model on “real” scenarios, I also intentionally only trained it on pre-2023 data.

Here’s the quick version of what I built, how it behaved, and what I learned.

(“I am programmed in multiple techniques, a broad spectrum of pleasuring.” — Cmdr. Data)

1. Building the First Version (and Why It Failed to Generalize)

The original prototype used:

SPY returns, volatility, and rolling correlations
VIX returns and volatility
a synthetic SPY series generated from bootstrap blocks
a binary label: “go long” if the next 5-day SPY return > 0, otherwise “go to cash”
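
To make the last two ingredients concrete, here is a minimal sketch of a block bootstrap and the 5-day forward label. It is an illustration under my own assumptions, not the exact code behind the numbers in this post: the function names, the 20-day block size, and the choice to compound returns over the horizon are all hypothetical.

```python
import numpy as np

def block_bootstrap_returns(returns, block_size=20, n_out=None, seed=0):
    """Resample contiguous blocks of daily returns (with replacement) into a new path."""
    rng = np.random.default_rng(seed)
    returns = np.asarray(returns, dtype=float)
    n_out = len(returns) if n_out is None else n_out
    n_blocks = -(-n_out // block_size)  # ceiling division
    # Random start index per block; blocks never run past the end of history.
    starts = rng.integers(0, len(returns) - block_size + 1, size=n_blocks)
    blocks = [returns[s:s + block_size] for s in starts]
    return np.concatenate(blocks)[:n_out]

def forward_return_label(returns, horizon=5):
    """1 ("go long") if the compounded return over the next `horizon` days is > 0, else 0.

    The last `horizon` days have no complete forward window, so the output
    is `horizon` elements shorter than the input.
    """
    returns = np.asarray(returns, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(np.log1p(returns))])
    n = len(returns) - horizon
    # Forward window for day t covers days t+1 .. t+horizon.
    fwd_log_return = csum[horizon + 1:] - csum[1:n + 1]
    return (fwd_log_return > 0).astype(int)
```

Within each block the day-to-day ordering of real history is preserved, which is also why a model can learn the resampled path very well without learning anything that transfers.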

It trained an XGBoost classifier purely on the synthetic slice (on purpose) and tested on real data.

Results (first attempt):

Synthetic ROC–AUC: 0.988
Synthetic accuracy: 0.939
Real-world ROC–AUC: 0.509
Real-world accuracy: 0.561
Long/cash toggle Sharpe: 0.80, Max DD –29%

Exactly what you’d expect from a model that memorized fake data: excellent synthetic performance, almost no real-world transfer.
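
For readers less familiar with the metric: ROC-AUC is the probability that a randomly chosen "go long" day gets a higher score than a randomly chosen "go to cash" day, so 0.509 is barely better than a coin flip. A small pairwise sketch of that definition (the helper name is mine, and in practice a library routine would be the usual choice):

```python
import numpy as np

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    diff = pos[:, None] - neg[None, :]  # every positive score vs every negative score
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size
```

This builds a full positives-by-negatives matrix, so it is only meant for small illustrative samples.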

So I looked into what it would take to fix that.

2. What the Model Actually Needed

The model only saw:

SPY/VIX price-derived features
one bootstrap-based synthetic SPY path
no richer macro context
no cross-asset structure
no realistic SPY–VIX regime variation

It was basically learning “fake SPY” and trying to apply it to real SPY. That doesn’t work.

The fix required three things:

(1) Better synthetic scenarios

Not just shuffled SPY blocks, but multi-factor synthetic regimes where:

SPY and VIX co-move differently
vol spikes aren’t always tied to crashes
inverse beta sometimes weakens
volatility can surge in sideways markets
stress comes in different shapes

In short: synthetic worlds that aren’t just re-skinned history.

(2) Better real-world features

Features that remain meaningful across both real and synthetic data:

rate slopes
credit spreads
macro vol indicators
skew and kurtosis
TLT / QQQ cross-moves
multi-horizon realized vol
SPY–VIX entropy measures

These are exactly the kinds of signals that showed up in the hybrid ensemble work.

(3) Proper validation

Train on:

synthetic scenarios plus older pre-2023 real history

Test on:

clean, untouched 2023+ real SPY/VIX behavior

This avoids leakage and tells me whether the model actually learned anything transferable.
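
A minimal sketch of that split, assuming the real data sits in a pandas DataFrame indexed by date and the synthetic rows in a second DataFrame with the same columns. The function name and the `purge_days` argument are my own additions for illustration: because the label looks 5 days ahead, the last few pre-cutoff rows would otherwise carry labels computed from 2023 prices.

```python
import pandas as pd

def split_by_date(real, synthetic, cutoff="2023-01-01", purge_days=5):
    """Train on synthetic rows plus real rows before `cutoff`; test on real rows from `cutoff` on.

    `purge_days` drops the last pre-cutoff real rows, whose forward-looking
    labels would otherwise overlap the test period (an assumption of this sketch).
    """
    cutoff = pd.Timestamp(cutoff)
    real_train = real.loc[real.index < cutoff]
    if purge_days > 0:
        real_train = real_train.iloc[:-purge_days]
    test = real.loc[real.index >= cutoff]
    train = pd.concat([synthetic, real_train], ignore_index=True)
    return train, test
```

The same caution applies to the synthetic rows: if they are resampled from history, that history should also end before the cutoff.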

3. The New Version: Multifactor Synthetic Worlds + Expanded Features

I rebuilt the whole SPY/VIX model pipeline with:

Multiple synthetic SPY variants

Each with different:

SPY–VIX betas
volatility regimes
shock frequencies
correlation structures

Enlarged feature set

SPY price-derived features (returns, rolling vol/corrs) and VIX levels/returns.

Unlike the hybrid ensemble, it doesn’t ingest cross-asset inputs like Treasury curves, credit spreads, or TLT/FX signals.
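
A minimal sketch of a price-only feature table of this kind, assuming two pandas Series of daily closes. The column names and the 5- and 21-day windows are illustrative assumptions rather than the actual configuration.

```python
import numpy as np
import pandas as pd

def build_features(spy_close, vix_close, windows=(5, 21)):
    """SPY returns, rolling vol and SPY-VIX correlation, plus VIX level and returns."""
    spy_ret = spy_close.pct_change()
    vix_ret = vix_close.pct_change()
    feats = pd.DataFrame({"spy_ret": spy_ret, "vix_level": vix_close, "vix_ret": vix_ret})
    for w in windows:
        feats[f"spy_vol_{w}"] = spy_ret.rolling(w).std()
        feats[f"spy_vix_corr_{w}"] = spy_ret.rolling(w).corr(vix_ret)
    # Rows before the longest window has filled contain NaNs and are dropped.
    return feats.dropna()
```

Because every column is derived from the two price series alone, the same function can be run unchanged on real and synthetic paths.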

Retraining process

Train on synthetic + pre-2023 real data
Hold out all 2023+ real history
Evaluate pure long/cash toggle using the model’s daily probability
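
The toggle rule itself is small enough to sketch. This version assumes cash earns zero and that a probability computed at the close of day t sets the position for day t+1. Both are my assumptions for the illustration, as is the function name.

```python
import numpy as np

def toggle_returns(prob_long, asset_returns, threshold=0.5):
    """Daily strategy returns for a long-or-cash rule driven by model probabilities."""
    prob_long = np.asarray(prob_long, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    position = (prob_long > threshold).astype(float)  # 1 = long SPY, 0 = cash
    # Shift by one day so today's signal is applied to tomorrow's return (no lookahead).
    held = np.concatenate([[0.0], position[:-1]])
    return held * asset_returns
```

Skipping the one-day shift is an easy way to inflate a backtest, since the model would then be trading on a return it has already seen.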

4. Final Performance (2023+ real SPY test)

Buy-and-Hold SPY

ROI: 78.2%
CAGR: 22.5%
Sharpe: 1.38
Max DD: –19.0%

Synthetic-trained SPY/VIX Toggle (threshold 0.5)

ROI: 82.1%
CAGR: 23.5%
Sharpe: 1.89
Max DD: –10.6%
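
For reference, the four statistics above can be computed from a daily return series along these lines. This is a sketch under common conventions that I am assuming here (252 trading days per year, zero risk-free rate in the Sharpe ratio); the reported figures may have been produced with different conventions.

```python
import numpy as np

def performance_summary(daily_returns, periods_per_year=252):
    """ROI, CAGR, annualized Sharpe (zero risk-free rate), and max drawdown."""
    r = np.asarray(daily_returns, dtype=float)
    equity = np.cumprod(1.0 + r)  # growth of 1 unit of capital
    years = len(r) / periods_per_year
    # Running peak includes the starting capital of 1.0.
    running_peak = np.maximum.accumulate(np.concatenate([[1.0], equity]))[1:]
    return {
        "roi": float(equity[-1] - 1.0),
        "cagr": float(equity[-1] ** (1.0 / years) - 1.0),
        "sharpe": float(np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)),
        "max_drawdown": float((equity / running_peak - 1.0).min()),
    }
```

Running the same function on the buy-and-hold series and on the toggle series keeps the comparison like for like.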

That’s surprisingly strong:

higher Sharpe
lower drawdown
slightly higher returns
and all from a simple long/cash rule

The model isn’t timing day-to-day volatility; it’s catching the big shifts early enough to avoid the worst drawdowns while staying long during calm regimes.

5. Takeaways

Synthetic worlds can meaningfully improve generalization if they’re built with proper multi-factor structure.
For this model just SPY/VIX was good enough, but I’ll probably need to add other cross-asset overlays to test more thoroughly.
The long/cash toggle is now good enough to live alongside the rest of my sleeves.
This is the first strategy where synthetic augmentation outperformed the historical-only version.

All that said, right now it’s a nice sandbox for stress-testing ideas, not a production-ready strategy, so I’ll treat it as a research tool to iterate on. Next steps:

Improve synthetic realism: feed it richer scenarios (vol, credit, and rate shocks) and add macro features so it learns signals that transfer to real data.
Use it as a toggle overlay pilot: run it alongside buy-and-hold/logistic thresholds and log the toggles so I can compare drawdown behavior over time before committing capital.
Leverage it for regime labeling: even if the model doesn’t outperform consistently, its probabilities can tag “risk-off” periods for other strategies (allocator, VIX sleeve) or for feature engineering/ensembles.
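
The logging and regime-labeling ideas can start very simply. In this sketch a "risk-off" day is just a day where the model's long probability is at or below the toggle threshold, and a toggle is any day where that tag flips. The function name and that definition are assumptions for illustration.

```python
import numpy as np

def tag_regimes(prob_long, threshold=0.5):
    """Return (risk_off mask, indices of days where the regime tag flips)."""
    p = np.asarray(prob_long, dtype=float)
    risk_off = p <= threshold  # complement of the "go long" condition
    # A flip on day i means the tag on day i differs from the tag on day i-1.
    switch_days = np.flatnonzero(np.diff(risk_off.astype(int)) != 0) + 1
    return risk_off, switch_days
```

The mask can feed other strategies as a feature, while the switch indices give a compact log for comparing drawdown behavior around each toggle.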

Either way, I now have a framework to test more strategies in synthetic universes — not just SPY/VIX, but dual momentum, the volatility sleeve, and even cross-asset allocators.

The information presented in Math & Markets is not financial advice and should not be construed as such.

