4 Comments
Luciano

Very good post! Thanks for sharing!

Joshua Joel Cuevas

Hey sir, this was a great read! I was wondering why you chose k=5 for IC, and also why you didn't discuss entropy/conditional entropy? Thanks for sharing. I really enjoyed the IC decay graph; I'll read Post 80 asap.

K. Iyer · 2d · Edited

Hey thanks, glad the IC decay chart landed — it’s one of my favorites in the series because it connects the ML work back to the strategy decay framework from Post 80.

On the 5-day horizon for IC: it's not a deep theoretical choice; it matches V6's rebalancing cadence. V6 makes allocation decisions that play out over roughly 1-5 trading days, so I want features that predict returns at that horizon. IC at 1 day is dominated by microstructure noise (bid-ask bounce, overnight gaps), while IC at 21 days captures too many confounding regime changes between the signal and the outcome, and it gets unwieldy pretty quickly. Five days is the sweet spot where the signal-to-noise ratio is highest for the kind of momentum/regime signals my V6 uses.

That said, you should always test multiple horizons. A feature with IC = 0.08 at 5 days and IC = 0.02 at 21 days is telling you something different than one with IC = 0.04 at both. The first is a short-term timing signal; the second is a slower structural indicator. Both are useful, but for different layers of the strategy.
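For anyone who wants to try the multi-horizon test themselves, here's a minimal sketch on synthetic data. The signal strength, the `forward_ic` helper, and the decay shape are all made up for illustration (this isn't code from the series); the point is just the mechanics of rank IC against forward returns at 1, 5, and 21 days:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(42)
n = 1000
feature = pd.Series(rng.normal(size=n))

# Hypothetical daily returns: the signal only drives the very next day,
# so IC should look strongest at short horizons and fade as the
# forward-return window grows and dilutes it with noise.
signal_effect = np.zeros(n)
signal_effect[1:] = 0.15 * feature.values[:-1]
daily_ret = pd.Series(signal_effect + rng.normal(scale=1.0, size=n))

def forward_ic(feature, daily_ret, horizon):
    """Spearman rank IC between today's feature and the next `horizon` days' return."""
    # rolling(h).sum().shift(-h) at index t = cumulative return over t+1 .. t+h
    fwd = daily_ret.rolling(horizon).sum().shift(-horizon)
    mask = fwd.notna()
    ic, _ = spearmanr(feature[mask], fwd[mask])
    return ic

for h in (1, 5, 21):
    print(f"IC at {h:2d}-day horizon: {forward_ic(feature, daily_ret, h):+.3f}")
```

Swap in your own feature and return series and the same loop gives you the horizon profile the paragraph above describes: a short-term timing signal decays fast across horizons, a structural one stays flat.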

On entropy and conditional entropy (mutual information): you’re right that I didn’t cover it, and it’s a fair gap. MI has a real advantage over IC — it captures non-linear relationships that rank correlation misses. A feature with zero IC but high MI would mean “this feature predicts returns, but not monotonically” — which is exactly the kind of relationship tree models can exploit.

Anyway, all that to say: I defaulted to IC because it's simpler to interpret, more stable with small samples, and, for the features in this series (z-scores, regime binaries, rates of change), the relationships with returns are approximately monotonic anyway. MI shines when you have features with U-shaped or threshold-based relationships, which we'll actually encounter in Part 96 when the tree model discovers non-linear splits that IC would miss entirely. So stay tuned!
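To make that IC-vs-MI contrast concrete, here's a quick sketch on synthetic data with a deliberately U-shaped relationship. The histogram MI estimator below is just a simple illustrative one (not anything from the series), and the data is fabricated:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
# U-shaped relationship: the target depends on x**2, so large moves in
# either direction predict it. Rank correlation is ~0, yet the feature
# is clearly informative -- exactly what a tree model can exploit.
y = x**2 + rng.normal(scale=0.5, size=n)

ic, _ = spearmanr(x, y)

def mutual_info(x, y, bins=10):
    """Crude histogram estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

print(f"IC (Spearman): {ic:+.3f}")            # near zero
print(f"MI (nats):     {mutual_info(x, y):.3f}")  # clearly positive
```

Zero IC with high MI is the "predictive but not monotonic" signature described above. (In practice you'd reach for a less biased estimator, e.g. scikit-learn's `mutual_info_regression`, rather than raw histograms.)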

Joshua Joel Cuevas

Very cool! I'll look forward to it.