Monte Carlo Simulation for Trading Systems: Testing Whether Your Edge Survives the Bad Path
Overview #
You backtested your strategy. It looks profitable. Congratulations — that tells you approximately nothing about whether you'll survive trading it live.
A backtest gives you one path through history. One sequence of wins and losses. One drawdown profile. One ending equity. But that sequence is just one of millions of possible orderings. What if your biggest winners had come three months later? What if you'd hit six losers in a row during your first week instead of your third month? Would you still be trading, or would you have blown through your risk limits and quit?
Monte Carlo simulation answers these questions by generating thousands of alternative equity curves from the same set of trade results. It doesn't predict the future. It shows you the range of possible outcomes your strategy could produce — and more importantly, it shows you how bad the bad path can get.
For futures traders, this matters more than it does for stock traders. Leverage amplifies everything. A 15% drawdown on a fully margined ES position happens fast and hits hard. Monte Carlo is the tool that tells you whether your position sizing can survive that drawdown before it shows up in your live account.
Key Concepts #
Monte Carlo Simulation: A statistical method that generates many randomized versions of a strategy's performance to estimate the probability distribution of outcomes. In trading, this means creating thousands of possible equity curves by resampling or reshuffling your actual trade results.
Path Risk: The risk that the specific sequence of wins and losses you experience produces a worse outcome than the average. Two traders with identical systems and identical trade sets can have wildly different drawdown experiences depending on the order in which those trades occur.
Trade Resampling: Taking your historical trade results and randomly reordering them (or sampling from them with replacement) to create synthetic equity curves. Each resampling produces a different path through the same underlying trade distribution.
Drawdown Distribution: Rather than looking at a single maximum drawdown from your backtest, Monte Carlo gives you the probability distribution of maximum drawdowns — the 50th percentile, the 80th, the 90th, the 95th. This is where position sizing decisions should come from.
Risk of Ruin: The probability that your account hits a drawdown threshold where you're forced to stop trading — whether by margin call, psychological breaking point, or hitting your predefined maximum loss. Monte Carlo quantifies this probability under different position sizing assumptions.
Confidence Band: The range of equity curve outcomes that contains a specified percentage of simulations. A 90% confidence band shows the corridor that 90% of your simulated equity curves fall within.
What Monte Carlo Actually Does (And What It Doesn't) #
Here's the thing. Monte Carlo is a risk-framing tool, not a crystal ball. It takes your trade data and explores what could have happened under different sequences. The output is a probability distribution — "there's a 35% chance you'll see a 15% drawdown, a 10% chance of 25%, and a 2% chance of 40%."
A single backtest invites confidence that depends on a lucky trade ordering (see [Monte Carlo Simulation thread] [1]). The simulation strips that away and replaces it with probability ranges.
What Monte Carlo proves:
- How sensitive your results are to trade ordering
- The range of drawdowns you should prepare for
- Whether your position sizing is too aggressive for the strategy's characteristics
- The probability of hitting specific loss thresholds
What Monte Carlo doesn't prove:
- That your strategy has an edge (garbage in, garbage out)
- That future markets will behave like the backtest period
- That your execution assumptions are realistic
- That the strategy will work in different market regimes
If your backtest is curve-fit or your execution model ignores slippage, Monte Carlo will just give you a beautifully precise distribution of meaningless numbers.
The Three Core Methods #
Method 1: Trade Shuffle (Permutation) #
The simplest approach. Take your actual list of trades and randomly reorder them. Run the equity curve with the new order. Repeat 5,000-10,000 times (see [Ninja Trader Monte Carlo] [2]).
The logic is straightforward. Your strategy produced a specific set of outcomes — 120 winners and 80 losers, with specific dollar amounts for each. The order they occurred in was basically random from a statistical perspective. Reshuffling reveals the range of paths those same outcomes could have taken.
When to use it: Quick sequence-risk check. You trust the trade distribution and want to see how much drawdown variation comes purely from ordering.
Limitation: Assumes trades are independent. If your strategy has serial correlation (wins cluster, losses cluster), shuffling destroys that structure and can understate the actual risk.
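A minimal sketch of the shuffle method in plain Python. The trade list is hypothetical: it reuses the 120 winners and 80 losers from the example above, but the $450 and $375 amounts are made up for illustration. `max_drawdown` and `shuffle_drawdowns` are illustrative helper names, not functions from any trading platform.

```python
import random

def max_drawdown(pnls):
    """Largest peak-to-trough drop of the cumulative P&L curve, in dollars."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def shuffle_drawdowns(trades, n_paths=5000, seed=42):
    """Reorder the same trades n_paths times; return the sorted max drawdowns."""
    rng = random.Random(seed)
    path = list(trades)
    results = []
    for _ in range(n_paths):
        rng.shuffle(path)  # same trades, new order
        results.append(max_drawdown(path))
    return sorted(results)

# Hypothetical trade list: 120 winners of $450 and 80 losers of $375.
trades = [450.0] * 120 + [-375.0] * 80
drawdowns = shuffle_drawdowns(trades)
median_dd = drawdowns[len(drawdowns) // 2]
p95_dd = drawdowns[95 * len(drawdowns) // 100]
print(f"median max drawdown: ${median_dd:,.0f}   95th percentile: ${p95_dd:,.0f}")
```

Every shuffled path ends at the same final equity because the trades are identical. Only the route changes, and with it the drawdown.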
Method 2: Bootstrap Resampling #
Instead of just reshuffling, you sample trades with replacement. This means the same trade can appear multiple times in a single simulated sequence, while others may not appear at all.
This creates more variation than simple shuffling because it doesn't force every simulation to use the exact same set of outcomes. A bootstrap simulation might produce a sequence that's heavier on losses than your actual results, or lighter (see [KJ Trading AMA] [4]).
Block bootstrapping is a more sophisticated variant that preserves some of the time-series structure. Instead of resampling individual trades, you resample blocks of consecutive trades. This keeps local patterns (like volatility clustering or winning/losing streaks) intact while still randomizing the overall sequence. For futures strategies whose trade results are serially correlated, often because volatility clusters, block bootstraps are usually more appropriate than simple shuffles.
When to use it: When you want a broader range of outcomes than simple shuffling provides, especially with strategies that have enough trades to make sampling meaningful.
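Both variants fit in a few lines. This is a sketch under stated assumptions: the ten-trade list is invented, the block size of 3 is arbitrary, and `bootstrap_path` and `block_bootstrap_path` are hypothetical helper names.

```python
import random

def bootstrap_path(trades, rng):
    """Draw len(trades) trades with replacement (plain bootstrap)."""
    return rng.choices(trades, k=len(trades))

def block_bootstrap_path(trades, rng, block_size=5):
    """Stitch together randomly chosen blocks of consecutive trades."""
    n = len(trades)
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(trades)")
    path = []
    while len(path) < n:
        start = rng.randrange(n - block_size + 1)
        path.extend(trades[start:start + block_size])
    return path[:n]  # trim the last block so the path has n trades

rng = random.Random(7)
# Hypothetical trade list with a three-trade losing streak in the middle.
trades = [450.0, -375.0, 450.0, 450.0, -375.0, -375.0, -375.0, 450.0, 450.0, -375.0]
simple = bootstrap_path(trades, rng)
blocked = block_bootstrap_path(trades, rng, block_size=3)
print("plain bootstrap:", simple)
print("block bootstrap:", blocked)
```

The plain bootstrap can scatter the losing streak across the path. The block version copies runs of consecutive trades, so a streak that falls inside a block survives intact.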
Method 3: Parametric Simulation #
Instead of resampling actual trades, you fit a statistical model to your trade distribution and then generate synthetic trades from that model.
Common approaches:
- Normal distribution: Assume trades follow a bell curve with your strategy's mean and standard deviation
- t-distribution: Accounts for fat tails (more extreme outliers than normal distribution predicts)
- Regime-switching models: Different distributions for different market states (trending vs. choppy, low vol vs. high vol)
When to use it: When you want to stress-test beyond your historical data. A parametric model can generate scenarios your backtest never encountered — like a 4-sigma loss day that didn't happen to occur in your test period but could absolutely happen live.
Limitation: Model risk. If you assume normal distributions but your actual returns have fat tails, you'll underestimate tail risk — a point Nassim Taleb drives home in his technical monograph on fat-tailed distributions, demonstrating that conventional statistical tools systematically underestimate the probability and severity of extreme market moves [10]. If you assume independence but your losses cluster, you'll miss the worst drawdown scenarios.
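As a rough illustration of the fat-tailed option, the sketch below generates synthetic trades from a Student-t distribution matched to the sample mean and standard deviation. The choice of 4 degrees of freedom is an assumption, not a fitted value, and the trade history is hypothetical. Fitting a single smooth distribution to a win/lose trade list is itself a crude model, which is exactly the model risk described above.

```python
import math
import random
import statistics

def student_t(rng, df):
    """One Student-t draw, built as Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) is Gamma(shape=df/2, scale=2)
    return z / math.sqrt(chi2 / df)

def synthetic_trades(trades, n, rng, df=4):
    """Generate n synthetic trades with the sample's mean and stdev but fatter tails."""
    if df <= 2:
        raise ValueError("df must exceed 2 so the variance is finite")
    mean = statistics.fmean(trades)
    stdev = statistics.stdev(trades)
    # A t variate has variance df / (df - 2); rescale to match the sample stdev.
    scale = stdev * math.sqrt((df - 2) / df)
    return [mean + scale * student_t(rng, df) for _ in range(n)]

rng = random.Random(5)
history = [450.0] * 120 + [-375.0] * 80  # hypothetical backtest trades
synthetic = synthetic_trades(history, 20_000, rng)
print(f"historical worst trade: {min(history):,.0f}")
print(f"synthetic worst trade:  {min(synthetic):,.0f}")
```

The synthetic worst trade lands far beyond anything in the history, which is the point: the model produces losses your backtest never saw.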
Position Sizing: Where Monte Carlo Earns Its Keep #
This is the highest-value application. Monte Carlo transforms position sizing from a guess into a quantified decision.
Here's the practical workflow:
- Run your backtest and collect the trade list — every trade's P&L with commissions and slippage included
- Choose a position sizing rule — fixed fractional (risk X% per trade), ATR-based, or volatility-targeted
- Run Monte Carlo at multiple sizing levels — 0.25%, 0.5%, 1.0%, 1.5%, 2.0% risk per trade
- Compare the 95th percentile maximum drawdown for each sizing level
- Pick the size that keeps worst-case drawdown within your tolerance
See also [Why 7% Thread] [3].
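The sizing comparison in the steps above can be sketched as follows. Two assumptions are baked in: trades are expressed as R-multiples (profit or loss divided by the amount risked, so a full loss is -1R), and sizing is fixed fractional with compounding. The trade list of +1.2R winners and -1R losers at a 55% win rate is hypothetical.

```python
import random

def max_drawdown_pct(r_multiples, risk_fraction):
    """Max peak-to-trough drawdown (as a fraction) when risking risk_fraction per trade."""
    equity = peak = 1.0
    worst = 0.0
    for r in r_multiples:
        equity *= 1.0 + risk_fraction * r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def p95_drawdown(r_multiples, risk_fraction, n_paths=1000, seed=1):
    """95th percentile max drawdown across bootstrap-resampled paths."""
    rng = random.Random(seed)  # same seed, so every sizing level sees the same paths
    dds = sorted(
        max_drawdown_pct(rng.choices(r_multiples, k=len(r_multiples)), risk_fraction)
        for _ in range(n_paths)
    )
    return dds[95 * n_paths // 100]

# Hypothetical strategy: 110 winners of +1.2R and 90 losers of -1R.
r_multiples = [1.2] * 110 + [-1.0] * 90
table = {f: p95_drawdown(r_multiples, f) for f in (0.0025, 0.005, 0.01, 0.015, 0.02)}
for f, dd in table.items():
    print(f"risk {f:.2%} per trade -> 95th percentile max drawdown {dd:.1%}")
```

Reusing the seed means each sizing level is tested on identical paths, so differences in the table come from size alone, not from sampling noise.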
A Concrete Example #
Say your ES futures strategy produced 200 trades over 12 months with a 55% win rate, average winner of $450, and average loser of $375. Your backtest shows a maximum drawdown of $4,200.
Run 10,000 Monte Carlo simulations at 1% risk per trade:
- Median max drawdown: $5,800
- 80th percentile: $7,400
- 90th percentile: $8,900
- 95th percentile: $10,200
That $4,200 backtest drawdown was just one path, and it was a good one. The 95th percentile tells you there's a 5% chance of seeing a drawdown of $10,200 or worse from the same trade results. If your account is $50,000, that's a 20% hit that most traders can't stomach.
Now run it at 0.5% risk:
- Median max drawdown: $3,100
- 95th percentile: $5,400
At half the risk, your 95th percentile drawdown dropped from $10,200 to $5,400. The compound return is lower, but you'll actually survive the bad path.
This is the core trade-off Monte Carlo helps you see: position sizing determines whether your edge compounds or your account dies during an inevitable rough stretch.
Drawdown and Ruin Probability #
For futures traders, survival isn't just about account size — it's about margin. A 25% drawdown doesn't just hurt psychologically. It can push you below maintenance margin, trigger forced liquidation, and turn a recoverable drawdown into a permanent loss.
Monte Carlo lets you quantify these scenarios:
- Probability of exceeding 10% drawdown: Common threshold for most retail traders
- Probability of exceeding 20% drawdown: Where most traders start making irrational decisions
- Probability of exceeding 30% drawdown: Margin call territory for many futures accounts
- Probability of margin call: Account falls below maintenance requirements
@Big Mike shared a practical Monte Carlo tool on NexusFi that estimates "the probabilities of risk of ruin, median max drawdown, median annual return for the first year of trading" ([Taking a Trading System Live] [7]).
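A simple way to estimate these probabilities is to count how many simulated paths breach each threshold. The sketch below assumes a $50,000 account, a hypothetical trade list in dollars, and fixed size per trade. It is not the NexusFi tool mentioned above, and it does not model margin rules.

```python
import random

def max_drawdown_pct(pnls, start_equity):
    """Max drawdown as a fraction of the running equity peak."""
    equity = peak = start_equity
    worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def exceedance_probabilities(trades, start_equity, thresholds, n_paths=2000, seed=3):
    """Share of bootstrap paths whose max drawdown exceeds each threshold."""
    rng = random.Random(seed)
    dds = [
        max_drawdown_pct(rng.choices(trades, k=len(trades)), start_equity)
        for _ in range(n_paths)
    ]
    return {t: sum(dd > t for dd in dds) / n_paths for t in thresholds}

# Hypothetical trades: 110 winners of $450 and 90 losers of $375.
trades = [450.0] * 110 + [-375.0] * 90
probs = exceedance_probabilities(trades, 50_000.0, thresholds=(0.10, 0.20, 0.30))
for threshold, p in probs.items():
    print(f"P(max drawdown > {threshold:.0%}) = {p:.1%}")
```

To approximate ruin probability, treat your personal stop-trading level as one more threshold.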
Time-to-Recovery Matters Too #
Drawdown depth is only half the picture. Duration is the other half.
A $10,000 drawdown that recovers in 3 weeks is manageable. The same drawdown that takes 6 months to recover can destroy your confidence, cause you to abandon the system right before it turns around, and push you into revenge trading with a different strategy.
Monte Carlo can track both metrics:
- How deep does the drawdown get?
- How long does recovery take?
If your simulation shows that 20% of paths require more than 4 months to recover from their worst drawdown, that's information you need before going live. Can you sit through 4 months of underwater equity without changing anything?
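Duration can be tracked with a counter that resets whenever equity reaches a new high. This sketch measures the longest underwater stretch in trades, which is a close cousin of recovery time from the worst drawdown rather than the identical metric. Converting trades to months assumes the 200-trades-per-year pace from the earlier example, and the trade list is hypothetical.

```python
import random

def longest_underwater(pnls):
    """Longest run of consecutive trades spent below a prior equity peak."""
    equity = peak = 0.0
    current = longest = 0
    for pnl in pnls:
        equity += pnl
        if equity >= peak:
            peak = equity
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest

example = longest_underwater([100, -50, -50, 30, 80, 10])  # underwater for 3 trades

rng = random.Random(9)
trades = [450.0] * 110 + [-375.0] * 90  # hypothetical trade list
trades_per_month = 200 / 12             # assumed pace: 200 trades over 12 months
limit = round(4 * trades_per_month)     # roughly 4 months of trading
n_paths = 2000
slow = 0
for _ in range(n_paths):
    path = rng.sample(trades, len(trades))  # a shuffled copy
    if longest_underwater(path) > limit:
        slow += 1
print(f"paths underwater longer than {limit} trades: {slow / n_paths:.1%}")
```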
Interpreting Results Without Fooling Yourself #
The biggest danger with Monte Carlo is false confidence. Here's how to avoid it.
Read the Distribution, Not the Mean #
An average Monte Carlo result of +15% annual return sounds great. But if the 10th percentile is -8% and the 90th percentile is +42%, you have a strategy with massive variance. The mean is not what you'll experience. The distribution is what you need to plan for.
Focus on Tail Risk Metrics #
The most actionable outputs:
- 95th percentile maximum drawdown — your realistic worst-case planning number
- Expected shortfall (CVaR) — the average loss when you're in the worst 5% of outcomes
- Ruin probability — the chance of hitting a drawdown you can't recover from
Don't optimize for mean return. Optimize for acceptable tail risk.
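Given a list of simulated max drawdowns, the first two tail metrics take only a few lines. This is a sketch using a simple order-statistic percentile with no interpolation; `tail_metrics` is an illustrative name, and the input here is a toy list of 1 to 100 so the arithmetic is easy to check by hand.

```python
def tail_metrics(max_drawdowns, level_pct=95):
    """Return (percentile drawdown, expected shortfall beyond that percentile)."""
    ordered = sorted(max_drawdowns)
    cut = len(ordered) * level_pct // 100
    percentile_dd = ordered[cut]
    tail = ordered[cut:]  # the worst (100 - level_pct)% of outcomes
    expected_shortfall = sum(tail) / len(tail)
    return percentile_dd, expected_shortfall

# Toy input: drawdowns of 1, 2, ..., 100. The worst 5% are 96 through 100.
p95, cvar95 = tail_metrics(range(1, 101))
print(p95, cvar95)  # 96 98.0
```

Expected shortfall is always at least as large as the percentile it sits behind, because it averages everything in the tail instead of reporting where the tail starts.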
Your Assumptions Are Everything #
The key assumptions that drive Monte Carlo output (see [Why 7% Thread] [6]):
- Independence: Are your trades actually independent? If losses cluster in certain market regimes, simple shuffling understates risk.
- Stationarity: Is the trade distribution stable over time? If your strategy performs differently in trending vs. ranging markets, a single distribution masks this.
- Execution costs: Are slippage and commission assumptions realistic? In volatile markets, slippage can double or triple. If your simulation uses fixed slippage, you're underestimating the bad paths.
- Sample size: 50 trades is not enough for reliable Monte Carlo results. 200 is the minimum. 500+ gives real confidence.
Sensitivity Testing #
Don't just run one Monte Carlo and call it done. Run multiple scenarios:
- Base case: Your actual trade data
- Degraded case: Add 20% more slippage, reduce win rate by 2%
- Stress case: Double the largest historical losses, extend losing streaks
If your position sizing survives the degraded case with acceptable drawdowns, you've got a strong setup. If it only works in the base case, you're one bad week away from a margin call.
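One way to build the degraded and stress inputs before resampling them. The numbers are assumptions: a base slippage of $12.50 per trade (so 20% more is $2.50), a 2-point win-rate cut applied by turning randomly chosen winners into average losers, and doubling of the 5 largest losses. Extending losing streaks is not covered here, and the function assumes the trade list contains at least one loser.

```python
import random

def degraded_case(trades, extra_cost, win_rate_cut, seed=0):
    """Subtract extra cost from every trade and turn some winners into average losers."""
    rng = random.Random(seed)
    out = [t - extra_cost for t in trades]
    losers = [t for t in out if t < 0]
    avg_loss = sum(losers) / len(losers)
    winners = [i for i, t in enumerate(out) if t > 0]
    n_flip = min(len(winners), round(win_rate_cut * len(out)))
    for i in rng.sample(winners, n_flip):
        out[i] = avg_loss
    return out

def stress_case(trades, n_worst=5):
    """Double the n_worst largest losses, leaving trade order intact."""
    out = list(trades)
    worst_first = sorted(range(len(out)), key=lambda i: out[i])
    for i in worst_first[:n_worst]:
        if out[i] < 0:
            out[i] *= 2
    return out

base = [450.0] * 110 + [-375.0] * 90  # hypothetical trade list
degraded = degraded_case(base, extra_cost=2.50, win_rate_cut=0.02)
stressed = stress_case(base)
for name, case in [("base", base), ("degraded", degraded), ("stress", stressed)]:
    print(f"{name:9s} net P&L: ${sum(case):>10,.2f}")
```

Each of the three lists then goes through the same resampling routine, so the drawdown distributions can be compared side by side.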
Six Mistakes Traders Make with Monte Carlo #
Mistake 1: Using Monte Carlo to Validate an Overfitted Strategy #
If you optimized 15 parameters on 2 years of data and your backtest has 87 trades, Monte Carlo will tell you the range of outcomes for those specific 87 trades. It won't tell you that the strategy is curve-fit. The simulation is only as honest as the input data.
Fix: Run Monte Carlo on out-of-sample data, not on the optimization set.
Mistake 2: Ignoring Execution Realism #
Fixed $2 slippage per trade works fine in calm markets. During a VIX spike, your market orders on ES might slip 2-3 ticks ($25-$37.50 per contract at $12.50 per tick), not a fraction of a tick. Monte Carlo with constant slippage will dramatically understate the tail risk of a fast-market strategy.
Fix: Model slippage as a function of volatility. In high-vol regimes, double or triple your slippage assumptions.
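A minimal version of that fix applies a slippage multiplier per trade based on a volatility label. It assumes you have gross P&L (before slippage) and a regime label for each trade. The labels, the $2 base, and the 3x multiplier are illustrative choices taken from the ranges mentioned above.

```python
def net_trades(gross_pnls, vol_labels, base_slippage=2.0, multipliers=None):
    """Subtract volatility-scaled slippage from each gross trade P&L."""
    if len(gross_pnls) != len(vol_labels):
        raise ValueError("need one volatility label per trade")
    multipliers = multipliers or {"normal": 1.0, "high": 3.0}
    return [
        pnl - base_slippage * multipliers[label]
        for pnl, label in zip(gross_pnls, vol_labels)
    ]

# Hypothetical gross trades: two in normal conditions, two during a volatility spike.
gross = [452.0, -373.0, 452.0, -373.0]
labels = ["normal", "normal", "high", "high"]
net = net_trades(gross, labels)
print(net)  # [450.0, -375.0, 446.0, -379.0]
```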
Mistake 3: Thinking More Simulations Equals More Truth #
Running 100,000 paths instead of 10,000 reduces sampling error — the uncertainty from the random generation process itself. It does not reduce model error. If your assumptions are wrong, more paths just give you a more precise wrong answer.
Fix: Spend your effort on assumption quality (realistic execution, appropriate resampling method, sufficient trade count), not on path count. 5,000-10,000 paths is usually sufficient.
Mistake 4: Optimizing Position Size to Maximize Return #
The Kelly Criterion and related frameworks like Ralph Vince's Optimal f — which finds the fraction of capital to risk that maximizes ending wealth across a historical trade sequence [11] — tell you the theoretically optimal bet size for maximum compound growth. But the optimal fraction produces stomach-churning drawdowns. Most successful traders use half-Kelly or less.
Monte Carlo can tell you the drawdown profile at full Kelly. Almost nobody can handle it. Size to survive, not to maximize.
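For a sense of scale, here is the textbook Kelly formula for a simple win/lose bet applied to the ES example numbers used earlier (55% win rate, $450 average winner, $375 average loser). The formula assumes every win and every loss is the same size, which real trade lists violate, so treat the result as a rough upper bound rather than a recommendation.

```python
def kelly_fraction(win_rate, avg_win, avg_loss):
    """Kelly fraction for a win/lose bet: f* = W - (1 - W) / R, with R = avg_win / avg_loss."""
    payoff_ratio = avg_win / avg_loss
    return win_rate - (1.0 - win_rate) / payoff_ratio

full_kelly = kelly_fraction(0.55, 450.0, 375.0)
half_kelly = full_kelly / 2
print(f"full Kelly: {full_kelly:.1%}   half Kelly: {half_kelly:.2%}")
```

Full Kelly works out to about 17.5% of equity at risk per trade. Compare that with the 0.25% to 2% sizing levels discussed above and the gap between "growth-optimal" and "survivable" becomes obvious.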
Mistake 5: Ignoring Regime Changes #
Your backtest covers 2019-2024. Markets in 2020 looked nothing like 2023. If your Monte Carlo treats all trades as coming from the same distribution, it's averaging across regimes and potentially masking regime-specific fragility.
Fix: Run separate Monte Carlo analyses for different market regimes (low vol, high vol, trending, choppy) and see how your sizing holds up in each.
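A sketch of the per-regime split, assuming each trade already carries a regime label. The labels and the trade amounts are hypothetical, and how you assign regimes (a VIX level, an ATR percentile, a trend filter) is up to you. Per-regime samples get small quickly, so read the results with the sample-size caveat in mind.

```python
import random
from collections import defaultdict

def split_by_regime(trades, labels):
    """Group trade P&Ls by their regime label."""
    groups = defaultdict(list)
    for pnl, label in zip(trades, labels):
        groups[label].append(pnl)
    return dict(groups)

def p95_max_drawdown(trades, n_paths=1000, seed=11):
    """95th percentile max drawdown (dollars) across bootstrap paths."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        equity = peak = worst = 0.0
        for pnl in rng.choices(trades, k=len(trades)):
            equity += pnl
            peak = max(peak, equity)
            worst = max(worst, peak - equity)
        results.append(worst)
    results.sort()
    return results[95 * n_paths // 100]

# Hypothetical data: steady in low vol, larger swings and a losing record in high vol.
trades = [300.0] * 60 + [-250.0] * 40 + [900.0] * 25 + [-800.0] * 35
labels = ["low_vol"] * 100 + ["high_vol"] * 60
by_regime = split_by_regime(trades, labels)
report = {name: p95_max_drawdown(group) for name, group in by_regime.items()}
for name, dd in report.items():
    print(f"{name:9s} 95th percentile max drawdown: ${dd:,.0f}")
```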
Mistake 6: Using Monte Carlo as a Substitute for Out-of-Sample Testing #
Monte Carlo explores scenarios within your existing data. It does not replace walk-forward testing, which tests whether your strategy works on data it's never seen.
See [Taking a Trading System Live thread] [8], quoting Big Mike's recommendation. The workflow is: backtest, Monte Carlo, walk-forward. Each step filters out a different type of overconfidence.
Practical Implementation #
The Spreadsheet Approach #
For education and quick checks, a spreadsheet works:
- List your trades in column A
- Use RAND() to generate random sort keys in column B
- Sort by column B to reshuffle trades
- Calculate cumulative equity in column C
- Track max drawdown in column D
- Repeat 1,000 times (use a macro or Data Table)
This gives you a visceral feel for how trade ordering affects your equity curve. It's not production-grade, but it teaches the concept better than any textbook.
The Python Approach #
For serious analysis, Python with NumPy and pandas handles the computation:
- Import your trade list as a pandas DataFrame
- Use np.random.choice() for bootstrap resampling
- Calculate equity curve and drawdown for each simulation
- Aggregate statistics across all paths
- Visualize with matplotlib (confidence bands, drawdown distributions)
The computation is fast — 10,000 simulations on 500 trades takes seconds on modern hardware.
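A vectorized sketch of those steps with NumPy. It uses `numpy.random.default_rng` and its `choice` method, the newer equivalent of `np.random.choice`. The trade array is hypothetical, and loading a real trade list from a pandas DataFrame and plotting with matplotlib are left out to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical trade list: 110 winners of $450 and 90 losers of $375.
trades = np.array([450.0] * 110 + [-375.0] * 90)
n_paths = 10_000

# Bootstrap: each row is one simulated sequence sampled with replacement.
samples = rng.choice(trades, size=(n_paths, trades.size), replace=True)
equity = np.cumsum(samples, axis=1)

# Running peak, counting the starting equity of 0 as the first peak.
running_peak = np.maximum.accumulate(np.maximum(equity, 0.0), axis=1)
max_dd = (running_peak - equity).max(axis=1)

p50, p80, p90, p95 = np.percentile(max_dd, [50, 80, 90, 95])
print(f"max drawdown  median ${p50:,.0f}  80th ${p80:,.0f}  90th ${p90:,.0f}  95th ${p95:,.0f}")

# 90% confidence band: 5th and 95th percentile of equity at each trade number.
band_low, band_high = np.percentile(equity, [5, 95], axis=0)
print(f"equity after final trade: ${band_low[-1]:,.0f} to ${band_high[-1]:,.0f}")
```

The whole simulation is a handful of array operations on a 10,000 by 200 matrix, which is why it runs in well under a second.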
Platform-Integrated Tools #
NinjaTrader, TradeStation, and several third-party tools offer built-in Monte Carlo analysis (see [Walk Forward Experiment thread] [9]).
Platform tools are convenient but often limited in customization. You can't easily add regime-dependent slippage or test multiple sizing rules in parallel. For most traders, the hybrid approach works best: use your platform for backtesting, export the trade list, and run Monte Carlo in Python for the deeper analysis.
What to Include in Your Simulation #
Must-have inputs:
- Trade-by-trade P&L (not aggregated daily returns)
- Commission per trade (round-turn)
- Realistic slippage per trade (vary by instrument and volatility)
- Position sizing rule (fixed fractional, ATR-based, or fixed contracts)
Nice-to-have inputs:
- Regime labels (high vol / low vol / trending / choppy)
- Execution timing (intraday vs. overnight holds)
- Margin requirements (especially for futures traders who need to stay above maintenance)
Building a Monte Carlo Workflow for Your Trading #
Step 1: Collect at least 200 trades from your backtested strategy with all costs included. Fewer trades means less reliable simulations.
Step 2: Choose your resampling method. For most futures traders, block bootstrap (blocks of 5-10 consecutive trades) is the best balance of realism and simplicity.
Step 3: Run simulations at 3-5 different position sizes. Compare the 95th percentile drawdown for each.
Step 4: Add execution stress testing. Re-run with 50% higher slippage and see how the drawdown distribution shifts.
Step 5: Set your size based on the worst-case you can handle. If the 95th percentile drawdown at 1% risk per trade is 25% and you can handle 15%, drop to 0.5% or 0.6%.
Step 6: Re-run quarterly. Market regimes change. Your trade distribution evolves. Monte Carlo is a living analysis, not a one-time exercise.
The Bottom Line #
Monte Carlo simulation does one thing exceptionally well: it strips away the false confidence that comes from a single backtest. Your strategy's historical equity curve is just one roll of the dice. Monte Carlo shows you all the other rolls — the good paths, the ugly paths, and the paths that would have killed your account.
For futures traders, where leverage turns modest drawdowns into account-threatening events, Monte Carlo is not optional. It's the difference between position sizing based on hope ("I'll probably never draw down more than 15%") and position sizing based on data ("there's a 5% chance of a 25% drawdown, and I've capitalized accordingly").
The tool doesn't make your strategy better. It makes your expectations realistic. And realistic expectations are the foundation of every trading account that lasts longer than a year.
As @Fat Tails summarized after extensive Monte Carlo analysis on NexusFi: the simulation "can support a higher leverage with equal risk — in particular in a bad run situation" ([Why 7% Thread] [5]). That's the real value. Not maximizing returns, but finding the sizing level where your edge can compound without the bad path destroying your account first.
Citations
- [1] Monte Carlo Simulation (2010): “Now that NinjaTrader 7 features a Monte Carlo simulator, I was wondering if the math majors in the forum could help me better understand the appropriate way to use it and incorporate it into my trading. As you probably know, I suck at math.”
- [2] Ninja Trader Monte Carlo (2011): “If you have limited data available a Monte Carlo Simulation (fancy word) can be used to increase the reliability of your backtest. This is what the Monte-Carlo-Simulation does: Let us assume that your backtest includes 200 trades.”
- [3] Why 7% is the Difference between Failure and Success in Trading (2012): “Luger: I think that we are talking about two different things here .... (a) the risk that the trading system is correctly represented by the sample (b) the risk derived from the variance of the sample How good is the sample? The sample trades are tho...”
- [4] KJ Trading Systems Kevin Davey - Ask Me Anything (AMA) (2017): “Simple explanation: You run a backtest, and you get a sequence of trades, and from that you build an equity curve. From that equity curve, you know your return, your max drawdown, etc.”
- [5] Why 7% is the Difference between Failure and Success in Trading (2012): “As far as my understanding goes, Anagami has presented Monte Carlo Simulations. The worst path on the chart allows for an estimation of the maximal drawdown.”
- [6] Why 7% is the Difference between Failure and Success in Trading (2012): “I subscribe to what you have written. The Bernoulli distribution is a special case, which can be used to build a simple model.”
- [7] Taking a Trading System Live (2013): “In the last post on Position Sizing, I determined that using fixed fractional sizing with X=ff=.175 was my best alternative.”
- [8] Taking a Trading System Live (2013): “No doubt Kevin will have a much more detailed answer for you, he is much more organized, detailed/methodical with this than I am, so I would wait for his reply if I were you...”
- [9] Walk Forward Experiment (2012): “Sure. The output of the NT Monte Carlo tool are CDFs - Cumulative Distribution Functions. Remember the bell curve (Normal or Gaussian distrinution)? If you sum up the Normal distribution from all the way left to all the way right, you get the CDF.”
