Resurrecting an old thread. I was going to post this on 'Probability 101' but that's not an Elite thread, so I opted for this one instead.
For those who can't be bothered to read the lengthy text, skip to the TL;DR section.
After reading the excellent article Kevin wrote here, I remembered I had been wanting to put together a prototype that provides some perspective on risk, reward and reliability.
When I say 'reliability' I mean this: assuming that one has found an edge that provides a certain winning percentage with a fixed risk/reward ratio, the natural tendency is first to do the math and work out the hypothetical PnL. I think we're all guilty of having done it at least once :).
But one thing that tends to get overlooked is that, even with a relatively good win % and a fixed R/R target combination, the results won't be linear. Because the distribution of winners and losers is random, one could even end up losing money, especially on a small sample size.
I was particularly interested in what the deviation from the theoretical win rate might look like. What I expected was that the greater the sample size, the more accurate the results (i.e., the closer the trade distribution gets to the theoretical win rate). What I didn't expect was that even with a sample size I would consider reasonable, the deviations can be significant. For example, with a sample size of 50 trades there can be a 20% difference between the theoretical win rate and the trade sample results!
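To make this concrete outside Excel, here is a minimal Monte Carlo sketch in Python. This is my own illustration of the same binomial experiment, not the spreadsheet logic itself; the 50% win rate and the sample sizes are just example inputs:

```python
import random

def max_win_rate_deviation(p_win: float, n_trades: int, iterations: int = 10_000) -> float:
    """Largest absolute gap between the theoretical win rate and the
    observed win rate across many randomly generated trade sequences."""
    worst = 0.0
    for _ in range(iterations):
        # One trade sequence: count how many trades happened to be winners.
        wins = sum(1 for _ in range(n_trades) if random.random() < p_win)
        worst = max(worst, abs(wins / n_trades - p_win))
    return worst

# Deviation shrinks as the sample grows, but at 50 trades a gap of 20
# percentage points or more from a 50% theoretical win rate can easily show up.
for n in (10, 50, 200, 1000):
    print(f"{n:>5} trades: max deviation {max_win_rate_deviation(0.5, n):.1%}")
```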
I think this puts things in a new light when one is tempted to quickly work out the average trade value and go "I have this win rate; if I stick to this R:R ratio, I will be rich!" (we'll call that the 'Utopia scenario').
So last week I put together an Excel model, which is what provided the insight above. The model has 10 sample sizes ranging from 10 to 1000 trades.
File structure
Where possible I tend to avoid macros/VBA, so I thought I'd try something different: the use of somewhat obscure Excel features, namely circular references and iterations. Basically, whenever new values are inserted into the spreadsheet, you need to press F9 to calculate the results. This is needed for two reasons: (1) to calculate the iteration at hand; (2) to add the iteration at hand to the 'best case' or 'worst case' series.
Because this file uses circular references (cells with self-referential formulae), you need to have the file open on its own, i.e. no other Excel files should be open. If you have other Excel files open, you may get a "Circular reference" error, or the other files may stop calculating automatically while they are open.
Once the input parameters have been entered/adjusted, pressing F9 makes the model calculate various metrics such as % win rate, max consecutive wins/losses, PnL, expectancy and average trade value. Note that for expectancy I used Kevin's definition from the article (kevinkdog, I interpreted the (Reward:Risk) portion of the formula as (Reward/Risk), but if that is incorrect please let me know). The average trade value is the metric most people refer to as expectancy. These values can be found in the "This iteration" section.
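For clarity, here is how I read those two metrics, sketched in Python. The expectancy formula is my interpretation of Kevin's definition with (Reward:Risk) taken as reward/risk, as flagged above, so treat it as an assumption rather than gospel:

```python
def expectancy(win_rate: float, reward: float, risk: float) -> float:
    # Assumed reading of Kevin's formula: win% * (reward/risk) - loss%.
    # If (Reward:Risk) means something else in the article, adjust accordingly.
    return win_rate * (reward / risk) - (1 - win_rate)

def average_trade(win_rate: float, reward: float, risk: float) -> float:
    # What most people call expectancy: the expected PnL per trade.
    return win_rate * reward - (1 - win_rate) * risk

# Example: 45% win rate, risking 1 unit to make 2.
print(expectancy(0.45, reward=2, risk=1))     # 0.35
print(average_trade(0.45, reward=2, risk=1))  # 0.35
```

Under this reading, the average trade value is simply the expectancy multiplied by the amount risked, so with risk normalised to 1 unit the two numbers coincide, which may be why the two terms so often get used interchangeably.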
Every time F9 is pressed, each trade sample size is recalculated, so a new set of values is shown in the "This iteration" section.
Whenever a value is smaller or greater than the corresponding value recorded in the "All iterations" section, it is stored there. This is useful for comparing the best and worst case scenarios for metrics such as % win rate and PnL.
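In code terms, each F9 press behaves roughly like one pass of the loop below. Again this is a Python sketch of the logic rather than the spreadsheet itself, and the inputs (45% win rate, 50 trades, 2:1 reward/risk) are just examples: one "recalculation" draws a fresh trade sequence, and the running extremes play the role of the "All iterations" section.

```python
import random

def one_iteration(p_win: float, n_trades: int, reward: float, risk: float) -> dict:
    """One F9 press for a single sample size: a fresh random
    trade sequence and its summary metrics."""
    trades = [reward if random.random() < p_win else -risk for _ in range(n_trades)]
    wins = sum(1 for t in trades if t > 0)
    return {"win_rate": wins / n_trades, "pnl": sum(trades)}

# The "All iterations" section: running best/worst for each metric.
best = {"win_rate": float("-inf"), "pnl": float("-inf")}
worst = {"win_rate": float("inf"), "pnl": float("inf")}

for _ in range(5000):  # roughly, holding F9 down for a while
    result = one_iteration(p_win=0.45, n_trades=50, reward=2, risk=1)
    for metric, value in result.items():
        best[metric] = max(best[metric], value)    # keep any new best case
        worst[metric] = min(worst[metric], value)  # keep any new worst case

print("win rate  worst/best:", worst["win_rate"], best["win_rate"])
print("PnL       worst/best:", worst["pnl"], best["pnl"])
```

As in the model, the extremes move a lot at first and then settle: the longer the loop runs, the rarer new records become, which is the flattening described next.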
By holding F9 down, multiple iterations take place and results are calculated more quickly. After a few seconds the max deviation starts to flatten (meaning there will be fewer iterations outside the deviation); however, there is still room for more deviation, only it will take place more and more sporadically. I tend to stop after having held F9 down for 20 seconds, but the model is built with discretion …