System Quality Number (SQN): Van Tharp's Formula for Measuring Trading System Quality
Overview
The System Quality Number (SQN) is Van Tharp's statistical framework for scoring the quality of any trading system. It combines three inputs — trade frequency, average R-multiple (expectancy), and consistency of results (standard deviation of R) — into a single score that tells you whether your system has a tradeable edge and how aggressively you can size positions. SQN is the primary tool in the Tharp Think framework for connecting system quality to position sizing strategy.
This article covers the SQN formula and its components, score interpretation benchmarks, step-by-step calculation, the 100-trade cap, comparison to Sharpe Ratio and Profit Factor, limitations, and practical application for system evaluation and position sizing. For R-multiple foundations, see R-Multiples. For expectancy as a standalone metric, see Expectancy.
What Is the System Quality Number?
The System Quality Number (SQN) is a single metric that tells you whether a trading system has a genuine, statistically meaningful edge — or whether it just got lucky. Developed by Van K. Tharp, Ph.D., the SQN combines three fundamental dimensions of trading system performance into one score: how often you win (frequency), how consistent your results are (standard deviation), and how good your average trade is (expectancy). Get all three right and the SQN climbs. Have any one of them broken and the number exposes it.
The concept emerged from Tharp's work in the 1990s and was formalized in his book The Definitive Guide to Position Sizing Strategies. Tharp's insight was simple but powerful: profit factor alone doesn't tell you how tradeable a system is. A system with a 2.5 profit factor on 10 trades is not the same beast as a 2.5 profit factor on 500 trades. You need a metric that captures frequency, consistency, and quality simultaneously. The SQN does exactly that.
The deeper purpose of the SQN is not just to evaluate systems in isolation — it's to determine how easily a position sizing strategy can exploit the system's edge. A higher SQN means more room for aggressive position sizing. A lower SQN means you're fighting variance with every bet you place.
The Formula
The SQN formula is a variation of the Student's t-test applied to trading results:
SQN = sqrt(N) × (Mean R / StdDev R)
Where:
- N = number of trades (capped at 100 for systems with more than 100 trades -- more on this below)
- Mean R = average R-multiple across all N trades
- StdDev R = standard deviation of R-multiples across all N trades
The core of the formula — Mean R divided by StdDev R — is the Sharpe Ratio of your R-multiples. Multiply that by the square root of your trade count and you get a metric that rewards systems that produce both quality results and a sufficient volume of trades to validate statistical significance.
Note what the formula does NOT use: dollars, percentages, account size, or trade duration. It works entirely in R-multiples, which makes it instrument-agnostic, timeframe-agnostic, and account-size-agnostic. A futures trader and a stock trader can compare SQN scores directly. That universality is one of its genuine strengths.
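As a minimal sketch, the formula can be written as a small Python function. The function name `sqn`, the `cap` parameter, and the choice of the population standard deviation (dividing by N) are assumptions made for illustration; the sample convention (N - 1) is an equally common choice and gives slightly lower scores.

```python
import math
import statistics


def sqn(r_multiples, cap=100):
    """Return sqrt(min(N, cap)) * mean(R) / stdev(R) for a list of R-multiples."""
    n = len(r_multiples)
    if n < 2:
        raise ValueError("need at least two trades")
    mean_r = statistics.fmean(r_multiples)
    # Population standard deviation (divide by N). This is an assumption;
    # swap in statistics.stdev for the sample (N - 1) convention.
    std_r = statistics.pstdev(r_multiples)
    if std_r == 0:
        raise ValueError("all trades are identical, so SQN is undefined")
    effective_n = n if cap is None else min(n, cap)
    return math.sqrt(effective_n) * mean_r / std_r


sample = [2.0, -1.0, 1.0, -1.0]  # four illustrative trades, far too few in practice
print(round(sqn(sample), 2))  # 0.38
```

Passing `cap=None` disables the 100-trade cap, which is handy for seeing how much the cap changes a score.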
The R-Multiple Foundation
To calculate SQN, you need to express every trade result as an R-multiple first. R is your initial risk — the distance from entry to stop loss in dollar terms. A trade that makes 2x your initial risk is a +2R trade. A trade that stops out is a -1R trade. A trade that you exit at breakeven is 0R.
For more on R-multiples and how to calculate them, see R-Multiples: The Universal Currency of Trading Performance. This article assumes you already know how to convert trade results into R-multiples. If you don't, start there.
The key point for SQN: express every trade as an R-multiple, then calculate the mean and standard deviation of that distribution. Those two numbers, combined with trade count, give you SQN.
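The conversion itself is a one-liner once entry, initial stop, and exit are recorded. The sketch below is illustrative only: the function name `r_multiple` is hypothetical, prices are per unit, and commissions and slippage are ignored.

```python
def r_multiple(entry, stop, exit_price):
    """R-multiple of a single trade from per-unit prices (long or short)."""
    risk = abs(entry - stop)  # initial risk per unit, i.e. 1R
    if risk == 0:
        raise ValueError("entry and stop must differ")
    # A stop below the entry implies a long trade; above implies a short.
    direction = 1 if entry > stop else -1
    return direction * (exit_price - entry) / risk


print(r_multiple(100.0, 98.0, 104.0))  # 2.0  (long, made twice the initial risk)
print(r_multiple(100.0, 98.0, 98.0))   # -1.0 (long, stopped out)
print(r_multiple(50.0, 52.0, 47.0))    # 1.5  (short, made 1.5x the initial risk)
```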
Score Ranges and Interpretation
Van Tharp established the following benchmarks for SQN interpretation:
| SQN Score | Rating | What It Means Practically |
|---|---|---|
| Below 1.6 | Poor | This system has no reliable edge. Don't trade it live. |
| 1.6 -- 1.9 | Below average, tradeable | There might be something here, but it will be a rough ride. Tight position sizing required. |
| 2.0 -- 2.4 | Average | A workable system. Many successful discretionary traders operate in this range. |
| 2.5 -- 2.9 | Good | You have a genuine edge with enough consistency to compound effectively. |
| 3.0 -- 5.0 | Excellent | Outstanding quality. Aggressive position sizing strategies become viable. |
| 5.1 -- 6.9 | Superb | Rare territory. High-frequency systems or extremely consistent strategies. |
| Above 7.0 | Too good to be true | Almost certainly curve-fitted. Verify on out-of-sample data immediately. |
The bottom line on these ranges: most live traders running discretionary futures systems land between 1.5 and 3.0. Algo systems that have been properly walk-forward tested often cluster between 2.0 and 4.0. Anything above 5.0 on live data is notable. Anything above 7.0 on backtest data is a red flag.
One critical nuance: the SQN score is sensitive to the number of trades used. Below 30 trades, the score lacks statistical meaning — the standard deviation estimate is too noisy. Tharp himself suggested 30 trades as the minimum for any meaningful SQN calculation, and many practitioners use 50 as a safer floor. Below 30, the number is basically noise dressed as precision.
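For a trade journal or backtest report, the benchmark table can be turned into a simple lookup. This is a sketch under assumptions: the function name `sqn_rating` is hypothetical, and because the table leaves small gaps between bands (for example between 1.9 and 2.0), each band is treated here as running up to the start of the next one.

```python
def sqn_rating(score):
    """Map an SQN score to the rating labels from the benchmark table."""
    # (exclusive upper bound, label); band edges between the table's
    # printed ranges are an assumption made for this sketch.
    bands = [
        (1.6, "Poor"),
        (2.0, "Below average, tradeable"),
        (2.5, "Average"),
        (3.0, "Good"),
        (5.1, "Excellent"),
        (7.0, "Superb"),
    ]
    for upper, label in bands:
        if score < upper:
            return label
    return "Too good to be true"


for score in (1.2, 1.8, 2.2, 2.7, 3.4, 6.0, 7.5):
    print(score, sqn_rating(score))
```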
Step-by-Step Calculation Example
Let's work through a concrete example. You've taken 20 trades and recorded the following R-multiples:
+2.5, -1.0, +1.8, -1.0, +3.2, -1.0, +1.1, +0.8, -1.0, +2.0, -1.0, +1.5, -1.0, +0.5, +2.8, -1.0, +1.9, +3.5, -1.0, +1.4
Step 1: Calculate Mean R
Sum: 2.5 - 1.0 + 1.8 - 1.0 + 3.2 - 1.0 + 1.1 + 0.8 - 1.0 + 2.0 - 1.0 + 1.5 - 1.0 + 0.5 + 2.8 - 1.0 + 1.9 + 3.5 - 1.0 + 1.4 = 15.0
Mean R = 15.0 / 20 = 0.75R
Step 2: Calculate Standard Deviation of R
Calculate the variance: for each trade, subtract the mean (0.75) and square the result, then average those squared differences. The squared differences sum to 50.49, so the variance is 50.49 / 20 = 2.52 and the standard deviation is approximately 1.59R. (Dividing by N - 1 = 19 instead, the sample convention, gives about 1.63R.)
Step 3: Apply the Formula
SQN = sqrt(20) x (0.75 / 1.59) = 4.47 x 0.472 = 2.11
A score of 2.11 puts this system in the "Average" range. It's tradeable, but 20 trades is below the 30-trade minimum, so you'd want more trades before drawing firm conclusions. The high standard deviation (1.59R) relative to the mean (0.75R) also tells you there's significant variability in outcomes that will make position sizing challenging.
Note that the 8 losing trades were all clean -1R stops (the system respects its stops consistently), while the winning trades varied from +0.5R to +3.5R. The wide distribution of winners is driving the high standard deviation. A system with more consistent winner sizes would have a lower StdDev R and a higher SQN for the same mean R.
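Recomputing the example directly from the 20 listed R-multiples is a useful cross-check on the hand arithmetic. The sketch below uses the population standard deviation (squared deviations averaged over N, as Step 2 describes) and also prints the sample-convention figure for comparison; the variable names are illustrative.

```python
import math
import statistics

trades = [2.5, -1.0, 1.8, -1.0, 3.2, -1.0, 1.1, 0.8, -1.0, 2.0,
          -1.0, 1.5, -1.0, 0.5, 2.8, -1.0, 1.9, 3.5, -1.0, 1.4]

n = len(trades)
mean_r = statistics.fmean(trades)
std_r = statistics.pstdev(trades)        # divide by N
std_r_sample = statistics.stdev(trades)  # divide by N - 1
score = math.sqrt(n) * mean_r / std_r
score_sample = math.sqrt(n) * mean_r / std_r_sample

print(n, round(mean_r, 2))                            # 20 0.75
print(round(std_r, 2), round(score, 2))               # 1.59 2.11
print(round(std_r_sample, 2), round(score_sample, 2)) # 1.63 2.06
```

Either convention leaves this system in the "Average" band; the gap between the two shrinks as the trade count grows.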
Consistency: What Defines System Quality
Two systems can have identical mean R-multiples and yet dramatically different SQN scores. The difference is consistency. System A produces steady +0.8R to +1.2R winners with occasional -1.0R stops — tight distribution, high SQN. System B has the same mean R but gets there with a few massive +5R outliers and a lot of scratches and small losers — wide distribution, low SQN.
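The NexusFi community has discussed this property at length. @caprica put it clearly in an early thread on SQN optimization for NinjaTrader: "The smaller the Std dev (P&L), the more regular are your results and the smaller are the drawdowns. If you optimize for the largest SQN, you maximize in fact the product N*average P&L and you minimize the Std dev (P&L) and the drawdowns at the same time."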
That's the insight: optimizing for SQN simultaneously maximizes expectancy AND minimizes variance. No other single metric does both at once.
The 100-Trade Cap: Why Van Tharp Introduced It
One of the less-discussed aspects of SQN is the 100-trade cap. For systems with more than 100 trades, Tharp recommended using sqrt(100) = 10 instead of sqrt(actual trade count). This modifies the formula to:
SQN = 10 x (Mean R / StdDev R) (for systems with 100+ trades)
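Why would you cap it? Because without the cap, a high-frequency system with 2,000 trades scores about 6.7 ("Superb" territory) even with a modest Mean R / StdDev R ratio of 0.15, since sqrt(2000) is about 44.7. That 6.7 looks amazing but tells you almost nothing useful about how good the system actually is compared to one with 200 trades and the same Mean R / StdDev R ratio (which would score ~2.1 without a cap). With the cap applied, both systems score 10 x 0.15 = 1.5.
The cap prevents frequency from dominating the score. @Barz explained the practical reason on NexusFi: "If you had 900 trades, you can get a great looking SQN for a not so great system. So, Tharpe says you should cap your number of trades at 100 so you don't get abnormally large SQN values."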
The cap is also defensible statistically: once you have 100 trades, the sqrt(N) factor has already accounted for statistical confidence. Additional trades continue to refine your Mean R and StdDev R estimates, but the sqrt multiplier has done its job. The key variables at that point are Mean R and StdDev R, which the cap preserves.
Practically, apply the cap consistently: if your system produces 200 trades per year, you're using 10 as your multiplier regardless. If it produces 40 trades per year, you're using sqrt(40) ~= 6.32. This is one reason SQN is more useful for comparing systems with similar trade frequencies — comparing a 50-trade system (multiplier 7.07) to a 150-trade system (multiplier 10) requires awareness that the multipliers differ.
@SMCJB raised the key question about this cap on NexusFi: "Why cap the number of trades at 100? Is a system that does 120 trades no better than 100? Why is a system that does 100 trades only 15% better than one that does 75, but one that does 75 is 22.5% better than one that does 50?" The cap is an acknowledged compromise — a practical solution to inflation, not a mathematically perfect one. Van Tharp reportedly acknowledged that he later revised some of his thinking on the exact cap value.
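The effect of the cap is easiest to see by holding the Mean R / StdDev R ratio fixed and varying only the trade count. The helper name `sqn_from_ratio` is hypothetical and exists only for this illustration.

```python
import math


def sqn_from_ratio(ratio, n_trades, cap=None):
    """SQN for a given Mean R / StdDev R ratio and trade count."""
    effective_n = n_trades if cap is None else min(n_trades, cap)
    return math.sqrt(effective_n) * ratio


ratio = 0.15  # the same modest edge in every row
for n_trades in (50, 200, 2000):
    uncapped = sqn_from_ratio(ratio, n_trades)
    capped = sqn_from_ratio(ratio, n_trades, cap=100)
    print(n_trades, round(uncapped, 2), round(capped, 2))
# 50 1.06 1.06
# 200 2.12 1.5
# 2000 6.71 1.5
```

Uncapped, the same edge moves from "Poor" to "Superb" purely on trade count; capped, every system above 100 trades is judged on its ratio alone.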
Using SQN to Monitor Edge Degradation
One of SQN's most powerful real-world applications is rolling edge monitoring. Don't just compute SQN on your full trade history. Calculate it on a rolling 30-trade or 100-trade window. A system that's losing its edge will show a declining rolling SQN before the equity curve turns negative. This gives you an early warning system for regime changes or edge deterioration — often weeks before your account balance reflects the problem.
Set your alert threshold at 2.0: if your rolling SQN drops below 2.0, that's a yellow flag. If it drops below 1.6, you're in red flag territory and position sizing should be reduced immediately. Don't wait for a string of losses to tell you what the SQN already knew.
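A rolling monitor along these lines might look like the sketch below. The function names, the synthetic trade history, and the use of the population standard deviation are all assumptions for illustration; the 2.0 and 1.6 thresholds are the ones described above.

```python
import math
import statistics


def rolling_sqn(r_multiples, window=30, cap=100):
    """SQN of each consecutive `window`-trade slice, oldest first."""
    scores = []
    for end in range(window, len(r_multiples) + 1):
        chunk = r_multiples[end - window:end]
        std_r = statistics.pstdev(chunk)
        if std_r == 0:
            scores.append(float("nan"))  # undefined when every trade is identical
            continue
        scores.append(math.sqrt(min(window, cap)) * statistics.fmean(chunk) / std_r)
    return scores


def flag(score, yellow=2.0, red=1.6):
    """Traffic-light label for one rolling SQN value."""
    if math.isnan(score):
        return "undefined"
    if score < red:
        return "red"
    if score < yellow:
        return "yellow"
    return "green"


# Synthetic history: 30 healthy trades followed by 30 degraded ones.
history = [2.0, -1.0, 1.5] * 10 + [1.0, -1.0, -1.0] * 10
scores = rolling_sqn(history, window=30)
print(round(scores[0], 2), flag(scores[0]))  # 3.48 green
print(flag(scores[-1]))                      # red
```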
SQN and Market Type Classification
One application of SQN that often surprises traders is its use in Van Tharp's Market Type classification system. Tharp categorized markets into six types based on trend direction and volatility: bull quiet, bull volatile, bear quiet, bear volatile, sideways quiet, and sideways volatile. The SQN of the market itself — calculated on a rolling basis using percentage daily returns rather than trade R-multiples — determines which type you're in.
The math is the same formula, applied to market returns instead of trade results:
Market SQN = sqrt(N) x (Mean Daily Return / StdDev Daily Return)
For a 25-day rolling window, N=25 and sqrt(25)=5. The resulting score determines direction (positive = bull, negative = bear, near zero = sideways) while volatility (the standard deviation component) determines whether the market is quiet or volatile.
The important practical use: trade your system's strengths. If you've backtested your system and found it produces SQN of 3.5 in bull quiet conditions but 0.8 in sideways volatile conditions, you have a clear filter for when to trade and when to sit on your hands.
The market SQN and your system SQN become complementary tools.
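As a rough sketch, the market SQN can be computed from a list of daily closes. The function names and the synthetic price series are illustrative, and the `sideways_band` value in particular is a hypothetical placeholder, not one of Tharp's published thresholds.

```python
import math
import statistics


def market_sqn(closes, window=25):
    """Market SQN over the most recent `window` daily percentage returns."""
    if len(closes) < window + 1:
        raise ValueError("need window + 1 closes to form `window` returns")
    recent = closes[-(window + 1):]
    returns = [(b / a - 1.0) * 100.0 for a, b in zip(recent, recent[1:])]
    std = statistics.pstdev(returns)
    if std == 0:
        raise ValueError("returns are constant, so market SQN is undefined")
    return math.sqrt(window) * statistics.fmean(returns) / std


def direction(score, sideways_band=0.5):
    """Bull / bear / sideways from the sign of the score.

    sideways_band is a hypothetical cutoff chosen for this sketch.
    """
    if score > sideways_band:
        return "bull"
    if score < -sideways_band:
        return "bear"
    return "sideways"


# Synthetic closes: alternating +1% and flat days over 25 sessions.
closes = [100.0]
for day in range(25):
    closes.append(closes[-1] * (1.01 if day % 2 == 0 else 1.0))

score = market_sqn(closes)
print(round(score, 2), direction(score))  # 5.2 bull
```

The quiet versus volatile half of the classification would need a separate volatility measure, which is left out here.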
SQN vs. Sharpe Ratio
The Sharpe ratio is the industry standard risk-adjusted return metric. SQN is the trader's alternative. Here's how they compare:
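What they share: Both metrics divide a mean result by its standard deviation. The ratio Mean/StdDev appears in both formulas. In this sense, SQN = Sharpe x sqrt(N), where "Sharpe" here means the per-trade ratio of Mean R to StdDev R with no risk-free rate subtracted.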
Where they differ:
- Input data: Sharpe uses time-period returns (daily, weekly, monthly) measured against a risk-free rate. SQN uses trade-level R-multiples with no risk-free rate adjustment. This makes Sharpe sensitive to trade frequency (a system with 10 trades per day will have very different period returns than one with 1 trade per month), while SQN's Mean R / StdDev R core is computed per trade and does not depend on how trades are spread over time. Trade count enters SQN only through the sqrt(N) term.
- Symmetry: Standard Sharpe treats upside and downside volatility equally. A system that occasionally has a massive +10R winner will have a high standard deviation, which hurts its Sharpe even though large winners are good. SQN has the same symmetry problem, so this is a shared weakness rather than a difference. For this reason, some practitioners prefer the Sortino ratio (downside deviation only) for a more meaningful risk adjustment.
- Statistical basis: The Sharpe ratio was designed for portfolio-level analysis. SQN is designed for trade-level system evaluation. Using Sharpe to compare trading systems is comparing tools across different problems.
NexusFi member @SMCJB laid this out clearly in a trading journal thread: "Sharpe Ratio = Average (Return - Risk Free Return) / Standard Deviation (Return - Risk Free Return). SQN = sqrt(#Trades) x mean / sd. They share the same Mean/StdDev core but differ in input and scale." Modern finance has shown that downside volatility matters more than total volatility, which is a valid criticism of both metrics.
The practical guidance: use Sharpe when comparing portfolios and funds, where time-period returns make sense. Use SQN when evaluating individual trading systems at the trade level, where R-multiples are more meaningful than time-period returns. They answer different questions.
SQN vs. Profit Factor
Profit factor (gross profit divided by gross loss) is perhaps the most common quick-check metric for trading systems. It has one enormous advantage over SQN: it's dead simple to calculate and intuitively meaningful. A profit factor of 1.5 means for every $1 you lose, you make $1.50. Below 1.0 and you're losing money. Above 2.0 and you're doing well.
But profit factor hides critical information that SQN exposes:
Profit factor doesn't account for trade count. A profit factor of 2.0 across 15 trades is statistically meaningless. The same profit factor across 500 trades is strong. SQN's square root term explicitly rewards statistical significance — a system with 50 trades and a given Mean R / StdDev R scores higher than the same system with only 20 trades.
Profit factor doesn't account for consistency. Two systems can have identical profit factors but wildly different risk profiles. System A has consistent +2R winners and -1R losers. System B has occasional +20R home runs mixed with -1R losers and long streaks of scratches. Identical profit factors, but System A is far more tradeable. SQN captures this through the standard deviation term — System B's high variance lowers its SQN even if its Mean R is competitive.
Profit factor is sample-dependent. If you take a 10-trade subsample vs. a 200-trade full sample, profit factor can vary wildly. SQN is designed to be more robust to sample size differences through the sqrt(N) normalization.
For system comparison purposes, use profit factor as a quick initial screen, then SQN for deeper evaluation. A system that fails the SQN test but passes the profit factor test usually means you have either too few trades or too much variance in winners.
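The System A versus System B contrast can be made concrete with two synthetic 100-trade ledgers that share the same profit factor and the same Mean R. The trade lists and function names below are invented for illustration, and the population standard deviation is assumed.

```python
import math
import statistics


def profit_factor(r_multiples):
    """Gross profit divided by gross loss, both measured in R."""
    gross_profit = sum(r for r in r_multiples if r > 0)
    gross_loss = -sum(r for r in r_multiples if r < 0)
    return gross_profit / gross_loss


def sqn(r_multiples, cap=100):
    n = min(len(r_multiples), cap)
    return math.sqrt(n) * statistics.fmean(r_multiples) / statistics.pstdev(r_multiples)


# System A: steady +2R winners and -1R losers.
system_a = [2.0] * 50 + [-1.0] * 50
# System B: rare +20R home runs, -1R losers, and many scratches.
system_b = [20.0] * 5 + [-1.0] * 50 + [0.0] * 45

for name, trades in (("A", system_a), ("B", system_b)):
    print(name, profit_factor(trades), statistics.fmean(trades), round(sqn(trades), 2))
# A 2.0 0.5 3.33
# B 2.0 0.5 1.11
```

Identical profit factor and expectancy, yet System A rates "Excellent" and System B rates "Poor", entirely because of the standard deviation term (1.5R versus 4.5R).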
For a deeper dive on profit factor mechanics and its relationship to win rate and average win/loss ratios, see Profit Factor: The One Ratio That Tells You If Your Winners Are Big Enough.
SQN vs. Expectancy
Expectancy is the average R-multiple per trade — the Mean R term in the SQN numerator. If your Mean R is 0.75R, your expectancy is 0.75R per trade. Expectancy tells you the theoretical value of each trade if you could play an infinite series of them at a fixed fraction of capital.
SQN extends expectancy in three important ways:
- Consistency adjustment: A Mean R of 0.75R with StdDev R of 0.50R is a much better system than Mean R of 0.75R with StdDev R of 2.0R. Expectancy treats both identically. SQN rewards the consistent system.
- Statistical significance filter: Expectancy on 15 trades is noise. Expectancy on 200 trades is signal. SQN's sqrt(N) term explicitly scales with confidence -- a system with more trades has earned a higher score for the same Mean R / StdDev R ratio.
- Position sizing relevance: Tharp's primary use for SQN was determining how aggressively you can position size. High SQN = tighter variance around expectation = safer to use larger position fractions. Low SQN = wide variance around expectation = must use smaller fractions or face ruin risk. Expectancy alone can't tell you this.
Think of it this way: expectancy tells you the direction of your edge. SQN tells you how reliably and quickly that edge will manifest at the trade level.
For the full treatment of expectancy as a standalone metric, see Expectancy: The Single Number That Tells You Whether Your Trading System Actually Works.
How SQN Score Guides Position Sizing
The relationship between SQN and position sizing is direct: higher SQN allows larger position fractions per trade. This is the core practical application Tharp built SQN for — not just to rank systems, but to determine how aggressively to exploit each system's edge.
The rough guidance:
- SQN below 1.6: Don't trade it. If you must, keep position size at 0.1-0.25% equity per trade maximum.
- SQN 1.6-2.0: Conservative sizing only. 0.25-0.75% equity per trade. Variance will create painful drawdowns even though the edge is real.
- SQN 2.0-2.5: Moderate sizing. 0.5-1.5% equity per trade. This is where most professional discretionary traders operate.
- SQN 2.5-3.5: Good sizing. 1-2.5% equity per trade. The edge is consistent enough to compound meaningfully.
- SQN 3.5+: More aggressive sizing is viable if the system has sufficient live trade history. 2-4% equity per trade with proper Kelly-based sizing.
These aren't hard limits — they depend on your drawdown tolerance, account size, and risk of ruin calculations. But the pattern is consistent: every point of SQN improvement creates meaningful room for larger position fractions without proportional increase in ruin risk. The relationship is non-linear — going from SQN 2.0 to SQN 3.0 creates far more position sizing headroom than going from SQN 4.0 to SQN 5.0.
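The rough guidance above can be encoded as a lookup that feeds a basic fixed-fractional size calculation. This is only a sketch: the names `RISK_BANDS`, `risk_band`, and `contracts_for` are hypothetical, the account figures are made up, and the percentages are the article's rough ranges rather than a recommendation.

```python
import math

# (exclusive upper SQN bound, (min %, max %) of equity risked per trade),
# copied from the rough guidance above.
RISK_BANDS = [
    (1.6, (0.1, 0.25)),
    (2.0, (0.25, 0.75)),
    (2.5, (0.5, 1.5)),
    (3.5, (1.0, 2.5)),
    (math.inf, (2.0, 4.0)),
]


def risk_band(sqn_score):
    """Return the (min %, max %) equity-risk range for an SQN score."""
    for upper, band in RISK_BANDS:
        if sqn_score < upper:
            return band
    return RISK_BANDS[-1][1]


def contracts_for(equity, risk_pct, risk_per_contract):
    """Whole contracts such that total initial risk stays within risk_pct of equity."""
    dollars_at_risk = equity * risk_pct / 100.0
    return math.floor(dollars_at_risk / risk_per_contract)


low, high = risk_band(2.11)
print(low, high)                              # 0.5 1.5
print(contracts_for(100_000.0, low, 250.0))   # 2
print(contracts_for(100_000.0, high, 250.0))  # 6
```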
For more on the mechanics of translating system quality into position sizing decisions, see Position Sizing Methods for Futures Trading.
Limitations of the SQN
No single metric captures everything. The SQN has well-documented weaknesses that practitioners need to understand:
1. Symmetry problem. Standard deviation treats upside and downside volatility equally. A system with occasional massive winners will have a higher standard deviation, which reduces SQN — even though those large winners are exactly what you want. This is the same criticism leveled at the Sharpe ratio. The Sortino ratio (using only downside deviation) or a modified SQN using semi-deviation would address this, but Tharp's standard formula doesn't go there.
2. No drawdown information. SQN tells you nothing about the path of returns. A system with SQN 3.0 might have a 40% maximum drawdown or a 5% maximum drawdown — the score doesn't distinguish. For position sizing purposes, you need to layer in Monte Carlo analysis and maximum adverse excursion data alongside SQN. See Monte Carlo Simulation for Trading Systems for how to stress-test the path.
3. Backward-looking only. Like all performance metrics, SQN describes what already happened. A 3.5 SQN on historical data doesn't guarantee future results. Walk-forward testing and out-of-sample validation are essential before concluding you have a live-tradeable system. The most insidious form of curve-fitting produces systems with excellent historical SQN that fall apart in live trading because they were over-optimized to past data.
4. Above 7.0, trust nothing. Any system with an SQN above 7.0 on backtested data should be treated with extreme skepticism. This score almost certainly reflects curve-fitting, look-ahead bias, or data-mining bias rather than genuine edge. Tharp himself noted this — a score of "keep this up and you may have the Holy Grail" was meant partly as a warning, not just a compliment.
5. The 100-trade normalization creates comparison challenges. The cap itself is smooth at the boundary: a system with 99 trades uses sqrt(99) ~= 9.95 as its multiplier and a system with 101 trades uses sqrt(100) = 10, a negligible difference. The real comparison problem sits below the cap. A system with 50 trades uses sqrt(50) ~= 7.07 and one with 100 uses 10, a 41% difference in multiplier for the same Mean R / StdDev R ratio. Be aware of the relative multipliers when comparing.
6. R-multiple measurement requires discipline. SQN is only as good as your R-multiple tracking. If you're not disciplined about recording initial risk (your stop distance at entry) for every trade, your R-multiples will be noisy. Partial fills, stop moves, re-entries, and scaling trades all create measurement complexity. The formula is straightforward; the data discipline required to feed it correctly is not.
Backtest SQN vs. Live SQN: The Collapse Pattern
The gap between backtested SQN and live SQN is one of the most reliable indicators of how much curve-fitting occurred during system development. Well-tested systems typically retain 70-80% of their backtested SQN in live trading. Heavily curve-fitted systems can collapse from a backtest SQN of 6+ to a live SQN below 1.0.
The pattern to watch for: any system with backtest SQN above 4.0 that hasn't been walk-forward validated across at least 3 distinct out-of-sample periods deserves serious skepticism. The reasoning is simple: an optimization over 100 parameters has enough degrees of freedom to fit almost any random walk to an SQN above 4.0. That score tells you about the optimizer's ability, not the system's edge.
Practical workflow: build your system, run your backtest, record the SQN. Then paper trade or micro-position-trade for 50-100 live trades before scaling up. If your live SQN is within 0.5-0.8 points of your backtest SQN, the edge is real. If it's collapsed by more than 30-40%, the backtest was misleading.
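The comparison in that workflow comes down to two numbers: the absolute drop in points and the fraction of the backtest score retained. A minimal sketch, with a hypothetical function name and made-up scores:

```python
def sqn_retention(backtest_sqn, live_sqn):
    """Point drop and retained fraction of a backtest SQN in live trading."""
    if backtest_sqn <= 0:
        raise ValueError("retention is only meaningful for a positive backtest SQN")
    return {
        "point_drop": backtest_sqn - live_sqn,
        "retained": live_sqn / backtest_sqn,
    }


steady = sqn_retention(3.0, 2.4)
print(round(steady["point_drop"], 2), round(steady["retained"], 2))  # 0.6 0.8

collapsed = sqn_retention(6.0, 0.9)
print(round(collapsed["point_drop"], 2), round(collapsed["retained"], 2))  # 5.1 0.15
```

The first case sits inside the 0.5-0.8 point tolerance described above; the second has lost 85% of its backtest score, the collapse pattern this section warns about.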
Common Mistakes When Calculating SQN
Using P&L dollars instead of R-multiples. SQN calculated on raw dollar P&L produces meaningless scores. Your initial risk varies per trade — a $200 winner on a trade where you risked $50 is a +4R trade. The same $200 on a trade where you risked $400 is a +0.5R trade. Treating both as "$200" makes the standard deviation calculation nonsense. Always convert to R-multiples first.
Ignoring the 100-trade cap for high-frequency systems. A scalper taking 30 trades per day for a year has 7,500+ trades. Using sqrt(7500) ~= 86.6 as the multiplier inflates SQN to absurd levels. Apply the cap. The comparison utility of SQN depends on it.
Not accounting for partial wins and losses. If you scale out of positions, each partial exit should be tracked as a separate trade for R-multiple purposes, or you need to calculate a weighted average R-multiple for the full trade. Mixing full-trade and partial-trade R-multiples creates inconsistent data that corrupts the standard deviation calculation.
Conflating backtest SQN with forward SQN. A 4.5 SQN on a backtest is a hypothesis, not a fact. Walk-forward test your system on out-of-sample data. Most systems lose 30-60% of their backtested SQN when exposed to data they weren't optimized on.
Treating SQN as only a pass/fail gate. Use it as a position sizing guide as well. A 3.0 SQN doesn't just tell you the system is good; it also tells you how aggressive your position sizing can be. Every point of SQN improvement above 2.0 meaningfully expands your safe bet-size envelope.
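Two of these mistakes, raw dollars and inconsistent handling of scale-outs, are easy to guard against in code. The sketch below uses hypothetical function names and follows the weighted-average option for scaled exits, treating each exit as a fraction of the original position.

```python
def r_from_dollars(pnl_dollars, initial_risk_dollars):
    """Convert a dollar result into an R-multiple using that trade's own risk."""
    return pnl_dollars / initial_risk_dollars


def scaled_out_r(entry, stop, exits):
    """Weighted R-multiple for a trade closed in pieces.

    exits is a list of (fraction_of_position, exit_price) pairs whose
    fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in exits)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("exit fractions must sum to 1")
    risk = abs(entry - stop)
    direction = 1 if entry > stop else -1  # stop below entry means long
    return sum(fraction * direction * (price - entry) / risk
               for fraction, price in exits)


print(r_from_dollars(200.0, 50.0))   # 4.0
print(r_from_dollars(200.0, 400.0))  # 0.5
# Long from 100 with a stop at 98: half off at 102 (+1R), half at 106 (+3R).
print(scaled_out_r(100.0, 98.0, [(0.5, 102.0), (0.5, 106.0)]))  # 2.0
```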
Practical Application: Evaluating and Improving Systems with SQN
Here's how to put SQN to work in a real system development and evaluation process:
Step 1: Establish your R-multiple ledger. Track every trade with entry price, stop price at entry, and exit price. Convert each trade to its R-multiple result. This is the foundation. Without it, nothing else is meaningful.
Step 2: Calculate rolling SQN. Don't just compute SQN on your full trade history. Calculate it on rolling 30-trade and 100-trade windows. A system that's degrading will show a declining rolling SQN before the equity curve turns negative.
Step 3: Segment by market condition. Calculate SQN separately for different market types (trending vs. ranging, high volatility vs. low volatility, morning session vs. afternoon session). This reveals which conditions your system actually thrives in and which conditions to sit out.
Step 4: Use SQN to guide position sizing decisions. High SQN (3.0+) systems can support more aggressive position sizing — larger fraction of equity per trade, faster scaling of size as equity grows. Lower SQN systems (under 2.0) require conservative sizing to survive variance.
Step 5: Diagnose weak SQN scores. When SQN comes back disappointing, look at the Mean R / StdDev R ratio separately. If Mean R is positive but StdDev R is huge, your system has an edge but wildly inconsistent execution or setup quality. If Mean R is near zero, your system has no edge at all. These diagnose different problems requiring different fixes.
Step 6: Set minimum SQN thresholds for live trading. Many practitioners won't trade a system live unless it's achieved at least a 2.0 SQN on 50+ live trades. This forces discipline about actually tracking results before scaling up.
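Steps 3 and 5 can be combined in one small report: group the ledger by a condition label, then show Mean R, StdDev R, and SQN side by side so a weak score can be traced to its cause. The labels, trade values, and function names below are invented for illustration, and six trades per segment is far below the 30-trade minimum, so the output only demonstrates the mechanics.

```python
import math
import statistics
from collections import defaultdict


def summarize(r_multiples, cap=100):
    """Trade count, Mean R, StdDev R (population), and SQN for one group."""
    n = len(r_multiples)
    mean_r = statistics.fmean(r_multiples)
    std_r = statistics.pstdev(r_multiples)
    score = math.sqrt(min(n, cap)) * mean_r / std_r if std_r > 0 else float("nan")
    return {"n": n, "mean_r": mean_r, "std_r": std_r, "sqn": score}


def sqn_by_segment(ledger):
    """ledger: iterable of (segment_label, r_multiple) pairs."""
    groups = defaultdict(list)
    for label, r in ledger:
        groups[label].append(r)
    return {label: summarize(rs) for label, rs in groups.items()}


ledger = (
    [("trending", r) for r in (2.0, -1.0, 1.5, 2.5, -1.0, 1.0)]
    + [("ranging", r) for r in (-1.0, 0.5, -1.0, -1.0, 1.0, -1.0)]
)

report = sqn_by_segment(ledger)
for label, stats in report.items():
    print(label, stats["n"], round(stats["mean_r"], 2),
          round(stats["std_r"], 2), round(stats["sqn"], 2))
```

A positive Mean R paired with a large StdDev R points at inconsistency, while a Mean R near or below zero points at a missing edge, matching the diagnosis in Step 5.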
Citations
- Van Tharp's SQN with over 100 trades (2019): “With regards to SQN ask yourself two questions: Why cap the number of trades at 100? The whole point of the SQN is to compare one system to another over the same time period to determine which is better.”
- Van Tharp's SQN with over 100 trades (2019): “If you had 900 trades, you can get a great looking SQN for a not so great system. So, Tharpe says you should cap your number of trades at 100 so you don't get abnormally large SQN values.”
- Van Tharp's SQN (system quality number) (2009): “The smaller the Std dev (P&L), the more regular are your results and the smaller are the drawdowns. If you optimize for the largest SQN, you maximize in fact the product N*average P&L and you minimize the Std dev (P&L) and the drawdowns at the same time.”
- Van Tharp's Max Expectancy (2009): “SQN = SquareRoot(N) * (Avg Trade Result/Standard Deviation(Avg Trade Result)). This formula takes frequency (N), reliability (Standard Deviation of avg trade), and expectancy (Avg Trade) and produces an objective score of any system.”
- Tharp Market Type Classification (2021): “I like van Tharp's position sizing, and the market type calculation is the SQN value of that system. So in order to calculate you need a trading system already calculated and the expectations of the system.”
- How advanced mathematics and gaming theory can help you as a trader (2011): “SQN = E * SQRT(N) / (AL * SD) where E = mathematical expectation, AL = average loss or R, SD = standard deviation of trade results. Compare this to the formula of Van Tharp -- it uses similar input variables.”
- iSystems Journal (2020): “Sharpe Ratio = Average (Return - Risk Free Return) / Standard Deviation (Return - Risk Free Return). SQN = sqrt(#Trades) * mean / sd. They share the same Mean/StdDev core but differ in input and scale.”
- Van Tharp Institute: System Quality Number (SQN) (2024)
- Van Tharp Institute: Definitive Guide to Position Sizing Strategies (2024)
- ExPostFacto Library: System Quality Number implementation reference (2024)
