Sorry if I'm sidetracking the micro-time-frame conversation that's going on. I've been trying to apply your testing ideas on a less-than-HFT scale. Specifically, I'm testing 1-3 years of NQ data (with NT8). The timeframes I'm testing range from 50-1000 ticks and 10-60 seconds, and I'm only testing from 8:40 to 11:00 (CST).
Every time I get good results, I find out I was only curve fitting: my strategy breaks with the slightest change in variables. For example, I'm collecting "volatility" (range) like you mentioned early in this thread, and when I enter trades with a max volatility of 10 I'm profitable, but when I change the max volatility to 9 or 11 the strategy falls apart.
I think the problem might lie in my original criteria for entering a trade. My first idea was to get a large data set by entering positions based on almost random criteria. I tried entering long/short when a bar's volume was greater/less than X. I even tried entering randomly every x bars. The idea was that I could filter this set using a metric or combination of metrics, such as your volatility calculation or any number of indicators. I didn't have any luck with this random set. I was able to go from negative to less negative, but I haven't found a viable strategy.
Instead of starting with a random set, I've also tried starting with classic setups that would still yield a great number of entries for me to test. A simple moving average or RSI crossover for instance. This yielded better results than the random entry set, but I still haven't found anything noteworthy.
So here are my questions:
Is it possible to make a large set of random entries into a viable strategy by statistically filtering out losing trades via some metric such as volatility and any other combination of metrics? Or is the key to optimizing this way to have good entry criteria to begin with and build upon that?
Do you have effective methods to guard against curve fitting when you're analyzing?
I have more questions, but I'll leave you with those and let you expound. Thanks again for this awesome thread! I really appreciate all the info.
-Matt
I believe I can offer some assistance. I will give you some detailed instructions on how to get a nice little profitable system off the ground using simple statistics and basic betting logic. If you are considering using random entries, and you like the idea of playing off volatility, then here is how you do it.
I haven't looked at the NQ in a while, but I may look at it next week to see how it moves. This is the first place you have to start. I am not sure if you are using charts, printing stats to the output window and exporting to Excel, etc., but the first step is to get a good sample of data to work with in order to get a baseline understanding of how the instrument moves. Here is what I would recommend.
Create a new strategy and then insert this into the bar update event. Run it on a 1 tick time series and print to your output window. From there, copy and paste into Excel and go Data > Text to Columns > Delimited > Other = /
This will split the data into distinct columns. All you need is a few days' worth of data to get started (ideally more), and from there you can analyze the actual raw data and understand the following:
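If you'd rather skip the Excel step, the same split can be done in a few lines of Python. This is just a sketch: the column order shown is hypothetical, since it depends on how you write your Print statement.

```python
# Split "/"-delimited output-window lines into columns, mimicking
# Excel's Data > Text to Columns with "/" as the delimiter.
# Assumed (hypothetical) column order: time/last/bid/ask/volume
raw_lines = [
    "09:30:01/15000.25/15000.00/15000.25/3",
    "09:30:01/15000.25/15000.00/15000.25/1",
    "09:30:02/15000.50/15000.25/15000.50/2",
]

rows = []
for line in raw_lines:
    time_s, last, bid, ask, vol = line.split("/")
    rows.append({"time": time_s, "last": float(last),
                 "bid": float(bid), "ask": float(ask), "volume": int(vol)})

print(len(rows))          # 3
print(rows[2]["volume"])  # 2
```

Once the rows are parsed like this, every downstream analysis (ranges, thresholds, run lengths) is just list arithmetic.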
1. How many ticks need to occur to make the data move in a significant way. Is this 10, 25, 50, etc.? I recommend starting with 1 tick and exporting to Excel to analyze, because at this level you know that every row of data = 1 tick (trade). So this will give you the most granular data possible.
2. You should try to get a sense of how many trades are typically needed to move a price level (a change in the bid/ask price). Is this 1 trade, 100 trades, etc.? This part is important, because for actual trading you should make your tick setting granular enough to be less than what typically breaks a price level. If you are above this, you will have a hard time capturing key moves. If you were trading 1000 ticks, for example, each bar would likely contain 5, 10, or 20 price level changes inside of it. That would be like walking through a large factory with only a strobe light blinking on and off every 30 seconds to guide you.
3. Get a sense of what range the market typically moves in. You can do this a number of different ways, but the easiest is to create a column that grabs the max value over the last N rows of data, do the same for the min, then subtract the min from the max; this gives you the range. Then look at the data and get a sense of the average, the upper and lower ends, and the overall distribution. I would expect you to see periods of tight ranges and periods of high ranges, but you need to understand what the typical distribution is.
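Step 3 can be sketched in plain Python as a rolling max-minus-min; the prices and the window size N below are placeholder values, not NQ data:

```python
# Rolling range: max minus min over the last N rows of prices.
prices = [100, 101, 100, 102, 105, 104, 103, 108, 107, 106]
N = 5  # lookback window (placeholder; tune it to your tick data)

ranges = []
for i in range(N - 1, len(prices)):
    window = prices[i - N + 1 : i + 1]
    ranges.append(max(window) - min(window))

avg_range = sum(ranges) / len(ranges)
print(ranges)     # one range reading per window
print(avg_range)  # the average of those readings
```

From here a histogram of `ranges` gives you the distribution he describes: the typical value, the upper end, and the lower end.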
From this analysis what you should find are:
Low Volatility Ranges: What is the threshold for this (< 3 ticks, < 5 ticks, < 8 ticks, etc.)? How long do these typically last: 50 rows, 100 rows, etc.?
High Volatility Ranges: What is the threshold for this (> 5 ticks, > 10 ticks, > 15 ticks, etc.)? How long do these typically last: 50 rows, 100 rows, etc.?
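To answer the "how long do these typically last" question, you can label each range reading and count run lengths. A Python sketch, where the 5-tick threshold is an arbitrary placeholder you would replace with the line in the sand from your own data:

```python
# Label each rolling-range reading low/high volatility and measure
# how many consecutive rows each regime persists for.
ranges = [2, 2, 3, 2, 8, 9, 7, 2, 2, 2, 2, 10]
LOW_VOL_THRESHOLD = 5  # placeholder threshold in ticks

runs = []  # (label, run_length) pairs
current_label, length = None, 0
for r in ranges:
    label = "low" if r < LOW_VOL_THRESHOLD else "high"
    if label == current_label:
        length += 1
    else:
        if current_label is not None:
            runs.append((current_label, length))
        current_label, length = label, 1
runs.append((current_label, length))

print(runs)  # [('low', 4), ('high', 3), ('low', 4), ('high', 1)]
```

The distribution of the run lengths, per label, tells you how long each regime typically holds once it appears.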
There is no right or wrong way to quantify this, but the very important part is that your analytics and trading signals are tuned to your data size and time frame. For example, a 5 period average may mean x on a 10 tick time series but y on a 50 tick series. This is why I recommend starting at the most granular level possible. Once you analyze the data and figure out where to draw the lines in the sand, here is how you apply it to your trading.
If you can understand the typical way a given instrument moves (for example: a typical low volatility period is X, with a tick range of < Y), you can then use this to place prop bets. To get a clean prop bet you have to hold all other variables constant. To this end I recommend the following entry system.
Random rnd = new Random();
int rando = rnd.Next(1, 10); // random integer from 1 to 9 (the upper bound is exclusive)

// Long on five of the nine possible values, short otherwise. Note the
// parentheses: they make the OR group evaluate as a whole before it is
// combined with whatever conditions follow the &&.
if ((rando == 1 || rando == 3 || rando == 6 || rando == 8 || rando == 9) &&
With this entry system you isolate your prop bet down to only a single variable. (Sizing your Exit).
Here is the bet:
If I know the market is going to be in a tight range (let's say 5 ticks or under) for the next N bars (and you will learn the exact thresholds from analyzing the data as I described), then you can give yourself a serious betting advantage by doing the following:
1. Set the profit target inside the low volatility range.
2. Set the stop loss outside of the low volatility range.
Again, you will need to do some experimenting to find the best sweet spots, but the general idea will give you a serious edge and you will beat the expectancy handily. All you are doing is a basic prop bet that the market will stay in a tight range, and from your research you will know how long it will typically stay in this range and what the right threshold is.
When the market moves into a high volatility range, you just run the opposite logic (a higher profit target vs. a lower stop loss). The idea here is that every move (up or down) will be large, so the probability of a big move is equal to the probability of a small move. With random entries you have even odds of hitting both your PT and SL, except your PT is larger than your SL. So you will beat the expectancy here as well.
For medium volatility (Range that can't clearly be defined as high or low), just wait this period out (Don't trade). There is no real betting edge here. You should only concern yourself with the two prop bets I mentioned.
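The betting logic behind both prop bets can be sanity-checked with basic expectancy arithmetic. A Python sketch; every probability and tick value below is an illustrative assumption, not a measured NQ statistic:

```python
# Expectancy of a bet, in ticks: P(win) * reward - P(loss) * risk.

def expectancy(p_win, reward_ticks, risk_ticks):
    return p_win * reward_ticks - (1 - p_win) * risk_ticks

# Low-volatility bet: PT inside the range, SL outside it. The market
# tends to stay inside the range, so the small target is hit far more
# often than the wide stop (0.80 is an assumed hold probability).
low_vol = expectancy(p_win=0.80, reward_ticks=3, risk_ticks=8)

# High-volatility bet: with random entries the odds of touching PT or
# SL are roughly even, but the PT is set larger than the SL.
high_vol = expectancy(p_win=0.50, reward_ticks=10, risk_ticks=6)

print(low_vol)   # ~0.8 ticks per bet
print(high_vol)  # 2.0 ticks per bet
```

Both bets have positive expectancy for different reasons: the low-vol bet wins on hit rate, the high-vol bet wins on payoff asymmetry. Your data analysis is what justifies the assumed probabilities.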
I have been printing to output and exporting to Excel just like you say. Then I'll use data/pivot tables and Solver to find my low/med/high volatility thresholds. But you made me realize I've been missing 2 major things. I never went down to 1 tick granularity to observe price movement (30 ticks was the lowest I was testing), so there was no way for me to dial in a precise line in the sand, as you say. The other thing is that I was basing what I thought the NQ's average range/rotation was on my experience watching charts. I need to look at the 1 tick data and figure this range out like you say.
And thanks for the random entry code. I like that.
Here are a couple more questions for you:
How often do you optimize your strategy with new market data? Do you input new data every day/week and recalibrate your PT and SL values?
What sample sizes do you optimize on to dial in your PT and SL levels? I have about three years I can test over - should I optimize over the entire 3 years, or something like 3 separate 1-year optimizations?
Thanks again! I'm going to go study some 1-tick data.
Getting a large sample size is going to be tricky with 1 tick data, so to start off I would recommend just sampling 1 day from 5-10 different months in different years, from different seasons and instrument contracts. This will give you a decent cross section, random enough to capture enough market conditions to be statistically meaningful.
Once you get some ideas at this level, you can back the data off to 25-50 ticks and run months or even years, just to further test and validate your original assumptions. One thing you can do to cut down on the data size when printing to your output window is to filter it. Write a simple if statement that captures only rows of data where the bid/ask price changes from the previous bid/ask price. To do this, create a couple of variables (LastBid / LastAsk) and set them at the end of your OnBarUpdate code; then on each update, compare GetCurrentBid to LastBid. If they are != then run your print; if they are == then skip it. This will capture only the key changes, so you get less extraneous data.
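The same change-only filter can also be applied after the fact to a log you have already exported. A Python sketch (the row layout here is an assumption):

```python
# Keep only rows where the bid or ask changed from the previous row,
# discarding ticks that traded at an unchanged price level.
rows = [
    {"bid": 100.00, "ask": 100.25, "volume": 1},
    {"bid": 100.00, "ask": 100.25, "volume": 3},  # no change -> dropped
    {"bid": 100.25, "ask": 100.50, "volume": 2},  # level moved -> kept
    {"bid": 100.25, "ask": 100.50, "volume": 5},  # no change -> dropped
]

filtered = []
last_bid = last_ask = None
for row in rows:
    if row["bid"] != last_bid or row["ask"] != last_ask:
        filtered.append(row)
    last_bid, last_ask = row["bid"], row["ask"]

print(len(filtered))  # 2
```

Filtering after export keeps the raw log intact in case you later want the per-trade detail back.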
And the key thing you are looking for in all of this is: Finding a good prop bet. Let me give you the following 2 examples so you will understand what you are looking for and what you are not looking for.
1. What you are not looking for: You find a very tiny/tight range of x ticks that the market will sometimes contract down to. This represents extremely low volatility. However, the market will only stay in this range for a short period of time, so nearly as soon as it pops down to this level, it pops back out to a wider range.
2. What you are looking for: You find a small range (yet not the smallest possible) that the market will often pop down to and hold in for a long period of time. Long enough that you could make several prop bets while the market holds in this range.
So in scenario 1 the threshold may be 1-2 ticks, but in scenario 2 the threshold may be 3-7 ticks. (I admittedly don't know the lines in the sand for the NQ that well.) The idea is to find a range that occurs often enough, and holds long enough, that when the market hits it, you can typically get in, make your prop bets, and get out with a few winners.
So not only are you studying the data for a significant low and high range, but also ones that typically hold for a long enough period of time. Does this make sense?
Makes perfect sense as always! I'm coming across an output problem on the 1 tick level though. My bid/ask volume and price are printing the same value. Any idea why this would be?
Are you running this in market replay or back-testing? Market Replay is what you want.
When you first load the data up, it may run "historical data" for a period of time. This is likely what you are seeing. I have observed the "historical data" being off like this. Try dropping this at the top of your OnBarUpdate event, and this should kill all the historical data that may be printing.
For a number of reasons (I won't get into them here), you should never use back-testing for anything. Only go with market replay. Even though I have my issues with the accuracy of market replay (another story I won't go into here), it is still a night and day difference in accuracy compared to back-testing.
Let me know how it turns out after you work your analysis and implement it.
I just printed the close/open/high/low/bid/ask for the same day using 1 tick data in the NT8 Strategy Analyzer and in Playback to compare. They match perfectly for close/open/high/low, but as I'm sure you know (and I'm just learning), the Level 2-dependent columns like bid volume aren't correct except on Playback. This should mean I can at least use the backtester to determine the average range over x ticks using the high/low. The advantage of backtesting is faster results over more data. Once I use this data to narrow down a PT, SL, and time frame, I will do all my actual strategy testing on Playback for accuracy. Does that sound right to you, or are there other problems with Strategy Analyzer that I'm not aware of for this use?
This approach sounds fine. The strategy analyzer will be faster but way less accurate when running entries and exits. If you are just using this for printing then you lose some of the fields as you noted, but otherwise it should work fine.
One of the key issues with the Strategy Analyzer (as you have just discovered) is that it does not use bid/ask volumes. It doesn't read them or use them to determine the fills for entries or exits. So your limit-order fills become a binary outcome: either overly optimistic or overly pessimistic. I haven't messed with SA in a very long time, so I couldn't tell you for sure which scenario it is, but if you just look at your fill price you can observe the following.
1. If you got filled at the same price level you submitted for your entry (your submitted bid or ask equals your fill price), then the model is overly optimistic; this only occurs in the real market 50% of the time or less (usually way less).
2. If you got filled 1 tick past your entry bid or ask, implying that the price needed to pass through you to fill you, then the model is overly pessimistic, as you will be down 1 tick on every entry. Again, this does happen in the real market, and honestly quite a bit more than the first scenario if you don't know what you are doing, but to say it occurs 100% of the time would be overly pessimistic.
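To see how much the two fill models diverge, you can shift every trade by one tick and compare totals. A Python sketch with made-up trade outcomes:

```python
# Under the pessimistic fill model every limit entry starts one tick
# worse than under the optimistic model. Numbers are illustrative only.
trade_results_ticks = [4, -2, 3, -6, 5]  # hypothetical per-trade results

optimistic = sum(trade_results_ticks)                  # filled at your price
pessimistic = sum(t - 1 for t in trade_results_ticks)  # price traded through you

print(optimistic)   # 4
print(pessimistic)  # -1: same trades, and the sign of the result flips
```

A systematic one-tick bias per trade is easily the difference between a "profitable" backtest and a losing one, which is why it matters which model the Strategy Analyzer uses.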
I just forget which scenario they use... Probably 1 (overly optimistic) because I recall everyone complaining about how they crush it on the SA but get killed in MR. This is likely why.