Looks like fun, keep us posted. If you are using external data, you might want to try the libSVM library, as it is MUCH faster than the ACCORD version. ACCORD is nice because I can easily integrate it into NinjaTrader, but for research libSVM could save you days. I had one run on ACCORD that took 36 hours, while a similar (though not identical) test on libSVM took only 5 hours.
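For anyone wanting to try libSVM standalone before wiring it into anything, the command-line workflow is minimal (file names here are hypothetical; -t 2 selects the RBF kernel, -c and -g set the cost and gamma parameters):

[code]
# train on features.train (libSVM's sparse "label index:value" format),
# then classify features.test against the resulting model
svm-train -t 2 -c 1 -g 0.5 features.train features.model
svm-predict features.test features.model predictions.txt
[/code]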
Also, for sequencing, I have always thought HMM might work better. I don't have any experience with that part of the library yet, but it should be good for estimating the probability of an outcome based upon the preceding sequence of events.
Roger HMM for sequences--will likely have another look, for now via Accord.Net. I reviewed the topic briefly some time ago for the reason you suggest, but took another path at the time. Happily I'm not a poet (i.e., Robert Frost), so I have no problem retracing my steps to go down the Road Not Taken.
Regarding the length of time the Accord.Net SVM engine takes to process a training suite: I'm not sure what the issue is, but coding efficiency aside, notice that the learning part of the code, at least, appears to be single threaded. Not having studied the source I don't know whether it lends itself to multithreading or some other form of distributed processing (e.g., CUDA), but it may be that you and [MENTION=1]Big Mike[/MENTION] have given it some thought.
Edited to add: Visual Studio just advised that one of the reference DLLs in the Accord.Net HMM sample app (might have been Controls) requires a .Net version > 3.5, as specified in the project properties. If so, this might have implications for NT, which may still depend on 3.5.
Edited to add: Using the Accord.Net framework from the CodeProject site, the MarkovSequenceClassifier referred to in the sample HMM app, which probably ought to be in the Accord.Statistics.Models.Markov namespace, is not (or is no longer) present in the DLL (or, apparently, anywhere else in the framework). Not sure what to make of that at the moment.
In other news, it occurs to me that another reason HMM might be appropriate is its ability to handle sequences of different lengths during classification, something one has to deal with when using Zig Zag to generate sequences.
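For the record, here is roughly what that would look like in code. This is a minimal sketch assuming HiddenMarkovClassifier is the current incarnation of the missing MarkovSequenceClassifier; class and method names should be verified against whatever version of the DLL is installed:

[code]
// C#: 2-class discrete HMM classifier over sequences of varying length.
// All Accord.Net names below are from memory and unverified.
using Accord.Statistics.Models.Markov;
using Accord.Statistics.Models.Markov.Learning;

static class HmmSketch
{
    public static int TrainAndClassify()
    {
        // quantized swing sequences; note the lengths differ
        int[][] sequences =
        {
            new[] { 0, 1, 1, 2 },     // preceded an up move   (label 0)
            new[] { 2, 1, 0 },        // preceded a down move  (label 1)
            new[] { 0, 2, 2, 1, 0 },  // preceded an up move   (label 0)
        };
        int[] labels = { 0, 1, 0 };

        // 2 classes, one ergodic HMM each: 3 hidden states, 3-symbol alphabet
        var classifier = new HiddenMarkovClassifier(2, 3, 3);

        // Baum-Welch training of each class model
        var teacher = new HiddenMarkovClassifierLearning(classifier,
            i => new BaumWelchLearning(classifier.Models[i]) { Tolerance = 0.001 });
        teacher.Run(sequences, labels);

        // classification accepts a sequence of any length
        return classifier.Compute(new[] { 0, 1, 2 });
    }
}
[/code]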
Should add a note on machine learning feature filters for future reference, to wit: while Zig Zag provides a decent filter for sequence generation, a cycle oscillator (e.g., stochastics, or MACD in a pinch) is another approach, one that exploits the fact that an oscillator may be better tuned to a given trading strategy and time frame (assuming one uses an oscillator at all).
In other words, the stock Zig Zag logic will give us swings based on a fixed % price change or point spread, whereas an oscillator may give us swings more consistent with the swings a strategy will tolerate. The fact that Zig Zag determines pivots in hindsight, long after the moment has passed, is no more an issue than the fact that cycle oscillators are lagging indicators, since we're analyzing a chart to extract features, not trading. [In fact, when trading I pay attention only to oscillator divergences, which I (and many others) consider leading indicators, for setups, and to the slope of the higher time frame momentum oscillator for entries.]
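To make the feature-extraction discussion concrete, here is a bare-bones sketch of the fixed-threshold Zig Zag idea applied to closes (plain C#, not the stock indicator; the hindsight property shows up in the code as pivots only being confirmed after a sufficient retrace):

[code]
// C#: minimal fixed-threshold Zig Zag over close prices. Consecutive
// confirmed pivots define the (variable length) swing sequences that
// would feed the classifier.
using System;
using System.Collections.Generic;

static class ZigZagSketch
{
    public static List<int> Pivots(double[] close, double minSwing)
    {
        var pivots = new List<int>();
        int extreme = 0;     // candidate pivot since the last confirmed one
        bool rising = true;  // initial direction is a guess; real code resolves it

        for (int i = 1; i < close.Length; i++)
        {
            if (rising ? close[i] >= close[extreme] : close[i] <= close[extreme])
                extreme = i;  // leg extends: candidate pivot moves forward
            else if (Math.Abs(close[i] - close[extreme]) >= minSwing)
            {
                pivots.Add(extreme);  // retrace exceeded threshold: pivot confirmed
                rising = !rising;     // start the opposite leg
                extreme = i;
            }
        }
        return pivots;
    }
}
[/code]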
The 3 charts (1800-, 600- and 200-tick EUR.JPY) in the figure below illustrate the difference. The blue box in each chart contains what these days I would consider one long trade, since it spans the period in which both Stochastics %D was above 50 and MACD was above zero on the 1800-tick chart (left hand side of the figure).
In each chart the Zig Zag indicator with a 0.1 point minimum swing is applied to the close price (shown in cyan). A modified Zig Zag indicator, also set to a 0.1 minimum swing but which picks the high of the bar for swing highs and the low of the bar for swing lows, is shown in yellow.
Considerable difference can be seen between the behaviour of the two Zig Zag indicators and the behaviour of the oscillators in the bottom panels of each chart (MACD(5,20,30) and Stochastics(3,5,2)) during the "trade". In the case of stochastics this difference is due in large part to the fact that there is, in general, no reliable correlation between the size of a price movement and the size of, e.g., the %D movement; yet IMO stochastics gives a better idea of what traders are up to at any moment than eyeballing price pivots alone in hindsight.
The point is that the choice of filter method & parameters greatly influences the nature of the feature vectors extracted.
Having said that, the following issue arises. If one is going to consider extracting feature vectors using a (dynamic) oscillator swing filter instead of a (kinematic) Zig Zag filter, why not consider filtering by the entire trading methodology? I don't mean trying to filter based on setups predicted by the trading methodology, which (being the equivalent of a bot) would be very hard if not impossible to implement, but rather extracting feature sets filtered by actual trading outcomes, somewhat the way Monte Carlo analyses are applied. In other words, why not capture the features or sequences that led to each trade as the input data set, using the outcome as the classification? This approach essentially uses the state of our trading brain at that moment as the filter, presumably comprising everything we know about our trading method (distracted as it might be by emotion and fatigue)--far more than we could easily code. This way we are working toward encapsulating our trading method in a bot, and perhaps even improving it, as opposed to writing a bot that (especially in the case of Hidden Markov Models) applies criteria we know not what.
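As a crude illustration of what I mean, the training set would be built straight from the trade log (everything here is hypothetical scaffolding, not working strategy code):

[code]
// C#: outcome-labelled training set built from the trade log itself --
// the trader's realized result is the classification label.
using System.Collections.Generic;

class TradeRecord
{
    public double[] Features;  // whatever preceded the entry: bars, oscillator values...
    public double ProfitLoss;  // realized outcome of the trade
}

static class OutcomeFilter
{
    public static void Build(IEnumerable<TradeRecord> log,
        List<double[]> inputs, List<int> outputs)
    {
        foreach (var trade in log)
        {
            inputs.Add(trade.Features);
            outputs.Add(trade.ProfitLoss > 0 ? 1 : 0);  // outcome as label
        }
    }
}
[/code]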
Given that geeks like me (like one of a million monkeys coding) will some day accidentally extract the complete works of Shakespeare as a feature set, I have little doubt this approach has been taken by others, and if so I'm curious what conclusions were drawn.
Making this post as a penance. These days I may make a half dozen relatively boring trades a day, pack it in around noon local time and get on with my life. Over time it's easy to slip into a rut and lose focus--kind of like dozing off at the wheel when driving--and when it happens we have to take it seriously. In this case it was not so bad, the worst moment down only $100 or so, but a mistake is a mistake and any mistake can end up costing real money if not nipped in the bud.
Moreover this error is in a category I thought I had mastered a long time ago, but apparently not: entering trades impetuously, without thinking them through.
Finally, while there was a time I would congratulate myself on having managed to turn a bad entry around, I've learned since then that the market sometimes overlooks mistakes and even rewards bad behaviour occasionally, and when it does we should be very afraid; it's one way the market sets us up to clean us out.
Sequence of events
Last trade of the day around 11:00 EST, trading since the London open, tired but happy after a profitable session--turns out not so much happy as over-confident and greedy.
The following sequence of images tries to capture the right hand side of the charts I trade from as I entered the last trade of the day (short 1 contract USD.JPY), the point at which I realized the mistake I'd made, and how the market let me weasel my way out of it.
What I saw: On the 200- and 600-tick charts, price in free fall above an abyss devoid of visible support. What I should have seen: The bear climax bar forming on the 1800-tick chart. The more rapid the price movement, the greater the vacuum, with both bulls and bears standing aside until it has run its course before going long and closing shorts, respectively.
What I did (next 2 images): Sell at the lowest possible price, on impulse. What I should have done: Whether I noticed the bear climax bar on the 1800-tick chart or not, I should have remembered:
1. There is never an abyss. Just because support is not marked by a line on the chart does not mean it isn't there--like the magic number 50 that I sold into.
2. One never trades a breakout/spike/gap or any rapidly moving price action without having anticipated it prior to the movement and planned both for it and for its reverse.
3. When one hasn't anticipated a movement, one waits to see how it plays out. In this case, probably 90% of the time one can assume a retracement is imminent, at which point all will be revealed.
4. Worst case, waiting means a missed opportunity, which is a far better outcome than foolishly taking the trade and risking the worst case of being in the market on the wrong end of it--losing a ton of money.
What happened next: Price action plays out, indicators catch up, divergences scream reversal. Fight or flight kicks in and the mind finally focuses: bail with a relatively small loss now, or double up at high probability resistance and risk a full boat loss later.
What I did: Fight--double up at R1. While I put the short order in place ahead of time, I was prepared to yank it. It was now a calculated risk, one that price action and indicators suggested was roughly 50/50, so I went with the Hail Mary.
What a full blown reversal looks like: Exited with a small profit when price rebounded off R1. As a newbie, when price gave me a chance to bail I'd have assumed instead that I'd been right after all--being right was important back then--and I would have stayed in the trade to see how much money I could make. Now I get out the instant the getting is good.
For anyone anticipating the release of the code promised in an earlier post, or more thoughts on machine learning: it's in process. The code is ready but I'm still collecting my thoughts. The main issue may be whether to wait and publish together the final analysis of Accord.Net behaviour (and plans for mods to the Accord.Net library), first impressions of the libSVM library, and plans for more classification experiments. Hidden Markov Models are on a back burner.
The slow turnaround is influenced by the main stumbling block, which remains my obsession with coding my discretionary short time frame method, which has proven itself and ought to be coded. I've done more experimental coding, but so far the essence--the brain-computer connection--continues to elude me.
Funny how the bleeding edge stuff that is not profitable but is easy to talk about comes naturally, while the old school stuff--what puts turkey on the table--is so hard. I suspect the bleeding edge stuff is just stats revisited--a new, cool look and feel--much as new, cool computer languages merely wrap the same old machine code in Gen X/Y/Z think.
In my experience, and despite my best efforts, so far computers can't think.
When I was younger someone proclaimed that the Interwebz had ushered in (the equivalent of) SkyNet (non-Terminator fans may have to look it up)--computers from microcontrollers to Big Blue and the Dorval & Tokyo Crays all talking to each other. So far, no sign of that.
Trying hard to get a strategy built in PowerLanguage .Net (MC's C# extension) by market open this AM--my first MC .Net strategy.
First observations:
1. If NT C# is from Venus, MC C# is from Mars.
2. On the complete lack of documentation (by "complete lack of documentation" I mean the PowerLanguage .Net "Help" file and the IDE, which are devoid of examples, explanations, Intellisense links to function definitions, or even comments in the sample code), I'm reminded of a favourite phrase of a programmer I used to manage (one in a stable of programmers--programmers being very much like thoroughbred horses, just much less likely to pull a muscle): "If it was hard to write, it should be hard to understand."
This post deals mainly with recent work on SVM classification. An updated MC indicator (TDEMA) is attached, for reasons mentioned at the end of the post.
SVM Testing
Following @NJAMC's lead I've been experimenting with Support Vector Machines (SVM) for pattern classification, off and on, for almost 2 weeks--more to gain experience with the technology than to apply it to trading. So far the one issue that stands out, with the Accord .Net implementation at least (as NJAMC points out), is that training time appears to grow exponentially with the size of the training set, as suggested by the results below for the generic 2-bar pattern recognizer mentioned in a previous post.
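For anyone following along, the core of the training code, after César Souza's handwriting sample, boils down to something like the sketch below (API as I remember the version I used; later versions may differ, and the kernel and parameter values are illustrative only):

[code]
// C#: skeleton of Accord .Net multiclass SVM training, per the handwriting
// sample this work is derived from. Names unverified against newer versions.
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

static class SvmTrainingSketch
{
    // inputs: e.g. 1024-element bitmap vectors; outputs: 0 = bear bar
    // follows the pattern, 1 = bull bar follows
    public static int Train(double[][] inputs, int[] outputs)
    {
        // one-vs-one machines over a polynomial kernel, 2 classes
        var machine = new MulticlassSupportVectorMachine(1024, new Polynomial(2), 2);

        var teacher = new MulticlassSupportVectorLearning(machine, inputs, outputs)
        {
            // SMO trains each pairwise sub-machine
            Algorithm = (svm, x, y, i, j) =>
                new SequentialMinimalOptimization(svm, x, y) { Complexity = 1.0 }
        };

        double error = teacher.Run();       // the long, apparently single-threaded part
        return machine.Compute(inputs[0]);  // classify one pattern as a smoke test
    }
}
[/code]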
2 Bar Pattern Classifier Results
The test procedure was as follows:
Step 1. An MC indicator ("TD2BarrPatternPredictor2" [sic], attached in the MC .PLA and as a text file) was written to dump a chart to a data file as date/time/bar#-stamped 2-bar patterns in OHLC format, normalized to the maximum high price and minimum low price of the 2 bars in the pattern.
In the example shown here a 5-minute chart for EUR.USD was dumped to a data file comprising approximately 28,000 pairs. The selected 2-bar pattern and "predicted bar" in the following figure ...
... becomes the following line in the dump file (ignoring the header row):
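For reference, the normalization applied to each pair amounts to something like this sketch (plain C#, outside the MC framework; the attached indicator is the authoritative version):

[code]
// C#: scale the 8 OHLC values of a 2-bar pattern into [0,1] using the
// pattern's own max high / min low, as the dump indicator does.
static class PairScaler
{
    public static double[] Normalize(double[] ohlc)  // O1,H1,L1,C1,O2,H2,L2,C2
    {
        double hi = double.MinValue, lo = double.MaxValue;
        foreach (double p in ohlc)
        {
            if (p > hi) hi = p;
            if (p < lo) lo = p;
        }
        double range = hi - lo;
        var scaled = new double[ohlc.Length];
        for (int i = 0; i < ohlc.Length; i++)
            scaled[i] = range > 0 ? (ohlc[i] - lo) / range : 0.5;  // flat-pair guard
        return scaled;
    }
}
[/code]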
Step 2. Open Office Calc (spreadsheet) formulae were written to reconstruct the 2-bar patterns from the data as 32x32 bitmap images (a pattern of ASCII 1's and 0's in UCI's Optdigits Dataset format) as a sanity check (sample spreadsheet "EURUSD.FXCM_240__1_barPatternDataSimple2.ods" for EUR.USD 240 minute 2-bar patterns attached).
Each 32x32 bitmap was divided into 2 halves, 1 half per bar, and "colour coded" as follows:
2.1 uptrend bars (close > open) print as 1's (black) on a background filled with 0's (white);
2.2 downtrend bars print as 0's (white) on a background filled with 1's (black).
The next 2 figures show spreadsheet reconstructions of 2 patterns, the first preceding a bull bar and the second preceding a bear bar. The actual bar patterns (2 bars + "predicted" bar) as they appear on the chart are inset in the upper right of each figure. The 3rd figure below contains a screenshot of the spreadsheet itself (Open Office Calc spreadsheet attached), showing the formulae used to perform the reconstruction.
2 bar pattern followed by bull bar:
2 bar pattern followed by bear bar:
Spreadsheet:
Step 3. The OHLC pair data were then converted to 32x32 bit images using a purpose-built program ("SVMBarPatternClassifier", MS VS 2010 project attached).
For example, the pair mentioned in Step 1 above is coded as follows:
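The conversion logic itself amounts to something like the following sketch (hypothetical names; the attached project is the real implementation):

[code]
// C#: paint one normalized bar into one half of the 32x32 pattern,
// '1' = black and '0' = white per the colour coding above.
static class BarRaster
{
    public static void Paint(char[,] bmp, int colOffset, int width,
        double open, double high, double low, double close)  // values in [0,1]
    {
        bool up = close > open;
        char fg = up ? '1' : '0';  // uptrend bar prints black on white
        char bg = up ? '0' : '1';  // downtrend bar prints white on black

        for (int col = colOffset; col < colOffset + width; col++)
            for (int row = 0; row < 32; row++)
                bmp[row, col] = bg;             // fill this half's background

        int top = (int)((1.0 - high) * 31);     // row 0 = highest price
        int bot = (int)((1.0 - low) * 31);
        int mid = colOffset + width / 2;
        for (int row = top; row <= bot; row++)
            bmp[row, mid] = fg;                 // draw the high-low range
    }
}
[/code]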
2 Bar Pattern Classifier Run Time as a Function of Training Set Size
The run time data listed next is summarized in the plot below:
2 Bar Classifier Run Time vs Training Set Size
Parallel Processing
The Accord .Net based SVM app described here, derived directly from César's, uses fairly lengthy feature vectors (on the order of 1024 samples each) and lends itself to distributed processing, but more work needs to be done to achieve it. Mods made to César Souza's Handwriting SVM program included the following:
1. it was recompiled under .NET 4 (versus 3.5) and renamed;
2. fields were added to the GUI to allow training and test set selection at run time (previously hard coded);
3. the learning algorithm was wrapped with Task.Factory.StartNew() in anticipation of progress monitoring and use of a cancellation token (see the sketch below).
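The shape of that wrapper, with names of my own invention:

[code]
// C#: run the (long, blocking) teacher.Run() call in a Task so the GUI
// stays responsive and the run can eventually be cancelled.
using System.Threading;
using System.Threading.Tasks;
using Accord.MachineLearning.VectorMachines.Learning;

static class TrainingRunner
{
    public static Task<double> Begin(MulticlassSupportVectorLearning teacher,
        CancellationToken token)
    {
        // NB: as used here the token only stops the task before it starts;
        // true mid-run cancellation needs the learning loop itself to poll
        // the token -- hence "in anticipation" above.
        return Task.Factory.StartNew(() => teacher.Run(), token);
    }
}
[/code]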
While the most recent version of the Accord .Net library appears to support .Net 4.0 multithreading via a Parallel.For wrapper around the .Run() method of the MulticlassSupportVectorLearning class, César's code (and hence this code) does not appear to reference that latest version.
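For what it's worth, the idiom in question is just the stock .Net 4 data-parallel loop--something like this (illustrative only, not the library's actual code; subProblems is hypothetical scaffolding):

[code]
// C#: the one-vs-one sub-machines of a multiclass SVM train independently,
// so the obvious parallelization is a Parallel.For over the class pairs.
using System;
using System.Threading.Tasks;

static class ParallelTraining
{
    public static void Run(Action[] subProblems)  // one training job per class pair
    {
        Parallel.For(0, subProblems.Length, i => subProblems[i]());
    }
}
[/code]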
Other approaches to parallel processing that come to mind include:
1. CUDA (for which see the NVIDIA site and threads on nexusfi.com (formerly BMT));
2. MPI.Net (again, see threads on nexusfi.com (formerly BMT)).
My own attitude toward developing a parallel processing framework for applications in trading is mixed, somewhat jaded by 2 isolated experiences. The first, in the early '90s, was implementing the Biham-Kocher plaintext attack on an encrypted ZIP archive across a half dozen UNIX boxes [the algorithm is described in Biham & Kocher's paper "A Known Plaintext Attack on the PKZIP Stream Cipher", still floating around on the Interwebz]. The rationale for that work was that a programmer for a company in a large Canadian city had ZIP-encrypted the company's entire accounting system and high-tailed it to Cuba with the encryption key and his daughter in tow. He was demanding a ransom for the key, and the company was willing to pay something less than the ransom to anyone who could restore the accounting system in the meantime. It took 2 weeks to implement Biham & Kocher's algorithm (C language, Unix operating system) and 2 weeks of computer time to crack the archive and discover the key, which turned out to be a simple alpha string--his daughter's first name concatenated with his, both of which EVERYONE INVOLVED ALREADY KNEW. [Subsequently I offered to give the code to PKWARE to keep it off the street, but when no reply was received I published it on sci.crypt, snippets of it turning up on ZIP crack sites for quite a while afterward.] Take-away from that episode: throwing technology at a problem may not always be the smartest approach.
The 2nd venture involved helping develop an ultra-cheap video-on-demand system aimed at the hospitality industry, its low cost to manufacture (its claim to fame) being IMO mainly a consequence of the design principle: a distributed operating system running on a network of microcontrollers that pooled resources (including compute power) by piggybacking on the available broadband video infrastructure, rather than trying to cram a supercomputer into every set top box. My responsibility was prototyping the hardware and firmware, which was based on a neural model, and then sitting behind a desk, wearing pretty clothes and taking credit for subsequent breakthroughs made by the development team who turned it into a marketable product. The distributed approach allowed for relatively high sales margins and the company prospered and was bought out. What eventually killed the company (and the work) was not so much the increasing availability of high demand content (euphemism for porn) on the Interwebz, which made the walled garden basis of VOD obsolete, but a string of business types who repeatedly bought the company for its cash flow, starved R&D of funding and resold it for the price of its assets. These wheeler-dealers were the sort who figure they can fake their way in business by latching on to a cash cow--or more properly perhaps a "cash goose", since they have little or no understanding of why, when you cook the goose, the golden eggs stop. I see one of these guys was recently arrested on murder charges after fleeing the country--character = destiny, it would seem. Take-away from that one: why freeeeeeking bother.
Conclusions
There are a host of issues affecting the outcome, from choosing to represent the pattern as a bitmap (which is then reduced by the classifier to a 1D array) to choosing to colour the bars and background as done. These issues could be addressed in future work, if there is any. For example, an MC indicator ("TDZigZagPnts2") is attached in the .PLA file and as text that dumps the OHLC data leading up to a significant pivot high or low, in preparation for coding sequences of the sort NJAMC was working with. BTW, what the pivot dump indicator records as an SVM feature vector file need not be OHLC data--it could be any other metric one finds useful for trading.
While a "hit" rate for the trained generic 2-bar pattern classifier of between 50% and 60% may mean something to a trader (if nothing else it might suggest what initial stop/profit ratios have to be to maintain Al Brook's Trader's Equation) in this context I prefer to think we could do just as well and moreover get to the Jack Daniels a whole lot sooner if we flipped a coin.
A larger training set in this case does not appear to promise any better results and run times quickly become prohibitive.
Finally, it may be worthwhile, as a matter of interest, to isolate the "TOP 10" 2-bar reversal patterns referred to in a previous post (the ones that prompted this study) to see if they produce better classification scores than 2-bar patterns in general. I wrote a strat to detect and trade them some time ago, so it's just a matter of dumping the instances detected in a chart to a file and running the numbers.
Other Work
I continued to trade during this time--anywhere from 3 or 4 to 14 pedestrian trades/day in a variety of spot currencies, plus one experimental sojourn into ES on paper last week. In addition I will do a trade in gold miner and/or oil company ETFs if a setup occurs. I seem to have passed the point, a while ago, in discretionary trading of STF spot currencies where the issue is no longer whether I can make money but how much I can make; i.e., what amount of risk I'm willing to bear. I'm still trading 1 or 2 contracts split into 2 to 4 targets. While issues remain, the mindset (especially the nature of confidence) is vastly different between then (as a newbie) and now. Hopefully I can one day capture whatever it is that makes the method profitable in a bot. So far I've proved (over and over), as others have claimed, that profitability is more than a list of rules strictly adhered to.
I love the London session, which for me means going to bed early enough to get up bright eyed and bushy tailed prior to 4 AM local time. Not only are the currency markets generally active, it means 3 or 4 quiet, uninterrupted hours before the rest of the house stirs, and I get to experience the sunrise. Good for the soul.
The danger at this point is no longer seeing setups where they don't exist out of boredom, but allowing one instrument to draw required attention away from another.
I've attached an updated version of the MC indicator TDEMA (trivial to code in any other language--it's simply the 15 EMA colour-coded for slope). The update adds a second, monochromatic EMA separated from the first by 1 ATR, the reason being that I wanted a trailing stop target removed from the 15 EMA itself, which more often than not gets penetrated during insignificant retraces. It appears the 1 ATR offset corresponds more or less to the point in a retrace where the next higher time frame MACD slope turns negative (for a long trade) or positive (for a short trade), the slope of the next higher time frame MACD being what I tend to use to bail from a trade before targets are reached.
The 1 ATR band also serves as a kind of chop indicator, as can be seen in the chart below. I know chop indicators are a dime a dozen and don't work if we rely on them, but I find they can lend a little peace of mind nevertheless. I also know what price action purists think of indicators in general, but bots depend on some sort of price action metrics, and until I code Al Brooks, MAs will have to do.
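For anyone not on MC, the arithmetic is as simple as advertised; a sketch in plain C# (no indicator plumbing, Wilder smoothing assumed for the ATR; the attached .PLA is the real thing):

[code]
// C#: TDEMA arithmetic -- a 15-period EMA whose slope drives the colour
// coding, plus bands offset from it by 1 ATR for trailing-stop reference.
using System;

static class TdemaSketch
{
    public static void Compute(double[] high, double[] low, double[] close,
        double[] ema, double[] upper, double[] lower, bool[] slopeUp,
        int emaPeriod = 15, int atrPeriod = 14)
    {
        double k = 2.0 / (emaPeriod + 1);
        double atr = high[0] - low[0];   // seed the ATR with the first range
        ema[0] = close[0];

        for (int i = 1; i < close.Length; i++)
        {
            ema[i] = close[i] * k + ema[i - 1] * (1 - k);
            slopeUp[i] = ema[i] > ema[i - 1];   // drives the colour coding

            double tr = Math.Max(high[i] - low[i],
                        Math.Max(Math.Abs(high[i] - close[i - 1]),
                                 Math.Abs(low[i] - close[i - 1])));
            atr = (atr * (atrPeriod - 1) + tr) / atrPeriod;  // Wilder smoothing

            upper[i] = ema[i] + atr;  // trailing reference for shorts
            lower[i] = ema[i] - atr;  // trailing reference for longs
        }
    }
}
[/code]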
What was advertised in Step 3 of the previous post as a project to convert scaled OHLC pair values to bitmaps, and posted as "SVMBarPatternClassifier", is in fact the modified version of the bitmap classifier itself referred to in Step 4 of the previous post.
The correct project for Step 3 (the one that converts OHLC pair data to bitmaps) is attached to this post ("SVMBarPatternRecogition").