GSB Forums


Futures and forex trading contains substantial risk and is not for every investor. An investor could
potentially lose all or more than the initial investment. Risk capital is money that can be lost without
jeopardizing one's financial security or lifestyle. Only risk capital should be used for trading and only
those with sufficient risk capital should consider trading. Past performance is not necessarily indicative of
future results.
Author: Subject: Double-Blind backtesting- Feedback wanted
boosted
Junior Member
**




Posts: 73
Registered: 16-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 25-6-2017 at 07:20 PM
Double-Blind backtesting- Feedback wanted


I've been reading this blog http://jonathankinlay.com/2014/11/building-systematic-strate... for more than a year, and came across an example of a double-blind OOS (out-of-sample) backtesting methodology.

I put the question to Peter, and he recommended I post it here so other forum members can chime in with substantive feedback on this methodology and how it could be implemented in GSB.

Also, the writer vaguely references a new method (circa 2014) he created to develop strategies more quickly, using less data and producing more robust strategies. If you look at his TradeStation reports at the bottom, you can see he is using range bars of some sort, which he may pre-process before feeding them to the GP algorithm.

Your feedback is welcome, as are any ideas about what this "newer" way of producing strategies might be, what data is being used, and how it is processed.

Here is a snippet of the blog discussing some of the points I mentioned above.



Advances
Firstly, we have evolved methods for transforming original data series that enables us to avoid over-using the same old data-sets and, more importantly, allows new patterns to be revealed in the underlying market structure. This effectively eliminates the data mining bias that has plagued the GP approach. At the same time, because our process produces a stronger signal relative to the background noise, we consume far less data – typically no more than a couple of years worth.

Secondly, we have found we can enhance the robustness of prototype strategies by using double-blind testing: i.e. data sets on which the performance of the model remains unknown to the machine, or the researcher, prior to the final model selection.

Finally, we are able to test not only the alpha signal, but also multiple variations of the trade expression, including different types of entry and exit logic, as well as profit targets and stop loss constraints.

OUTCOMES: ROBUST, PROFITABLE STRATEGIES


Taken together, these measures enable our GP system to produce strategies that not only have very high performance characteristics, but are also extremely robust. So, for example, having constructed a model using data only from the continuing bull market in equities in 2012 and 2013, the system is nonetheless capable of producing strategies that perform extremely well when tested out of sample over the highly volatile bear market conditions of 2008/09.

So stable are the results produced by many of the strategies, and so well risk-controlled, that it is possible to deploy leveraged money-managed techniques, such as Vince’s fixed fractional approach. Money management schemes take advantage of the high level of consistency in performance to increase the capital allocation to the strategy in a way that boosts returns without incurring a high risk of catastrophic loss. You can judge the benefits of applying these kinds of techniques in some of the strategies we have developed in equity, fixed income, commodity and energy futures which are described below.

CONCLUSION

After 20-30 years of incubation, the Genetic Programming approach to strategy research and development has come of age. It is now entirely feasible to develop trading systems that far outperform the overwhelming majority of strategies produced by human researchers, in a fraction of the time and for a fraction of the cost.

Please visit the link to read the whole article and see more samples.

SAMPLE GP SYSTEMS




emini1.png - 361kB


admin
Super Administrator
*********




Posts: 5069
Registered: 7-4-2017
Member Is Offline

Mood: No Mood

[*] posted on 25-6-2017 at 08:13 PM


I am interested to see what can be learnt from this post, and my assumption is that it's legitimate.
What I do find hard to believe is the short-side profit factor of 123.70, with 92% wins, 1028 trades, a largest loss of $92 and an average win of $284.
Normally I would dismiss this as hype, but it seems to come from a credible source.
Also, range bars are not going to work with alternative data streams, which I consider very important.
The GSB forum has a lot of experienced traders and system builders, and collectively we can do more than we can as individuals.
Time to hear from the GSB community.
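
As a rough arithmetic check on the quoted short-side figures, the sketch below works out the average loss that the profit factor implies. It assumes that the 92% win rate, the 1028 trades, the $284 average win and the 123.70 profit factor all describe the same set of short trades, which the post does not state outright.

```python
# Sanity check of the quoted report figures.
# Assumption: all numbers describe the same 1028 short trades.
trades = 1028
win_rate = 0.92
avg_win = 284.0          # dollars
profit_factor = 123.70   # gross profit / gross loss
largest_loss = 92.0      # dollars

wins = round(trades * win_rate)
losses = trades - wins
gross_profit = wins * avg_win
gross_loss = gross_profit / profit_factor
implied_avg_loss = gross_loss / losses

print(f"wins={wins}, losses={losses}")
print(f"implied average loss = ${implied_avg_loss:.2f}")
print("consistent with largest loss:", implied_avg_loss <= largest_loss)
```

Under those assumptions the implied average loss is roughly $26, which is below the reported largest loss of $92, so the figures are at least internally consistent. That says nothing about whether the results are achievable in live trading.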


boosted
Junior Member
**




Posts: 73
Registered: 16-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 11:28 AM


For the past year I've also been reading this trader/developer's blog about systematic trading. This article on his blog (http://www.financial-hacker.com/better-tests-with-oversampli...) sounded like it might describe a method that is being used to improve robustness.

I'm not an engineer or math whiz, so I don't understand the math behind the idea, but I get the general idea of what he's doing. I think this is at the very least something to look at.


rws
Member
***




Posts: 114
Registered: 12-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 02:31 PM


This article is by JCL of Zorro in Germany. They have a group designing games, but their second hobby is designing an alternative platform to MetaTrader.
They specialize in forex and also sell some of their systems, with mixed results. They have a genetic code generator that does something with price-based machine learning. Their newsgroup has a lot of followers, and some have had luck with the systems, but not too many.
They integrated the machine-learning/statistical language R into their code about one year ago. Their language is many times faster than MetaTrader for backtesting, but not as user-friendly or complete as AmiBroker, which is probably even faster.


Quote: Originally posted by boosted  
For the past year I've also been reading this trader/developer's blog about systematic trading. This article on his blog (http://www.financial-hacker.com/better-tests-with-oversampli...) sounded like it might describe a method that is being used to improve robustness.

I'm not an engineer or math whiz, so I don't understand the math behind the idea, but I get the general idea of what he's doing. I think this is at the very least something to look at.


boosted
Junior Member
**




Posts: 73
Registered: 16-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 05:01 PM


Here is the link to J Kinlay's article about double-blind backtesting. http://jonathankinlay.com/2014/06/how-not-to-develop-gp-stra...

admin
Super Administrator
*********




Posts: 5069
Registered: 7-4-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 05:27 PM


I think that's a sound article. The system at the end with the nice equity curve has a big, and possibly nasty, stop and a small profit target. The equity curve gives this away.
I have had comments from a few GSB users that GSB can do this well too, though I haven't yet done it myself.
One comment on the final unseen data period: ES is at record-low volatility, so if results are poor in the last year (I would expect so), it is likely due to market conditions and may not mean that the system has failed.


boosted
Junior Member
**




Posts: 73
Registered: 16-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 05:46 PM


So if I were to create a double-blind in GSB I would assume I do it as described below:

1) Create the IS data series set (e.g. ES 30m, 2002-2012)
2) Test @ Beginning = True
3) Training% = 80, Test% = 20, Validation% = 0 (Test% is the first OOS data sample, according to the article)
4) Run GSB
5) Take the TS script from the GSB run and place it on a chart with data starting 2013-2015+; this is the second OOS period
6) Look at the 2013-2015+ OOS results and determine whether there is any merit in further testing
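
The data partition implied by the steps above can be sketched as follows. This is only an illustration of the three-way split (first OOS sample at the beginning, training data after it, a later unseen holdout), not GSB's internal logic; the function name `double_blind_split`, its parameters and the toy yearly bars are hypothetical.

```python
from datetime import date

def double_blind_split(bars, test_pct=20, holdout_start=date(2013, 1, 1)):
    """Split date-sorted bars into (oos1, training, oos2).

    bars: list of (date, value) tuples sorted by date.
    Bars on or after holdout_start form the second, unseen OOS set.
    Of the remaining bars, the first test_pct percent are OOS #1
    (the "Test @ Beginning = True" case) and the rest are training data.
    """
    in_sample = [b for b in bars if b[0] < holdout_start]
    oos2 = [b for b in bars if b[0] >= holdout_start]
    cut = len(in_sample) * test_pct // 100
    return in_sample[:cut], in_sample[cut:], oos2

# Toy data (hypothetical): one bar per year, 2002-2015.
bars = [(date(y, 1, 2), float(y)) for y in range(2002, 2016)]
oos1, training, oos2 = double_blind_split(bars)

print("OOS #1:  ", [b[0].year for b in oos1])
print("Training:", [b[0].year for b in training])
print("OOS #2:  ", [b[0].year for b in oos2])
```

The point of keeping `oos2` in a separate object is that neither the search nor the researcher touches it until a final model has been chosen.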

Is this how you would suggest doing the double-blind OOS test as he described it in the article?

How would you perform the WFA, and which period(s) (IS, OOS #1, OOS #2) would you use to do a proper, thorough WFA that is robust?

Would you run a WFA in TS and then take that and run it through EWFO or some other method?

A step-by-step description of which IS and OOS periods are used for which part of the WFA process would be very helpful.


admin
Super Administrator
*********




Posts: 5069
Registered: 7-4-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 07:34 PM


My comments on your points.
2) I normally use false, but true is still valid. I'm more interested in WF in the last few years than many years ago. If it's true, the IS period will be recent data.
I don't prefer that.
3) I like training of about 40%. Kevin Davies uses even less.
4) The WF is very important. Use anchored, which sadly in build 26.x is actually rolling. I am running 27.7 now and it's better than 26.x. The WF inversion is fixed,
and if you want the build I can supply it. I'm expecting the release of 27.8 in the next few days. We know 27.7 is not 100% yet.
Run enough WF iterations that the results are the same each time. I like to do both genetic and random search space.
Random needs more iterations than genetic. The default for # random space tests is 2000, which is way too low. Do 10k as a minimum.
You can skip the put-into-TS part if you use validation % in GSB, but there is too much temptation to cheat and look at it in GSB.
If you like the results in TS, then check the code for issues, e.g. indicator1*w1*indicator2*w2 simplifies to indicator1*indicator2*(w1*w2).
You also need to look for redundant logic, e.g. if WF gives you a weight of zero on an indicator, you should remove the indicator.
A lot of this is written in the gsb1sysES documentation, though it was much more painful to do as GSB was less powerful then.
There is less need to use EWFO over time. I still like to use it if I'm going to trade a system live.
EWFO has to be used if you've added new code into the system that isn't in GSB.
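
The two code checks mentioned above can be illustrated with a small sketch. The function names and numbers are hypothetical, and the additive form used for the zero-weight case is an assumption; the post does not say how weights enter the generated code.

```python
import math

def original(ind1, ind2, w1, w2):
    # Two separate weights multiplied into one product.
    return ind1 * w1 * ind2 * w2

def simplified(ind1, ind2, w):
    # Same expression with the weights merged into a single parameter.
    return ind1 * ind2 * w

ind1, ind2, w1, w2 = 1.5, -0.8, 2.0, 0.25
merged_ok = math.isclose(original(ind1, ind2, w1, w2),
                         simplified(ind1, ind2, w1 * w2))
print("weights can be merged:", merged_ok)

def weighted_sum(a, b, wa, wb):
    # Assumed additive form: a zero weight makes its indicator redundant.
    return a * wa + b * wb

# With wb == 0 the value of indicator b no longer matters.
redundant = weighted_sum(3.0, 7.0, 0.5, 0.0) == weighted_sum(3.0, -99.0, 0.5, 0.0)
print("indicator b is redundant:", redundant)
```

Merging `w1` and `w2` leaves one free parameter instead of two, so the walk-forward optimizer has less room to fit noise.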


boosted
Junior Member
**




Posts: 73
Registered: 16-6-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 08:57 PM


If I understand you correctly, to mimic exactly what J Kinlay did with double-blind testing, I click True for Test @ Beginning and avoid Validation % since that would be essentially "cheating".

He said he used 20% OOS prior to the IS training period, then tested the remainder of the OOS price series in TS.

If I did the testing exactly as he outlined, would I just do WF on the TS OOS portion, then put it into EWFO for further WFA? Or would I need to WF on the first 20% OOS data and then separately WFA the TS OOS sample period?

I didn't see the PROM (Pessimistic Return On Margin) fitness criterion in GSB or EWFO. Is it part of GSB and EWFO, or did I miss it somewhere?

The formula is: PROM = {[AW × (#WT − Sq(#WT))] − [AL × (#LT + Sq(#LT))]} / Margin = (AAGP − AAGL) / Margin

where Sq() is the square root and:

#WT = Number of Wins
AW = Average Win
#LT = Number of Losses
AL = Average Loss
A#WT = Adjusted Number of Wins = #WT − Sq(#WT)
A#LT = Adjusted Number of Losses = #LT + Sq(#LT)
AAGP = Adjusted Annualized Gross Profit = A#WT × AW
AAGL = Adjusted Annualized Gross Loss = A#LT × AL
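
A minimal sketch of PROM, following the A#WT and A#LT definitions listed above (wins reduced by their square root, losses increased by theirs). The function name `prom` and the sample numbers are hypothetical; the average loss is passed as a positive magnitude, and the trade counts are assumed to be per year so that the result is annualized.

```python
import math

def prom(num_wins, avg_win, num_losses, avg_loss, margin):
    """Pessimistic Return On Margin.

    avg_loss is a positive dollar amount. Trade counts are assumed to be
    annual figures so that the gross profit and loss are annualized.
    """
    adj_wins = num_wins - math.sqrt(num_wins)        # A#WT
    adj_losses = num_losses + math.sqrt(num_losses)  # A#LT
    aagp = adj_wins * avg_win                        # adjusted gross profit
    aagl = adj_losses * avg_loss                     # adjusted gross loss
    return (aagp - aagl) / margin

# Hypothetical example: 100 wins of $200, 64 losses of $150, $10,000 margin.
example = prom(num_wins=100, avg_win=200.0,
               num_losses=64, avg_loss=150.0, margin=10_000.0)
print(f"PROM = {example:.2f}")
```

In this example the unadjusted return on margin would be (20,000 − 9,600) / 10,000 = 1.04, while the pessimistic adjustment lowers it to 0.72; the penalty is proportionally larger for systems with few trades.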

I have about 4 days left on the GSB demo. The current beta version I am using is broken (TS results are not the same as GSB and, as you said, WF is inverted), so if you believe 27.7 is improved enough to get matching TS and GSB results and tests, I would like to make the most of my remaining demo time with the newer 27.7 beta.


admin
Super Administrator
*********




Posts: 5069
Registered: 7-4-2017
Member Is Offline

Mood: No Mood

[*] posted on 27-6-2017 at 09:06 PM


PROM isn't in either, but there are many other fitness types and combinations that can be used. Fitness type should never make or break a WF test, but from lots of experience I know that some work better than others, and there is no one universally good fitness.
Use a few ticks of commission in fitness.
In rough order, I like these:
np * at
np*pf
np
nd/dd
np/(5 biggest dd)
vbase(m) are my favorites
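
As a rough illustration of how some of the fitness measures listed above could be computed from a list of per-trade results, here is a sketch. The expansions of the abbreviations (np = net profit, at = average trade, pf = profit factor, dd = maximum drawdown) are assumptions, the ratio of net profit to maximum drawdown is included only as one plausible drawdown-based measure, and none of this is GSB's actual code. The `commission` parameter is a hypothetical per-trade cost in dollars.

```python
def fitness_metrics(pnls, commission=0.0):
    """Compute a few fitness measures from per-trade P&L in dollars."""
    net = [p - commission for p in pnls]
    net_profit = sum(net)
    avg_trade = net_profit / len(net)
    gross_profit = sum(p for p in net if p > 0)
    gross_loss = -sum(p for p in net if p < 0)
    profit_factor = gross_profit / gross_loss if gross_loss else float("inf")

    # Maximum drawdown measured on the closed-trade equity curve.
    equity = peak = max_dd = 0.0
    for p in net:
        equity += p
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)

    return {
        "np": net_profit,
        "np*at": net_profit * avg_trade,
        "np*pf": net_profit * profit_factor,
        "np/dd": net_profit / max_dd if max_dd else float("inf"),
    }

# Hypothetical trade list.
metrics = fitness_metrics([100.0, -50.0, 200.0, -25.0, 75.0])
for name, value in metrics.items():
    print(f"{name:6s} {value:.1f}")
```

Passing a non-zero `commission` lowers every measure, which penalizes systems that rely on many small trades.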

I sent you a URL to the newest GSB. I'm happy with it but am certain not all the bugs are fixed. The programmer was not happy that I released it, as there are still known issues with TS<>GSB. At this stage the hardest parts of the issue have been done.


parrdo101
Junior Member
**




Posts: 71
Registered: 18-11-2017
Member Is Offline

Mood: No Mood

[*] posted on 5-12-2017 at 10:31 AM


Quote: Originally posted by admin  

3) I like training of about 40%. Kevin Davies uses even less.


What are his percentages?



Trademaid forum. Software tools for TradeStation, MultiCharts & NinjaTrader