GSB Forums

Futures and forex trading contains substantial risk and is not for every investor. An investor could
potentially lose all or more than the initial investment. Risk capital is money that can be lost without
jeopardizing one's financial security or lifestyle. Only risk capital should be used for trading and only
those with sufficient risk capital should consider trading. Past performance is not necessarily indicative of
future results.
Author: Subject: Robustness testing
Carl
[*] posted on 16-4-2018 at 05:53 AM
Robustness testing


For me, a trading system is robust if it performs reasonably on:
1. related markets,
2. different bar sizes (30 min changed to 15 and 60 min), and
3. different sessions.

I am happy to report that GSB is capable of generating such systems!

In this system the secondary filter is "CloseLessPrevCloseD".
For NQ I used a "sfentrylevel" value of 43, for ES 12, for YM 180, for EMD 10.
For all tests I used a stop loss of 350 USD, costs of 2.8 per trade and slippage of 1 tick per trade.
I did not optimize any of the other parameter values or settings.

Here are some of the tests I performed with the SAME strategy code and the SAME parameter values. I only changed the value of "sfentrylevel".

NQ 30 0830-1500 + NDX.X 30 0830-1500:
NQ 30 data1 + data2.bmp - 575kB

NQ 15 0830-1500 + NDX.X 15 0830-1500:
NQ 15.bmp - 571kB

NQ 60 0830-1500 + NDX.X 60 0830-1500:
NQ 60.bmp - 574kB

ES 30 0830-1500 + SPX.X 30 0830-1500:
ES 30.bmp - 568kB

YM 30 0830-1500 + INDU 30 0830-1500:
YM 30.bmp - 575kB

EMD 30 0830-1500 + IDX 30 0830-1500:
EMD 30.bmp - 573kB

NQ 30 0830-1500 (changed setting to use only data1):
NQ 30 data1.bmp - 573kB

NQ 30 0800-1600 (no data2, added "setexitonclose" to the code):
NQ 30 0800-1600.bmp - 577kB

Using this strategy on NG and CL also delivered profitable results.
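
The test matrix in this post can be written down as plain data, which makes it easier to rerun the same checks on another strategy. This is only a sketch in Python: the `RobustnessTest` class and its field names are made up for illustration and are not part of GSB or TradeStation. The symbols, bar sizes, sessions and "sfentrylevel" values are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Fixed settings quoted in the post (identical for every test).
STOP_LOSS_USD = 350
COST_PER_TRADE = 2.8
SLIPPAGE_TICKS = 1

# The only input that was changed per market.
SF_ENTRY_LEVEL = {"NQ": 43, "ES": 12, "YM": 180, "EMD": 10}


@dataclass(frozen=True)
class RobustnessTest:  # hypothetical container, not a GSB type
    data1: str
    bar_minutes: int
    session: str
    data2: Optional[str]  # None means the strategy uses data1 only

    @property
    def sf_entry_level(self) -> int:
        return SF_ENTRY_LEVEL[self.data1]


TESTS = [
    RobustnessTest("NQ", 30, "0830-1500", "NDX.X"),
    RobustnessTest("NQ", 15, "0830-1500", "NDX.X"),   # smaller bars
    RobustnessTest("NQ", 60, "0830-1500", "NDX.X"),   # larger bars
    RobustnessTest("ES", 30, "0830-1500", "SPX.X"),   # related markets
    RobustnessTest("YM", 30, "0830-1500", "INDU"),
    RobustnessTest("EMD", 30, "0830-1500", "IDX"),
    RobustnessTest("NQ", 30, "0830-1500", None),      # data1 only
    RobustnessTest("NQ", 30, "0800-1600", None),      # different session
]

for t in TESTS:
    print(t.data1, t.bar_minutes, t.session, t.data2, t.sf_entry_level)
```

Keeping the fixed settings as constants makes it obvious that only one input varies between runs.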


zdata
[*] posted on 16-4-2018 at 06:13 AM


Am I correct in assuming that you used one strategy for all these instruments and time frames but changed the value of sfentrylevel? Which time frame and instrument were used for the initial strategy generation? Did you run walk-forward on this strategy or use EWFO?


Carl
[*] posted on 16-4-2018 at 06:26 AM


Yes, for the initial strategy I used:
data1 NQ 30 0830-1500 (January 3 2000 - December 31 2016)
data2 NDX.X 30 0830-1500 (January 3 2000 - December 31 2016)

So everything after December 31 2016 is real out-of-sample.

I performed a WFO test with 300 generations x 300 population.
The parameter stability was 56 and 52.



rws
[*] posted on 16-4-2018 at 02:25 PM


Excellent Carl!

admin
[*] posted on 16-4-2018 at 07:03 PM


After the great success of GSBsys1NQ, I built some NQ systems. They are much harder than ES.
Using the new Nth feature, I found most NQ systems did poorly out of sample.
However, 19 out of the 20 ES systems I tested had better out-of-sample results on Nth than in sample. (I think that's very unusual.)
Extracted from the update docs, here is the description of Nth:
A much better option than using trading periods for out-of-sample testing is Nth day.
The reason is that some years are just very hard to trade, and you don't know whether a system failed to make money because of poor market conditions or because it is a poor system.
If Nth day is set to 2, then after every 2nd day GSB will not trade on the next day. Once you have built your system, you can set Nth to trade. This will trade only the dates that GSB has not seen. This is an excellent out-of-sample test.
I ended up with a system that trades NQ, ES and YM. I didn't put work into ER or EMD.
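
The Nth day split described in this post can be sketched in a few lines of Python. This is a guess at the logic from the description only, not GSB's actual code. The function name `nth_day_split` is made up, and it assumes the cycle starts on the first day of the data.

```python
from typing import List, Sequence, Tuple


def nth_day_split(days: Sequence[str], nth: int) -> Tuple[List[str], List[str]]:
    """Split trading days into (build, held_out) lists.

    With nth=2 the pattern is: build, build, held out, build, build, held out...
    Assumption: the cycle starts on the first day of the data.
    """
    if nth < 1:
        raise ValueError("nth must be at least 1")
    build, held_out = [], []
    for i, day in enumerate(days):
        # Every (nth + 1)-th day is reserved and never seen while building.
        if i % (nth + 1) == nth:
            held_out.append(day)
        else:
            build.append(day)
    return build, held_out


days = [f"day{i}" for i in range(1, 10)]  # nine placeholder trading days
build, held_out = nth_day_split(days, nth=2)
print(build)     # days used while building the system
print(held_out)  # days that are only traded once Nth is set to trade
```

Under this reading, the held-out share is 1/(nth + 1) of the days, spread over the whole history, so one bad year cannot dominate the out-of-sample result.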




nth-gsbsys3.png - 260kB


zdata
[*] posted on 17-4-2018 at 10:57 AM


Carl
Where is this sfentrylevel setting? I am on 44.27 and don't see it.


Carl
[*] posted on 17-4-2018 at 11:59 AM


Hi zdata,

I changed the sfentrylevel input in TradeStation to fit the market.

GSB: view - advanced
Left window under strategy: parameters


rws
[*] posted on 17-4-2018 at 02:01 PM


As soon as you evaluate the OOS results of a system before trading it, the system is in sample.

A protocol to objectively measure the performance of a system generator (setting) could be to generate xxx systems and show what percentage of the top n systems were successful OOS.
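
A minimal sketch of such a protocol in Python, assuming each generated system is exported as a record of metrics. The field names `is_np` and `oos_np`, the function name and the 10k success threshold are placeholders chosen for the example; the numbers are invented and are not GSB results.

```python
from typing import Callable, Dict, List

System = Dict[str, float]  # hypothetical record: one dict of metrics per system


def top_n_success_rate(
    systems: List[System],
    rank_key: str,
    top_n: int,
    is_success: Callable[[System], bool],
) -> float:
    """Percentage of the top_n systems (ranked by an in-sample metric) that pass OOS."""
    ranked = sorted(systems, key=lambda s: s[rank_key], reverse=True)
    top = ranked[:top_n]
    if not top:
        return 0.0
    passed = sum(1 for s in top if is_success(s))
    return 100.0 * passed / len(top)


# Made-up numbers purely to show the call.
systems = [
    {"is_np": 50000, "oos_np": 12000},
    {"is_np": 45000, "oos_np": -3000},
    {"is_np": 40000, "oos_np": 15000},
    {"is_np": 20000, "oos_np": 11000},
]
rate = top_n_success_rate(
    systems, "is_np", top_n=3, is_success=lambda s: s["oos_np"] > 10000
)
print(f"{rate:.1f}% of the top 3 passed OOS")
```

The important part is that the ranking uses only in-sample metrics and the success test uses only OOS metrics, so the number describes the generator setting and not one hand-picked system.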



admin
[*] posted on 17-4-2018 at 04:13 PM


Quote: Originally posted by rws  
As soon as you evaluate the OOS results of a system before trading it, the system is in sample.

A protocol to objectively measure the performance of a system generator (setting) could be to generate xxx systems and show what percentage of the top n systems were successful OOS.


What you are saying is very important and true.
This is even more dangerous if you're using lots of CPU power and only choose the top of many systems. The Nth feature isn't fully automated, which means you have as many (few, really) combinations as a human can look at (as you need to invert Nth to look at OOS). This is an excellent compromise. Factor in that ES & NG rarely failed for me OOS. NQ rarely passed, but when Nth was used and the OOS was good, the system was excellent, being able to trade at least 2 other markets with excellent metrics. There is still the additional safety net of using, say, 29, 30 and 31 minute bars. Again, this is not needed for the easy markets (but fine to use). Degradation tests of a system built on 30 min bars, when changed to 29 and 31 (or 27 and 33, etc.), should also give a clue.
One of the problems with rws's idea is how you measure OOS.
I made an ES system that went poorly for the entire 2017 OOS period, then blitzed. With the wisdom of hindsight, it was just that 2017 was a bad year for ES. Hence I think Nth is a better choice. Every 2 days, I skip 1 day reserved for OOS. This gives an OOS testing period the full length of the data used.
Sorry I didn't also say: Carl, what you have done is excellent.
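
The bar-size degradation check mentioned in this post could be organised roughly as follows. This is a sketch only: the helper names are hypothetical, and the result figures are placeholders that would in practice come from re-running the same strategy on each bar size.

```python
from typing import Dict, List


def neighbour_bar_sizes(base_minutes: int, offsets: List[int]) -> List[int]:
    """Bar sizes around the build bar size, e.g. 30 with offsets 1 and 3."""
    sizes = set()
    for off in offsets:
        sizes.add(base_minutes - off)
        sizes.add(base_minutes + off)
    return sorted(s for s in sizes if s > 0)


def degradation(base_metric: float, results: Dict[int, float]) -> Dict[int, float]:
    """Metric on each neighbouring bar size as a fraction of the base result."""
    if base_metric == 0:
        raise ValueError("base metric must be non-zero")
    return {minutes: value / base_metric for minutes, value in results.items()}


sizes = neighbour_bar_sizes(30, offsets=[1, 3])
print(sizes)  # [27, 29, 31, 33]

# Placeholder net-profit figures, not real backtest output.
made_up_results = {27: 60.0, 29: 90.0, 31: 80.0, 33: 50.0}
print(degradation(100.0, made_up_results))
```

A ratio that falls off sharply one minute away from the build bar size would be a warning sign; what counts as "sharply" is a judgement call that the thread does not define.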


rws
[*] posted on 17-4-2018 at 04:41 PM


Sure, OOS would be a personal setting.
For example the next half year or even a couple of months; that is
something personal.

I have mentioned this before, but it would be very helpful
if an average number were shown for the top n systems in the list.

For example, if you rank on NP, the % of systems that have >10k.
So if you have a list of thousands of systems, it would be useful to show what % of the top 50 have more than 10k profit.
This makes it easier to see which settings or data cause more systems to pass. I often export to Excel to do that, but it could be a simple number on top of every column in GSB.

Of course you could also set a filter that hides the systems that have less than 10k, but then you don't get a good feeling for whether a change in data or settings has helped.




rws
[*] posted on 17-4-2018 at 05:07 PM


It could also help with choosing the right optimization target.

Suppose you find that sorting the list on PF makes the
percentage of the top 100 systems with more than 10k OOS go from 40% to 60%;
then it could mean that optimizing on PF could help.

It can immediately give objective info on something that otherwise is more guessing and feeling.
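
To make the comparison concrete, here is a small Python sketch that ranks the same list of systems by different metrics and reports the OOS pass rate of each top list. The field names (`np`, `pf`, `oos_np`) and all figures are invented for illustration; this is not a GSB feature or export format.

```python
from typing import Dict, List

System = Dict[str, float]  # hypothetical per-system metrics


def pass_rate_by_ranking(
    systems: List[System],
    rank_keys: List[str],
    top_n: int,
    oos_key: str,
    min_oos: float,
) -> Dict[str, float]:
    """For each ranking metric, the % of its top_n systems with OOS result above min_oos."""
    rates = {}
    for key in rank_keys:
        top = sorted(systems, key=lambda s: s[key], reverse=True)[:top_n]
        passed = sum(1 for s in top if s[oos_key] > min_oos)
        rates[key] = 100.0 * passed / len(top) if top else 0.0
    return rates


# Invented figures for illustration only.
systems = [
    {"np": 90000, "pf": 1.3, "oos_np": 4000},
    {"np": 80000, "pf": 1.9, "oos_np": 14000},
    {"np": 70000, "pf": 1.2, "oos_np": 2000},
    {"np": 30000, "pf": 2.4, "oos_np": 12000},
]
rates = pass_rate_by_ranking(
    systems, ["np", "pf"], top_n=2, oos_key="oos_np", min_oos=10000
)
print(rates)
```

With these toy numbers the PF ranking gives the higher pass rate, which is the kind of signal the post suggests using when picking an optimization target. On real data the sample of top systems needs to be large enough for the difference to mean anything.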



cyrus68
[*] posted on 18-4-2018 at 01:45 AM


Peter

The method of nth day OOS testing appears very useful but I haven't tested it yet.
As I understand it, if you set n=2, it will optimise on 2 trading days and reserve the 3rd day for OOS testing.
I guess you have to estimate how many OOS days it adds up to - given the size of your dataset.
Does this make the setting of optimisation/testing/validation periods irrelevant?


cyrus68
[*] posted on 18-4-2018 at 02:02 AM


On the issue of testing a system's robustness on related markets (e.g. ES, YM, NQ), one thing to remember is that stock markets have been particularly highly correlated over the past 10 years - because of central bank meddling.

If there is a regime change, there is likely to be greater divergence in the behaviour of these markets, with implications for the system that trades them.
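
One way to keep an eye on this is to track the rolling correlation of returns between the markets a system trades. The sketch below uses plain Python with placeholder closing prices; the function names and the window length are arbitrary choices for the example, not something taken from GSB.

```python
import math
from typing import List, Sequence


def pct_returns(closes: Sequence[float]) -> List[float]:
    """Simple close-to-close returns."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rolling_correlation(x: Sequence[float], y: Sequence[float], window: int) -> List[float]:
    """Correlation over each consecutive block of `window` observations."""
    return [
        pearson(x[i - window:i], y[i - window:i])
        for i in range(window, len(x) + 1)
    ]


# Placeholder closes for two markets (not real prices).
returns_a = pct_returns([100, 101, 102, 101, 103, 104])
returns_b = pct_returns([50, 50.4, 51.1, 50.5, 51.6, 52.0])
print(rolling_correlation(returns_a, returns_b, window=3))
```

A sustained drop in the rolling values would be one hint that the markets are diverging and that cross-market robustness results from the earlier period may no longer carry over.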


admin
[*] posted on 18-4-2018 at 02:04 AM


Quote: Originally posted by cyrus68  
Peter

The method of nth day OOS testing appears very useful but I haven't tested it yet.
(1) As I understand it, if you set n=2, it will optimise on 2 trading days and reserve the 3rd day for OOS testing.
(2) I guess you have to estimate how many OOS days it adds up to - given the size of your dataset.
(3) Does this make the setting of optimisation/testing/validation periods irrelevant?

(1) Correct.
(2) This is done automatically.
(3) It is still fine to use these, but if you have massive CPU power and just go looking for the best overall results, nothing is really out of sample. Nth gets around most but not all of these issues.


rws
[*] posted on 18-4-2018 at 02:51 PM


What is your thought about making separate systems for buy and short
vs systems that have both buy and short with respect to sudden market behaviour change?



admin
[*] posted on 18-4-2018 at 05:23 PM


Quote: Originally posted by rws  
What is your thought about making separate systems for buy and short
vs systems that have both buy and short with respect to sudden market behaviour change?


For natural gas, this is a very good idea. But making the short system was much harder, and you need to relax your expectation of how linear the separate systems' equity curves are. When you put them together, the result gets more linear. I've done this on NG, but it is too soon to tell the out-of-sample results. NG has been a poor market for the last month or 2.
I haven't tested this on ES. I've done long-only ES with good success, but not short. I think short is harder.
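
The point about combined curves being more linear can be illustrated with a toy calculation. The sketch below scores linearity as the R^2 of a straight-line fit to the equity curve; that measure is an assumption made for this example (the post does not say how linearity is judged), and the per-period profit figures are invented.

```python
from itertools import accumulate
from typing import List, Sequence


def equity_curve(pnl: Sequence[float]) -> List[float]:
    """Cumulative sum of per-period profit and loss."""
    return list(accumulate(pnl))


def linearity_r2(equity: Sequence[float]) -> float:
    """R^2 of a least-squares straight line fitted to the curve (1.0 = perfectly linear)."""
    n = len(equity)
    if n < 2:
        raise ValueError("need at least two points")
    mx = (n - 1) / 2
    my = sum(equity) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(equity))
    syy = sum((y - my) ** 2 for y in equity)
    if syy == 0:
        raise ValueError("flat equity curve: R^2 is undefined")
    return (sxy * sxy) / (sxx * syy)


# Invented per-period PnL for a long-only and a short-only system,
# aligned on the same periods so they can be added together.
long_pnl = [300, -100, 300, -100, 300, -100]
short_pnl = [-100, 300, -100, 300, -100, 300]
combined_pnl = [a + b for a, b in zip(long_pnl, short_pnl)]

for name, pnl in [("long", long_pnl), ("short", short_pnl), ("combined", combined_pnl)]:
    print(name, round(linearity_r2(equity_curve(pnl)), 3))
```

The toy data is deliberately constructed so that the two systems offset each other's losing periods. Real long and short systems will only smooth each other to the extent that their drawdowns do not coincide.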


boothy
[*] posted on 1-6-2018 at 01:52 AM


I have just been testing an NG system with 15m data1 and 30m data2. Parameter stability was 70/70. Dates were 2006 to 2014 for all tests, and then at the end I ran it all the way through to 5/2018.
First I changed the bars to 10m/20m, then to 30m/60m.
Then I tried it on HO and CL, both 15m/30m.
CL was the worst, but it was profitable on all tests.
Then, when running on NG all the way to the current date, the equity curve flattens out (which isn't great) but it still made a new equity high not many trades ago.
All in all pretty robust, I think.



NG_15m_30m_Sys3_Graphs.PNG - 117kB
untitled1.png - 109kB
untitled2.png - 125kB
untitled3.png - 85kB
untitled4.png - 112kB
untitled5.png - 102kB
untitled6.png - 117kB


Carl
[*] posted on 1-6-2018 at 03:22 AM


Hi boothy,

Great results!

How many oscillators did you use?

All the best,
Carl


boothy
[*] posted on 1-6-2018 at 03:37 AM


Hi Carl,

The system was built with 3 indicators:

GSB SuperSmoother
GSB SlowK
TrueRange


admin
[*] posted on 1-6-2018 at 03:37 PM


Great to see you post this. I'm hanging out for the multi time frame / symbol bars features to try similar things. Progress has been slow this week.

Carl
[*] posted on 2-6-2018 at 01:21 AM



Hi boothy,

The flattening of the NG equity might be caused by the decreasing volatility over the years.

Which indicator was used for the secondary filter (sfResult)?

Carl

NG ATR.jpg - 65kB
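
For anyone who wants to reproduce a volatility chart like this, here is a rough Python sketch of an average true range calculation. It uses a simple moving average of the true range; the chart above may have been made with a different smoothing (Wilder's original ATR uses a recursive average), so treat this as an approximation. The price values are placeholders.

```python
from typing import List, Sequence


def true_range(
    highs: Sequence[float], lows: Sequence[float], closes: Sequence[float]
) -> List[float]:
    """True range per bar; the first bar has no previous close, so it uses high - low."""
    tr = [highs[0] - lows[0]]
    for i in range(1, len(highs)):
        prev_close = closes[i - 1]
        tr.append(
            max(
                highs[i] - lows[i],
                abs(highs[i] - prev_close),
                abs(lows[i] - prev_close),
            )
        )
    return tr


def average_true_range(
    highs: Sequence[float], lows: Sequence[float], closes: Sequence[float], length: int
) -> List[float]:
    """Simple moving average of the true range over `length` bars."""
    tr = true_range(highs, lows, closes)
    return [sum(tr[i - length + 1:i + 1]) / length for i in range(length - 1, len(tr))]


# Placeholder bars (not real NG prices).
highs = [10.0, 11.0, 12.0, 11.0]
lows = [9.0, 9.5, 10.5, 10.0]
closes = [9.5, 10.5, 11.0, 10.5]
print(true_range(highs, lows, closes))
print(average_true_range(highs, lows, closes, length=2))
```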


boothy
[*] posted on 2-6-2018 at 07:41 AM


Hi Carl,

Thanks for posting that chart, you make a good point.

Here is sfResult:

sfResult = GSB_Norm2(GSB_CloseOverPrevCloseD of Data(iSFData), 10, 100) of Data(iSFData) * iSFWeight;

boothy


BigDog
[*] posted on 5-6-2018 at 05:15 AM


Carl: Did you factor in commission and slippage in GSB? Or add them later in TS? I find it can make a lot of difference to the end results. Generally, I have found it better to omit them from GSB (which will result in more systems), then filter them out based on AT.

BigDog
[*] posted on 5-6-2018 at 05:20 AM


boothy: looks like some great results there.

In every NG system I have developed (in GSB and in other ways) the equity curve has the same concave shape. Often the curve is relatively flat in the most recent years.

One factor to consider is what slippage to allow. This is particularly important in NG as the fills can often be poor. I tend to use a conservative slippage estimate of $20 per contract. This severely depletes the profitability of most NG systems.
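
A quick way to see how much a slippage allowance eats into a system is to charge it on every trade and also work out the break-even figure. The Python sketch below assumes the allowance is charged once per round-turn trade per contract; the function names and the sample numbers are hypothetical.

```python
def net_profit_after_slippage(
    gross_profit: float, trades: int, slippage_per_contract: float, contracts: int = 1
) -> float:
    """Net profit once a fixed slippage allowance is charged on every trade."""
    return gross_profit - trades * contracts * slippage_per_contract


def breakeven_slippage(gross_profit: float, trades: int, contracts: int = 1) -> float:
    """Slippage per contract at which net profit drops to zero."""
    if trades <= 0 or contracts <= 0:
        raise ValueError("trades and contracts must be positive")
    return gross_profit / (trades * contracts)


# Placeholder system: 30,000 gross profit over 1,000 one-contract trades.
print(net_profit_after_slippage(30000, 1000, slippage_per_contract=20))
print(breakeven_slippage(30000, 1000))
```

In this made-up case a $20 allowance removes two thirds of the gross profit, which shows why a system with a small average trade is so sensitive to the slippage assumption.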


admin
[*] posted on 5-6-2018 at 10:38 PM


Your comments about NG going flat are valid. S&P500 was similar, but then 2018 hit ...:)
For reference, here is one of the most linear of my NG systems. High PF and lower NP. Results include $25 round-turn slippage.
Nearly all NG systems are flat at the moment. (This was the best of late.) I put that down to very low range. The blue curve is the daily range (High - Low) and the yellow is abs(DailyClose - DailyOpen).


ng-25.png - 35kB
ng-yearly.png - 79kB
ng-rpt.png - 70kB
ng-daily.png - 281kB
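
The two range measures in this post (daily High - Low and the absolute open-to-close move) are easy to compute from daily bars. A minimal Python sketch, with a made-up `DailyBar` type and placeholder prices:

```python
from typing import List, NamedTuple, Sequence


class DailyBar(NamedTuple):  # hypothetical record for one daily bar
    open: float
    high: float
    low: float
    close: float


def daily_range(bars: Sequence[DailyBar]) -> List[float]:
    """High minus low for each day."""
    return [b.high - b.low for b in bars]


def abs_open_close(bars: Sequence[DailyBar]) -> List[float]:
    """Absolute difference between the daily close and the daily open."""
    return [abs(b.close - b.open) for b in bars]


# Placeholder bars (not real NG data).
bars = [
    DailyBar(open=3.0, high=3.5, low=2.75, close=3.25),
    DailyBar(open=3.25, high=3.5, low=3.0, close=3.0),
]
print(daily_range(bars))
print(abs_open_close(bars))
```

Smoothing both series over a longer window and comparing recent values with the long-run level would give a rough check of whether the market is in a low-range phase.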


Trademaid forum. Software tools for TradeStation, MultiCharts & NinjaTrader