admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
That's excellent, Boothy. My other comment is that I felt gold was not tradable over the last few years; it was just trending badly. Is this still the case?
It is also good to try market verification on 25,26,27,28,32,33,34,35 min bars, but this will reduce your sample size greatly.
I used PF 1.8 and Pearson's 0.95.
|
|
|
boothy
Junior Member

Posts: 54
Registered: 21-5-2018
|
|
Yes, I did verification on 25-35 min bars. On GC 29,30,31 there were 17 out of 1000 systems that passed 8/8 with PF 1.8 and Pearson 0.95.
For GC/SI 29,30,31 there were 8 out of 500 systems that passed 8/8.
I tried to copy what you did in the CL video, comparing the 2018 out of sample with the systems that passed 8/8 verification (I think I did it correctly).
Interestingly, degradation didn't improve for the systems that passed 8/8 verification for 2018. I didn't get a chance to look much more into it, but I think 2018 in particular was a bad year for gold systems.
I still want to do more testing on different fitness and filters for gold systems.
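The 8/8 count above can be sketched in a few lines. This is only an illustration, not GSB's code: the function names, the use of per-trade profits as input, and measuring Pearson as the correlation between trade number and cumulative equity are all assumptions.

```python
import math
from itertools import accumulate


def profit_factor(trades):
    """Gross profit divided by gross loss for a list of per-trade profits."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 for degenerate input."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def passes(trades, min_pf=1.8, min_r=0.95):
    """True if one bar size meets both thresholds (assumed definitions)."""
    equity = list(accumulate(trades))
    r = pearson(list(range(len(equity))), equity)
    return profit_factor(trades) >= min_pf and r >= min_r


def verification_score(trades_by_bar_size, min_pf=1.8, min_r=0.95):
    """Number of bar sizes on which the system passes, e.g. 8 for an 8/8."""
    return sum(passes(t, min_pf, min_r) for t in trades_by_bar_size.values())
```

A system would count as 8/8 when `verification_score` equals the number of bar sizes supplied.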
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Your testing looks great, but results from 17 or 8 systems can't be counted on as a statistically large enough sample. I am grateful to all in the GSB community who have given me GSB workers. Some of the research I'm doing right now is striking, and what works best is a little unexpected.
I'm repeating all my tests to be 100% certain. One setting changed degradation from -19% to -4.5%, so I have to be sure this is right.
Good that you agree with my comment that gold is such a bad market now. I guess the range is very low.
GSB is a whole universe to explore, and I look forward to more results from you and others in time.
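For readers following the numbers, degradation can be read as the percentage change in an average metric between in-sample and out-of-sample. The exact formula GSB uses is not stated in this thread, so the definition and the function name below are assumptions.

```python
def degradation_pct(in_sample_avg, out_of_sample_avg):
    """Percentage change from the in-sample to the out-of-sample average.

    Negative values mean the out-of-sample figure is worse.
    Assumed definition, not taken from GSB.
    """
    if in_sample_avg == 0:
        raise ValueError("in-sample average must be non-zero")
    return (out_of_sample_avg - in_sample_avg) / abs(in_sample_avg) * 100.0
```

Under this reading, an in-sample average of 1000 that drops to 810 out of sample is a degradation of about -19%, and a drop to 955 is about -4.5%.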
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
I have now done the remainder of the video on crude oil.
I show the results of building systems on multiple time frames (29,30,31 and 28,29,30,31,32) and
doing walk forward on 30 vs (29,30,31) vs (28,29,30,31,32).
The implications of this are far-reaching for all system development.
Comments welcome.
|
|
|
edgetrader
Junior Member

Posts: 24
Registered: 16-5-2018
|
|
Looks good. As you say, nth-day and 2018 are using up a lot of the data for OOS testing. When you think about it, the point of doing that is to figure out the best way to build systems, i.e. what additional bars should be used, how many indicators, etc.
Once you know the best way to build systems, you could apply it to all the data in order to make production systems for live trading.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Agreed. As you say, there is a difference between figuring out the best way to build a system and building systems.
I used nth=1, then later did verification on 25...35 min bars with nth=all.
I got my first system chosen today, but there were plenty of other good ones. I will publish a TS report tomorrow.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Here is my first multi-market CL system, verified on all bars from 25 to 35 minutes. 2018 is out of sample. I have made 6 other systems today, and hope to be finished with CL in under 2 days.
|
|
|
Bruce
Member
 
Posts: 115
Registered: 22-7-2018
Location: Auckland - New Zealand
|
|
Hey Peter, great video. Last night I followed your process and did a quick simulation on the ES using 28-32 minute data, with individual long and short systems over the past 5 years, to get familiar with your dev process. This immediately yielded amazing initial results, which I will explore further. Like your work!
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Thanks for the comments, TradingRails.
It's been a massive task, but I now have enough CL systems and have refined the methodology.
The only thing left is long-only and short-only CL systems. CL market degradation on 29,30,31 minute bars is amazing:
-3.0% on the 29,30,31 minute bars, and an awesome -0.5% on the 30 minute results (of the 29,30,31 bars).
Short only requires massive CPU time, and it will degrade a lot more.
I enclose the results in XML files. These need to be imported into Portfolio Analyst to view.
All systems are out of sample from 1/2/2018.
Systems with names like gsbclmx.y were made on 28_32 min bars, walk-forwarded on 28_32 min bars, and verified on 20, 40 and 24_36 min bars
(24_36 means 24,25,26,...,35,36).
Systems with LS in the file name were made on 30 min bars only and are a long-only and a short-only system on the same chart. The short-only system is much more likely to degrade out of sample.
Systems like cl23-dj use slightly different session times and data streams, and were made on 30 minute bars only.
Not all results are current to today. GSBls1 is poor in 2018, but the last few months are a big part of the reason (it traded a lot), and nothing made money in the last few months.
I will include one system for GSB purchasers in the private forum as time permits.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
I did market verification tests on natural gas: -14.6% market degradation with 8958 systems. A great result. That's on 28,29,30,31,32 minute bars.
I expect it would improve with verification on wider time frames. For 2018 I got an average profit of -$932 per system.
Conclusion: NG is a great market, but really bad right now. The low range of the NG market in 2018 confirms this. My NG systems were also poor out of sample in 2018. This contrasts with crude oil, where 2018 results were very good overall but market degradation results were similar.
I'm happy to hear any other users' experiences on these or other markets.
|
|
|
Carl
Member
 
Posts: 342
Registered: 10-5-2017
|
|
Here are the results on my tests on ES.
I thought, let's use a really big out-of-sample period, just to see what happens.
Build on ES 30
Verif on 25,26,...,34,35
WF on 26,28,30,32,34
No nth used
Last date December 31 2012, so out-of-sample after 2012 (!)
Average net profit 2013-2018: 17.100 USD
Build on ES 28,30,32
Verif on 25,26,...,34,35
WF on 26,28,30,32,34
No nth used
Last date December 31 2012, so out-of-sample after 2012 (!)
Average net profit 2013-2018: 18.900 USD
Most of them had great equity lines out-of-sample.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Fantastic result Carl, thanks for publishing
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
I've just thought that the contract info for cash indices should equal the futures indices' big point value. CloseDBPV will work best that way, because if GSB genetically tries the data1 futures and then switches to try the cash index, the point value would otherwise be different and results likely not as good.
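A small sketch of why the point value matters, assuming the filter compares a price change scaled by big point value (this reading of CloseDBPV is an assumption, and the function and table names are hypothetical). The ES big point value of $50 per index point is the standard contract spec; the cash index entries are illustrative settings.

```python
def dollar_change(close, prev_close, big_point_value):
    """Price change expressed in dollars per contract."""
    return (close - prev_close) * big_point_value


# Hypothetical contract table: the cash index either keeps a raw point
# value of 1 or is redefined to match the ES future.
BIG_POINT_VALUE = {
    "ES": 50.0,
    "SPX_raw": 1.0,
    "SPX_matched": 50.0,
}

es_move = dollar_change(2800.0, 2790.0, BIG_POINT_VALUE["ES"])
spx_raw_move = dollar_change(2800.0, 2790.0, BIG_POINT_VALUE["SPX_raw"])
spx_matched_move = dollar_change(2800.0, 2790.0, BIG_POINT_VALUE["SPX_matched"])
```

With the raw setting the same 10 point move differs by a factor of 50 between the future and the cash index, so a threshold tuned on one would not carry over to the other; with matched point values the two agree.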
|
|
|
cyrus68
Member
 
Posts: 171
Registered: 5-6-2017
|
|
Using the secondary filter has become confusing. As I understand it, for futures strategies, it is preferable to use closeBpv if you initially wanted
to use Close-Close; irrespective of whether you want to verify on other markets. For this purpose, you need to redefine the point value of the indices
that are used as secondary data. It is not clear whether this applies to Close-Close Normalised.
If you want to use Close/Close or Close/Close Normalised, it is not clear whether it would be preferable to use the redefined indices or the original
ones.
As for strategies developed on stocks, it makes sense to use the original indices specs, when used as secondary data. So, you will need to define the
appropriate contract specs in the table, for both futures and stocks. For example: SPX and SPX1, with appropriate point values.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Correct. GSB is going to need a function called bpv where it stores its own big point value, in case we use a different one from TS.
|
|
|
cotila1
Junior Member

Posts: 78
Registered: 8-5-2017
|
|
Some results to share
I'd like to share some results derived from the discussed approach.
I built ES systems on 28, 29, 30, 31, 32 min bars, SF=CloseLessPrevCloseDBPV, Training=100% and Nth=1. Data from 2000 to 12/31/2014, meaning the period from 1/1/2015 till today is unseen.
I stopped GSB after 15000 systems were built.
Degradation from Nth=NoTrd (IS days) to Trd (OOS days) is about 14%.
I verified all 15000 systems on 25, 26, 27, 33, 34, 35 min bars and got 5800 systems that passed 6/6 and were saved as excellent. Verification filter used: R=0.95, Min PF=1.5 and Min #Trds=100.
I then verified those 5800 systems on other markets (EMD, RUT, YM, NQ) and got 47 systems with VS=4/4 and 540 systems with VS=3/4. For multi-market verification I used a slightly less severe filter: Min R=0.90, Min PF=1.2 and Min #Trds=100.
I then walk-forwarded all the 4/4 systems, plus the 3/4 systems with avg-trd>150, R>0.98, NP/DD>20 and #trd>350 (the data period is always 1/1/2000-12/31/2014). The total number of systems walk-forwarded (as a result of the first and second verification) is 101. For WF I used Nth=All.
WF price data used: 28, 29, 30, 31, 32.
After the WFA, I analysed all 101 systems on the unseen period (from 1/1/2015 onwards) to see how they behave together over these 3.5 unseen years. As you can see from the table in the picture, the NP without WF Cur. Params (WFP) ranges from a minimum of $1250 to a maximum of $38710.
The only thing I found a bit strange is that the overall results over the unseen period (summarized in the 2 tables of the picture) look better without the WF Cur. Params (WFP) than with them. In both cases the results look OK.
By the way, see also one of the possible equity curves from this severe selection process; it includes expenses. The same code is also plotted on EMD on a totally different time frame (15 min), expenses included. So on a different market and even a different time frame from the ones the system was built on, it still looks good.
Comments and criticism are more than welcome :-)
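The rule used to pick systems for walk forward can be written down as a short filter. This is a sketch of the selection as described in the post, reading the extra thresholds as applying only to the 3/4 systems (an assumption); the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemStats:
    name: str
    markets_passed: int   # out of the 4 other markets
    avg_trade: float      # average trade in dollars
    pearson_r: float
    np_to_dd: float       # net profit divided by max drawdown
    trades: int


def select_for_walk_forward(systems):
    """Keep every 4/4 system, plus 3/4 systems that clear the stricter bar."""
    chosen = []
    for s in systems:
        if s.markets_passed == 4:
            chosen.append(s)
        elif (
            s.markets_passed == 3
            and s.avg_trade > 150
            and s.pearson_r > 0.98
            and s.np_to_dd > 20
            and s.trades > 350
        ):
            chosen.append(s)
    return chosen
```

Writing the rule out this way makes it easy to rerun the same selection after changing a threshold and see how many systems survive.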
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Great and interesting work. Were the systems walk-forwarded with nth set to all, or notrade, etc.?
PF would be a good metric to add to the table. Also try WF on 30 and on 29,30,31 and see what worked best OOS.
|
|
|
cotila1
Junior Member

Posts: 78
Registered: 8-5-2017
|
|
Thanks. Good suggestion: I'll try WF on 29,30,31 too. In general, might it be an idea to try WF on a small sample of systems (say 20-30) with 29-31 and then with 28-32 to see where the improvement sits, and afterwards apply the best choice to the entire set of systems? It might save time in choosing the best option.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
The problem with a small sample is that the number of systems is too low to prove that it helped. Even with 1000 vs 2000 systems, I can get big variations in overall stats. So it's best to do WF with them all.
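The small-sample point can be illustrated with simulated numbers. The sketch below draws repeated subsamples from a made-up population of per-system profits and looks at how far the subsample average wanders; the population, its parameters, and the function names are all hypothetical and are not results from GSB.

```python
import math
import random
import statistics


def standard_error(values):
    """Standard error of the mean: sample stdev divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def subsample_mean_range(values, sample_size, repeats=200, seed=0):
    """Lowest and highest average seen across repeated random subsamples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(values, sample_size))
        for _ in range(repeats)
    ]
    return min(means), max(means)


# Simulated out-of-sample net profit for 2000 systems (illustrative only).
_rng = random.Random(42)
population = [_rng.gauss(5000.0, 10000.0) for _ in range(2000)]

small = subsample_mean_range(population, 25)     # e.g. 20-30 systems
large = subsample_mean_range(population, 1000)   # e.g. 1000 systems
```

The range of averages from 25-system subsamples is far wider than from 1000-system subsamples, which is why a difference seen on a handful of systems says little about which walk-forward setting is better.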
|
|
|
cotila1
Junior Member

Posts: 78
Registered: 8-5-2017
|
|
Oh, that's good to know. Human and machine time is then necessary :-))
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
What has become apparent is that it's the stressing of the oscillator periods, not the change in OHLC of the bars, that gives the much improved out of sample results when using multiple time frames. For this reason I likely won't make a simulated data stream that has noise added to it. I might, however, make an oscillator period stressor. But this will come after improved exits and secondary filters.
I'm also likely to make GSB build systems, then look for the filters/exits that work statistically on ALL systems. It might also have the option to try a filter/exit while building the system. I haven't thought this through fully.
One of the GSB users has also done a bit of work on the energies. It's clear now that much of the volume is at the close of a 30 minute bar. This also implies that if your execution is slow, your fills won't be great. However, I now know how to be free from the need to use 30 minute bars. Maybe in the next video?
|
|
|
cyrus68
Member
 
Posts: 171
Registered: 5-6-2017
|
|
I always suspected that when GSB fits the same system to multiple data frequencies, it is the indicator period that is changed to fit each frequency.
Other indicator parameters remain unchanged.
What was always unclear to me was why – in the case of 29 30 31 min bars – it is necessary to select the 30 min bar (i.e. central bar size)? All that
GSB has done is to fit the system to the given frequencies and produce the metrics. Which bar size you select to run WF depends on which had the least
deterioration in the averages test, as well as the OOS performance of the particular system.
There may be mysteries here that I don’t understand. But then, again, using GSB is often similar to solving mysteries.
Stressing indicator periods directly would be a good idea. An even better idea is to stress all parameters.
As for the issue of filters/exits, theoretically speaking, I don’t know enough of the innards of GSB to state whether they should be part of the build
process or implemented afterwards. Having a choice would be useful. Practically speaking, changing filters has a major impact on results. Currently,
we can do this as part of the build process, and test the impact on IS/OOS averages.
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
Your points are good to discuss.
If we stress things, I just think it's common sense to choose the middle period of what's stressed. I'm open to other ideas. I expect most degradation away from the central period.
Bottom line is that stressing indicator periods is, I think, good enough, but that's not to say that we can't improve. Long term I may add the option of random noise on the data, but I doubt it will help. There are more pressing features to add.
Regarding parameter stressing, let's say we have
result = rsi(c,14) * atr(30) * average(c,50)
if result > offset then buy
The value of offset matters very little, as the product of 3 indicators that each range from -100 to 100 is a big number. So GSB typically has few parameters apart from the oscillator periods. Secondary filters on CloseD are very insensitive to mild changes.
If we have bollingerband(c,20,0.5) then we have a 0.5 to potentially stress.
So stressing parameters has less value. If we stress, say, all 3 oscillator periods up and down independently, we have a lot of combinations. It might be worth trying, but again there are more urgent features needed for now.
I agree that having filters per system should be an option, but it carries a massive danger of curve fitting, which I want to avoid.
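To put a number on the combinations: moving each of 3 periods down, leaving it alone, or moving it up gives 3^3 = 27 variants per system. A minimal sketch, assuming a fixed step of 1 and a hypothetical helper name; it is not a GSB feature.

```python
from itertools import product


def stressed_periods(base_periods, step=1):
    """All combinations of each period shifted by -step, 0 or +step.

    Combinations that would produce a period below 1 are dropped.
    """
    combos = []
    for deltas in product((-step, 0, step), repeat=len(base_periods)):
        candidate = tuple(p + d for p, d in zip(base_periods, deltas))
        if all(p >= 1 for p in candidate):
            combos.append(candidate)
    return combos


# Periods taken from the example expression: rsi 14, atr 30, average 50.
variants = stressed_periods((14, 30, 50))
```

Each variant would need its own backtest, so the cost grows as 3 to the power of the number of periods, and faster still if more than one step size is tried.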
|
|
|
cyrus68
Member
 
Posts: 171
Registered: 5-6-2017
|
|
On the issue of stressing indicator parameters, we know from Monte Carlo simulation results that randomising trades and introducing noise to indicator
parameters have the biggest impact on results. Changing the starting trade date has little impact. It may well be that indicator periods, rather than
other parameters, are principally responsible for the result. Theoretically possible but, in practice, unknown.
On the issue of selecting the middle bar size, I may be confused or misunderstanding things, but I need to nail it down. In the following example, why should I select the 25 min bar (middle) rather than the 20 min (less deterioration)? This is just an example, as the overall results are lousy.
Let’s look at the results of the Averages test (IS/OOS), when the “Optimise Price Data” field is set to False. In the example, the result for the 30
min dataset (bottom row of the table) is presumably calculated by averaging the metrics for all systems that were applied to 30 min bars. The same
applies to the 20 and 25 min datasets.
The top row of the table is presumably the result of averaging the metrics of all the datasets. But what do the second, third and fourth rows
represent?
We now know that GSB calculates the average period of the indicators for the included data bars – in this case, 20 25 30 min – to create more robust
indicators. So, for example, does the second row represent the result of applying systems based on the average-period indicators on the 20 min
dataset? If so, we can’t see the metrics.
Regarding the issue of running WF. For example, for the 20 min dataset, is it applying systems based on starting values of the original-period
indicators or the average-period indicators?
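The trade randomising that Monte Carlo tools perform can be sketched as follows. This is a generic illustration of shuffling trade order and looking at the spread of maximum drawdown, not the specific tool referred to above; the function names and the choice of drawdown as the metric are assumptions.

```python
import random


def max_drawdown(trades):
    """Largest peak-to-valley drop of the cumulative equity curve."""
    peak = equity = worst = 0.0
    for t in trades:
        equity += t
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def shuffled_drawdowns(trades, runs=1000, seed=0):
    """Max drawdown for each of `runs` random orderings of the same trades."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        order = list(trades)
        rng.shuffle(order)
        results.append(max_drawdown(order))
    return sorted(results)
```

Net profit is unchanged by shuffling, so only path-dependent metrics such as drawdown vary; reading off a high percentile of the sorted list gives a more conservative drawdown estimate than the single historical ordering.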
|
|
|
admin
Super Administrator
       
Posts: 5060
Registered: 7-4-2017
|
|
cyrus68, I will reply tomorrow on this.
|
|
|