Last Updated on 24 November, 2021 by Samuelsson
Many traders backtest a trading strategy on all of their available data and then decide whether or not to go live with it. But testing a strategy only on known data puts it at the mercy of curve fitting; hence the need for out-of-sample testing.
In this post, we explain what out-of-sample testing is, why it is important, and how you should plan for it during backtesting.
What is backtesting?
Backtesting is the process of testing your strategy on a sample of past market data to see how it performed historically. It is an essential step in developing a trading strategy. Generally, the process of creating a trading strategy involves the following:
- Generating a trading idea.
- Forming a hypothesis from the idea.
- Gathering the data to test your hypothesis.
- Testing your hypothesis on the data.
- Confirming, optimizing, or falsifying your hypothesis.
What is in-sample testing?
In-sample testing is simply the test you do on your known data, which is the data you use to confirm or falsify your hypothesis. This form of testing shows how the strategy performs on historical data, but because you tune the strategy on that same data, the results are prone to over-optimization and curve fitting.
The usual thing is to split your dataset into two parts: one part for in-sample testing and the other part for out-of-sample testing.
What is out-of-sample testing?
Sometimes also called walk-forward testing, out-of-sample testing is a test you run on unknown data, meaning data that played no part in developing the strategy, to find out whether a backtested strategy is robust enough to work in a live market environment. It helps you check a backtested strategy for curve fitting. In other words, once you have backtested a trading strategy, you also need to test it on data it has not seen.
For instance, let’s assume you have data from 2005 until 2021. You can split the dataset into two parts; for example, the in-sample data from 2005 until 2017 and then out-of-sample data from 2018 until 2021. You use the in-sample data for your backtesting and tweaking, while you test for robustness on the out-of-sample data.
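The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_by_date` and the placeholder rows (one made-up value per year) are hypothetical, and real price bars would take their place.

```python
from datetime import date


def split_by_date(rows, first_oos_day):
    """Split (day, value) rows into in-sample and out-of-sample parts.

    Rows dated before first_oos_day are in-sample; the rest are out-of-sample.
    """
    in_sample = [row for row in rows if row[0] < first_oos_day]
    out_of_sample = [row for row in rows if row[0] >= first_oos_day]
    return in_sample, out_of_sample


# Hypothetical placeholder data: one row per year from 2005 through 2021,
# standing in for real daily price bars.
rows = [(date(year, 1, 2), 100.0 + (year - 2005)) for year in range(2005, 2022)]

# Everything from 1 January 2018 onward is held back for the robustness test.
in_sample, out_of_sample = split_by_date(rows, date(2018, 1, 1))

print("in-sample:", in_sample[0][0].year, "to", in_sample[-1][0].year)
print("out-of-sample:", out_of_sample[0][0].year, "to", out_of_sample[-1][0].year)
```

Splitting by a single cut-off date keeps the out-of-sample data strictly later than the in-sample data, so nothing from the later period can leak into the tweaking phase.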
However, we are skeptical about dividing your dataset into two parts because of the tendency to “cheat” by looking at the out-of-sample test before you perform the in-sample test. Another shortcoming of using historical data for out-of-sample testing is that it does not mimic live trading. While you get your results in the blink of an eye, you miss the details, and as the saying goes, “the devil is in the details”.
What is sample validation?
A validation or robustness test is the process of confirming your trading strategy using an out-of-sample test. You use it to know whether the in-sample testing results were valid or over-optimized. If validation does not confirm the in-sample test result, you should not go live with the strategy; you should tweak the strategy and perform more tests.
The best method of out-of-sample testing: demo account and incubation
The best way to perform an out-of-sample test is to trade the strategy in a live demo account for a while, which we call the incubation period. It resembles live trading and gives you a “feel” for how the strategy performs. Additionally, you might discover small details you never thought of during backtesting.
However, this can be time-consuming because it takes time for trade setups to appear, especially if the strategy is designed for higher timeframes (4-hour, daily, or weekly).
A practical example of in-sample and out-of-sample tests (in sample vs. out of sample)
Let’s examine a practical example of an in-sample test and an out-of-sample test. We test a short strategy on the XLP (the ETF tracking consumer staples). The in-sample period from 1993 until the end of 2017 looks like this:
The in-sample test results are as follows:
- 264 trades
- 4% average gain per trade
- CAGR was 5.88%
- Time spent in the market was 8%
- The profit factor was 3.33.
Here is the equity curve for the out-of-sample test:
The data was from 2018 to May 2021, and the result is as follows:
- 43 trades
- The average gain is 0.41%
- Time spent in the market is 7.7%
- The CAGR is 5.5%
- The profit factor is 4.98
Although the number of out-of-sample trades is small, we can say that the strategy has performed more or less the same as in the in-sample test. Still, it would make sense to trade it in a demo account to see how it performs before going live with it.
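For readers who want to compute the same kind of statistics for their own tests, here is a minimal sketch. It assumes the common definitions: average gain is the mean return per trade, profit factor is gross profit divided by gross loss, and CAGR is the annualized growth rate of equity. The function names and the two short trade lists are hypothetical and are not the XLP results above.

```python
def average_gain(trade_returns):
    """Mean return per trade, in the same unit as the inputs (percent here)."""
    return sum(trade_returns) / len(trade_returns)


def profit_factor(trade_returns):
    """Gross profit divided by gross loss; infinite if there are no losers."""
    gross_profit = sum(r for r in trade_returns if r > 0)
    gross_loss = -sum(r for r in trade_returns if r < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def cagr(start_equity, end_equity, years):
    """Compound annual growth rate as a fraction (0.05 means 5% per year)."""
    return (end_equity / start_equity) ** (1 / years) - 1


# Hypothetical per-trade returns in percent, for illustration only.
in_sample_trades = [2.0, -1.0, 3.0, -1.0, 1.0]
out_of_sample_trades = [1.5, -0.5, 1.0]

for label, trades in (("in-sample", in_sample_trades),
                      ("out-of-sample", out_of_sample_trades)):
    print(f"{label}: {len(trades)} trades, "
          f"average gain {average_gain(trades):.2f}%, "
          f"profit factor {profit_factor(trades):.2f}")

# Hypothetical equity growing from 100 to 121 over two years.
print(f"CAGR: {cagr(100.0, 121.0, 2):.2%}")
```

Running the same functions on both periods makes the comparison mechanical: if the out-of-sample figures fall far below the in-sample ones, that points toward curve fitting.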
Overall, the equity curve looks like this:
Walk forward optimization
Many traders like to use what is called walk forward optimization to make their tests more robust and more likely to stand the test of time. Walk forward optimization is the process of optimizing (tweaking and backtesting) a strategy on in-sample data and then validating it on a portion of your out-of-sample data. The process is repeated, so it involves frequent in-sample and out-of-sample testing.
This is how it’s done:
Let’s say you have 20 years of data. You may decide to divide the data into 10 equal portions, each covering two years. Each two-year portion is then divided into two parts: the first year is in-sample and the second is out-of-sample. You find the best parameters in year one and test them out of sample in year two. This is repeated for all ten portions, and the combined results are evaluated to settle on the final parameters for the strategy. While many people use this, we can’t see any benefits it adds to our trading.
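To make the mechanics concrete, the procedure can be sketched as follows. Everything here is an assumption for illustration: the start year 2001, the candidate parameter values, and `toy_score`, which is a made-up stand-in for a real backtest that would return a performance figure for a given parameter and year.

```python
def two_year_windows(first_year, total_years):
    """Return (in_sample_year, out_of_sample_year) pairs, one per two-year portion."""
    return [(start, start + 1)
            for start in range(first_year, first_year + total_years, 2)]


def walk_forward(windows, candidates, score):
    """Pick the best parameter in each in-sample year, then score it out of sample."""
    results = []
    for in_year, out_year in windows:
        best = max(candidates, key=lambda p: score(p, in_year))
        results.append((out_year, best, score(best, out_year)))
    return results


def toy_score(param, year):
    """Hypothetical stand-in for a backtest: higher is better."""
    return -abs(param - (10 + year % 3))


windows = two_year_windows(2001, 20)   # 20 years of data, 10 portions
candidates = [5, 10, 15, 20]           # hypothetical parameter values
results = walk_forward(windows, candidates, toy_score)

for out_year, best, oos_score in results:
    print(f"tested in {out_year}: parameter {best}, out-of-sample score {oos_score}")
```

Each parameter is chosen using only the in-sample year and is scored on the following year, so every out-of-sample score comes from data the optimizer did not see.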
The whole point of doing backtests is to forecast the future, so you need to be careful and patient with your procedures. We believe an out-of-sample test is an important part of these procedures. Out-of-sample backtesting means dividing your historical data into two parts (in sample vs. out of sample) for developing and validating your strategy. You use the in-sample data to create the rules, signals, and parameters, and you use the out-of-sample data to test them.
While out-of-sample tests are very good, they are not foolproof. The last stage of an out-of-sample test is the incubation period of many months where you trade it live in a demo account.