Last Updated on 11 September, 2023 by Samuelsson
Traders backtest their strategies to know how well they would have performed in the past. But backtesting is prone to curve fitting and does not tell how the strategy will perform in a live market. Forward testing would have been ideal to test the robustness of a strategy in a live market, but it takes a lot of time to complete. Fortunately, traders found a workaround using walk forward optimization (WFO). But what is WFO?
Walk forward optimization, also known as walk forward analysis, is a method for testing the robustness of a trading strategy by finding its optimal trading parameters in a certain period known as the in-sample or training data and then applying those parameters in the following period referred to as the out-of-sample or testing data to know how they perform.
To make it easier for you to understand, we will discuss the topic under the following subheadings:
- What is walk forward analysis?
- Understanding walk forward optimization
- Types of walk forward
- What is the difference between In-Sample and Out-of-Sample data?
- What is an objective function?
- What should the Size of my In-Sample and Out-of-Sample data be?
- How to run a walk forward optimization
- Walk forward optimization tools
- Walk forward optimization vs. robustness: does walk forward optimization eliminate overfitting?
What is walk forward analysis?
Walk forward analysis (WFA), which is sometimes referred to as walk forward optimization, is a method for testing the robustness of a trading strategy by finding its optimal trading parameters using data from a certain period known as the in-sample or training data and subsequently checking the performance of those parameters in the following period, which is usually referred to as the out-of-sample or testing data.
Thus, walk forward optimization (WFO) is basically in-sample and out-of-sample testing taken to the next level. It is a method of simulating how a trading strategy might perform in real-time by testing the strategy’s optimized input parameters on a portion of the chart data (in-sample data) and then comparing the optimized performance with the remaining un-optimized data (out-of-sample data).
WFO is designed to help traders verify their backtesting results in a short time, without having to wait for the market to create enough new live data into the future. It would be disastrous to rely only on backtesting results to determine whether a trading strategy would be profitable in a real market environment because backtesting is prone to curve fitting and historical performance is not indicative of future results. You can have a great backtesting result, but when the strategy is implemented in a live market, the performance falls apart.
Of course, backtesting can provide you with valuable information, but it is only one part of the evaluation process — it only tells you that the strategy worked in the past and might be promising. You then need out-of-sample testing (walk forward analysis) or forward performance testing to confirm the strategy’s effectiveness in a real market setting before putting your money on the line. To determine the viability and robustness of a trading strategy, there must be a good correlation between backtesting and out-of-sample results or forward testing results.
As a popular form of robustness testing, the out-of-sample testing concept has been modified into many variations, but the WFA seems to give the most thorough analysis. Walk forward optimization works by splitting the data into the training portion and many validation portions and then walking through by optimizing for the best values on the training portion and applying them to the validation portions. The validation portions are then fit together to form what we may theoretically call an out-of-sample equity curve.
Understanding walk forward optimization: how is it different from forward testing, and why do you need it?
If you have been creating trading strategies and have seen how some backtested strategies perform poorly in live trading, you would understand that historical performance is not indicative of future results. More often than not, an excellent backtesting result indicates excessive curve fitting. Such a strategy cannot be robust enough to perform well in a live market. For this reason, a trader has to check for the robustness of his strategy by either doing a forward test or a walk forward analysis.
What is a forward test?
Also known as paper trading, a forward test is a simulation of actual trading. It involves implementing the trading strategy/system in a live market, so it provides traders with a set of out-of-sample data with which to evaluate the trading strategy/system.
Among forex and CFD traders, a forward test is known as demo trading because it is done with a demo account that is loaded with virtual money. Even in stock trading, some online stockbrokers now offer simulated trading accounts where virtual trading can be carried out as though it’s done in the real market.
Simulated trading uses live market data and offers a pseudo-realistic environment in which to practice trading and test the robustness of a system. Apart from the issue of cherry-picking trades in the case of manual trading, one issue with a forward test is that the data comes from the live market, so depending on the timeframe the trading system trades on, it may take a long time to collect enough data for a statistically significant test result.
How is walk forward analysis different from forward testing?
So, it’s clear that we can test the robustness of a trading system using a forward test or a WFA. Now, you may want to know how a WFA is different from a forward test.
Well, as we stated earlier, the WFA is a method of simulating the way a trading strategy might perform in real-time by testing a strategy’s optimized input parameters on a portion of the chart data (in-sample) and then comparing the optimized performance with the remaining un-optimized (out-of-sample) data.
So, the key difference between the WFA and the forward test is that WFA makes use of already available data, which it splits into in-sample and out-of-sample portions, while the forward test depends on live market data which comes as time passes.
Why run a walk forward optimization?
You have heard this investment disclaimer — historical performance is not indicative of future results — over and over again. But beyond the mere fact that past performance doesn’t tell what can happen in the future, backtesting results are often at the mercy of curve fitting, which makes the strategy fall apart in live market conditions.
Thus, you have to run a walk forward optimization on your trading strategy/system to test for robustness and reduce the effects of overfitting in your backtests. In trading, curve fitting is the process of adapting a trading system to perfectly suit the historical data so much so that it becomes ineffective in a different market condition. In other words, overfitting adapts your strategy to the noise in the historical data instead of identifying only the signals.
With a walk forward optimization, you are forced to verify that your strategy parameters are adapted to pick only the signals in the historical data. It does this by constantly testing your optimized parameters in out-of-sample (validation) data until you arrive at parameters that pick only the signals.
If you optimize the parameters of your strategy on the entire set of the historical data and proceed to trade on the backtesting-optimized parameters in a live market with a real account, you would be inviting a catastrophe.
In essence, performing a walk forward optimization helps you to answer the following important questions about your trading strategy:
- Will my optimized trading strategy perform well in a live market?
- At what rate can I expect the strategy to make money on live data?
- How will changes in trend, volatility, and liquidity affect the performance in the future?
- How frequently should I re-optimize my trading strategy?
By answering those questions, WFO helps you confirm the forward-trading ability of your strategy — that is, whether your strategy has a life after optimization and is likely to continue performing well in a live market. In other words, WFO identifies whether a strategy has been overfitted or improperly optimized.
Furthermore, WFO can reliably measure the rate of post-optimization profit and risk. With a WFO assessment, you can create a statistical profile of multiple in-sample optimizations and out-of-sample trading periods. Of course, it uses a much larger sample size than testing a single period of live data, so it offers greater statistical validity.
Additionally, a WFO makes it possible to precisely compare and measure the rates of out-of-sample versus in-sample trading profit. If a strategy is robust, its future performance should be at levels similar to those achieved during optimization. The WFO also gives a clue about the impact of trend, volatility, and liquidity changes on the performance of the strategy. Although those factors can have a very negative impact on trading performance, your strategy should be able to perform profitably despite market changes; otherwise, it’s not robust. Moreover, with a WFO, you should be able to work out how often your strategy should be re-optimized for optimal performance.
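To make the out-of-sample versus in-sample comparison concrete, here is a minimal sketch. It assumes you already know the net profit and the length in days of an in-sample optimization period and of the out-of-sample period that followed it. The function names and the sample figures are hypothetical, chosen only for illustration. The ratio it computes is sometimes called walk forward efficiency.

```python
def annualized_profit(net_profit: float, n_days: int, days_per_year: int = 365) -> float:
    """Scale a period's net profit to a yearly rate so periods of different length compare fairly."""
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    return net_profit * days_per_year / n_days


def oos_to_is_ratio(is_profit: float, is_days: int, oos_profit: float, oos_days: int) -> float:
    """Out-of-sample profit rate divided by in-sample profit rate."""
    is_rate = annualized_profit(is_profit, is_days)
    if is_rate <= 0:
        raise ValueError("in-sample profit must be positive for the ratio to be meaningful")
    return annualized_profit(oos_profit, oos_days) / is_rate


# Hypothetical figures: 3 years in-sample, 1 year out-of-sample.
ratio = oos_to_is_ratio(is_profit=30_000, is_days=1095, oos_profit=6_000, oos_days=365)
print(f"Out-of-sample profit rate is {ratio:.0%} of the in-sample rate")
```

A ratio close to 1 suggests the strategy keeps most of its optimized performance on unseen data, while a ratio near zero or below hints at overfitting. What counts as acceptable is a judgment call and is not fixed by this sketch.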
Why walk forward optimization instead of forward testing
Now, you know why it is necessary to validate the robustness of your trading strategy and all the benefits that come with that, but why choose WFO over forward testing? The answer is simple:
Forward testing is done on a live market, so it depends on future market data. Depending on the timeframe your trading strategy is based on, it might take months, years, or even decades to collect enough data for a statistically significant test result. So, you may be forced to use a test result that does not have statistical validity.
A WFO, on the other hand, is performed on historical data that appears new to the strategy because it is excluded from the backtesting sample. So, all the data needed for the analysis are already available, and you can run the analysis in a short time — a few hours or so. Moreover, you have enough data to get a statistically valid result.
In essence, a WFO offers the freshness of data, as forward testing does, but at the same time and more importantly, saves you a lot of time. You can complete a WFO and launch the strategy the same day.
How a walk forward optimization works
From our discussion so far, you can infer that the walk forward analysis involves a process of splitting your data into two or more portions and using one for backtesting optimization while keeping the other for parameter validation after testing. In other words, the backtesting is carried out on only a portion of the data, not on all of it.
The portion of the data to be used for backtesting is called the in-sample data, while the other portion that will be used for validation is called the out-of-sample data. So, the out-of-sample data still appear “new” to the backtested strategy as if it is being run on a live market — only this time in the past instead of real-time. The rationale is that any strategy that was curve fit would fall apart once it is tested on the “new” data.
Here’s how a walk forward optimization works:
- Splitting the historical data into several in-sample and out-of-sample portions
- Optimizing and backtesting the strategy on the in-sample data
- Applying the optimized parameters to the out-of-sample data to see how they perform
- Repeating this test for each window of data
- Stitching the out-of-sample results together to form what would theoretically pass as an out-of-sample equity curve
This process creates a more realistic assessment of the strategy’s performance because you choose which parameters to use before validation, just as you would have to in live trading. Picking the best parameters with hindsight is always easier than picking them in advance, which is what a live market demands.
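The steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: the toy momentum strategy, the candidate lookbacks, the synthetic return series, and all function names are hypothetical, and the objective used to pick the best parameter is simply total in-sample return.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical strategy signature: (all returns, parameter, start, end) -> per-bar strategy returns
Strategy = Callable[[Sequence[float], int, int, int], List[float]]


def momentum_returns(returns: Sequence[float], lookback: int, start: int, end: int) -> List[float]:
    """Toy strategy: long or short according to the sign of the previous `lookback` bars' total return."""
    out = []
    for t in range(start, end):
        if t < lookback:  # not enough history yet: stay flat
            out.append(0.0)
            continue
        signal = sum(returns[t - lookback:t])  # uses only bars strictly before t
        position = 1.0 if signal > 0 else -1.0 if signal < 0 else 0.0
        out.append(position * returns[t])
    return out


def walk_forward(returns: Sequence[float], candidates: Sequence[int], strategy: Strategy,
                 in_len: int, out_len: int) -> Tuple[List[float], List[int]]:
    """Rolling walk forward: optimize on each in-sample window, then trade the next out-of-sample window."""
    oos_returns: List[float] = []
    chosen: List[int] = []
    start = 0
    while start + in_len + out_len <= len(returns):
        split = start + in_len
        # Optimize: keep the candidate with the highest total in-sample return.
        best = max(candidates, key=lambda p: sum(strategy(returns, p, start, split)))
        chosen.append(best)
        # Validate: apply that parameter, unchanged, to the following out-of-sample window.
        oos_returns.extend(strategy(returns, best, split, split + out_len))
        start += out_len  # roll both windows forward
    return oos_returns, chosen


def equity_curve(returns: Sequence[float], initial: float = 1.0) -> List[float]:
    """Compound per-bar returns into an equity curve."""
    equity, curve = initial, []
    for r in returns:
        equity *= 1.0 + r
        curve.append(equity)
    return curve


# Synthetic data for illustration only: 120 bars, 40 in-sample, 20 out-of-sample per step.
sample_returns = [0.01, -0.005] * 60
oos, params = walk_forward(sample_returns, [2, 5, 10], momentum_returns, in_len=40, out_len=20)
curve = equity_curve(oos)
print("parameters chosen per window:", params)
print("bars in stitched out-of-sample curve:", len(curve))
```

The key design point is that the parameter used in each out-of-sample window is fixed before that window is seen. Only the out-of-sample returns are stitched into the final curve.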
What is an objective function in WFA?
In walk forward analysis, an objective function is the performance measure that the optimization tries to maximize (or minimize). As you know, the essence of the optimization process is to find the parameter values that give the best result, so we vary certain key parameters of the strategy on the in-sample data and test the outcome on the out-of-sample data. The objective function is the yardstick that defines what “best” means when we compare those parameter values.
For example, our objective function could be the reward/risk ratio. In this case, we tweak the strategy’s parameters during the in-sample training to find the values that give the highest reward/risk ratio and then validate them with the out-of-sample data.
Most of the time, traders use an objective function that has to do with reward or risk or both. We call this a risk-adjusted variable. Of course, it wouldn’t make sense to make $800 when your risk is $1,600. It is much preferable to risk $150 and make maybe $450 or $600, so you need to adjust your system to reward you a multiple of what was risked.
Other risk-adjusted objective functions you may consider in your system include:
- Returns per average drawdowns
- Returns per maximum drawdowns
- Sharpe Ratio — given as excess returns per standard deviation of the returns
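As a rough sketch, two of these risk-adjusted objective functions could be computed from a list of per-period strategy returns as shown below. The function names, the zero risk-free rate, and the 252 trading periods per year are assumptions made for illustration. Drawdown is measured here as the fractional drop from the running equity peak, which is one common definition among several.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def sharpe_ratio(returns: Sequence[float], risk_free: float = 0.0, periods_per_year: int = 252) -> float:
    """Annualized excess return per unit of standard deviation of returns."""
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)  # sample standard deviation; needs at least two returns
    if sd == 0:
        return 0.0
    return mean(excess) / sd * math.sqrt(periods_per_year)


def drawdowns(returns: Sequence[float]) -> List[float]:
    """Fractional drop from the running equity peak after each period."""
    equity, peak, out = 1.0, 1.0, []
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        out.append(1.0 - equity / peak)
    return out


def return_over_max_drawdown(returns: Sequence[float]) -> float:
    """Total compounded return divided by the maximum drawdown."""
    total_return = math.prod(1.0 + r for r in returns) - 1.0
    max_dd = max(drawdowns(returns))
    return total_return / max_dd if max_dd > 0 else float("inf")


# Hypothetical return series for illustration.
example = [0.02, -0.01, 0.03, -0.04, 0.05]
print("Sharpe ratio:", round(sharpe_ratio(example), 2))
print("Return / max drawdown:", round(return_over_max_drawdown(example), 2))
```

Either function can serve as the score that ranks parameter sets during in-sample optimization. Replacing the maximum with the mean of the drawdown series would give a return per average drawdown measure.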
In-sample vs. out-of-sample data
We have been talking about in-sample and out-of-sample data, but what are they? The in-sample data is a portion of your historical data that is used for optimizing the parameters of your strategy, while the out-of-sample data is the other unused portion of the historical data that is used to test/validate the optimized parameters.
How to apportion your data before testing
Before you start any backtesting or optimization, you have to set aside a percentage of the historical data that you will use for out-of-sample testing.
One way to do this is to divide the historical data into thirds and reserve one-third for use in the out-of-sample testing. Then, use only the in-sample data for your initial backtesting and optimizing your parameters.
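Here is a minimal sketch of the one-third split. It assumes the data is in chronological order and that the most recent third is the one held back for out-of-sample testing; which third to reserve is a choice this example makes for illustration, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_holdout(data: Sequence, out_fraction: float = 1 / 3) -> Tuple[List, List]:
    """Split chronologically ordered data into (in_sample, out_of_sample) without shuffling."""
    if not 0 < out_fraction < 1:
        raise ValueError("out_fraction must be between 0 and 1")
    n_out = round(len(data) * out_fraction)
    cut = len(data) - n_out
    return list(data[:cut]), list(data[cut:])


# Hypothetical example: 12 monthly observations labelled 0..11.
in_sample, out_of_sample = split_holdout(range(12))
print("in-sample:", in_sample)
print("out-of-sample:", out_of_sample)
```

The data is never shuffled, because mixing later observations into the in-sample portion would leak future information into the optimization.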
What should the Size of my In-Sample and Out-of-Sample data be?
The golden rule is to choose an in-sample data size that is large enough to predict certain behavior in the out-of-sample testing but not so large that it incorporates too many false signals. If the in-sample data is too small, it will not cover enough signals to make statistically valid assessments, but if it is too large, it will contain too many false signals. Most of the time, the right size depends on your trading strategy and timeframe.
Let’s illustrate with two examples:
- Small in-sample data size and large out-of-sample data: Let’s assume you want to test your belief that a stock’s behavior in the few days following its quarterly earnings typically affects how it behaves for the rest of the coming quarter. In this case, your in-sample data is a few days, while your out-of-sample data is 3 months.
- Large in-sample data and small out-of-sample data: Assume you have a strategy based on the idea that similar bond futures move the same way, so that when they diverge, you can short the expensive ones and go long on the cheap ones. In this case, a large in-sample dataset should be used to get a sense of the bond movement and model the behavior. Then, a smaller out-of-sample dataset can be used for validation because you would be regularly adapting your parameters to the recent behavior of the bonds.
Types of walk forward optimization
There are two types of walk forward optimization:
- Rolling walk forward optimization
- Anchored walk forward optimization
Rolling walk forward analysis
This type is the default method of walk forward optimization. As you can see in the picture below, the starting point of each subsequent segment begins a certain number of windows away from the starting point of the previous segment.
In other words, the starting point of each segment steps forward and is never anchored to the primary starting point. Because of the way each segment rolls into the other, this type of walk forward optimization is called the rolling type.
Anchored walk forward
This type is called “anchored” walk forward optimization because the starting point of all segments is the same as the starting point of the first segment. The starting points do not step forward but are rather anchored to the primary starting point.
In effect, the in-sample (IS) portion of each subsequent segment uses all the previous windows and is, therefore, longer than the IS portion of the preceding segment. For example, if the out-of-sample window is 1 year and the in-sample for the first window is 3 years, then for the second it becomes 3+1 = 4 years, and for the third it is 4+1 = 5 years.
As such, the size of the in-sample grows bigger as the anchored walk forward progresses through the data windows, and the total length of each subsequent segment also becomes longer as the analysis progresses.
Apart from this, the rest of the anchored walk forward optimization process is the same as rolling walk forward optimization.
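The difference between the two types comes down to where each in-sample window starts, which the following sketch illustrates. Windows are expressed as half-open index ranges over equally sized periods (for example, years). The function name and the 10-period example are assumptions for illustration.

```python
from typing import List, Tuple

Window = Tuple[Tuple[int, int], Tuple[int, int]]  # ((is_start, is_end), (oos_start, oos_end))


def walk_forward_windows(n_periods: int, in_len: int, out_len: int, anchored: bool = False) -> List[Window]:
    """Generate in-sample / out-of-sample index ranges for rolling or anchored walk forward."""
    windows: List[Window] = []
    oos_start = in_len
    while oos_start + out_len <= n_periods:
        # Anchored: in-sample always starts at period 0. Rolling: it steps forward with the window.
        is_start = 0 if anchored else oos_start - in_len
        windows.append(((is_start, oos_start), (oos_start, oos_start + out_len)))
        oos_start += out_len
    return windows


rolling = walk_forward_windows(10, in_len=3, out_len=1)
anchored = walk_forward_windows(10, in_len=3, out_len=1, anchored=True)

for (r_is, r_oos), (a_is, _) in zip(rolling, anchored):
    print(f"OOS {r_oos}: rolling IS {r_is}, anchored IS {a_is}")
```

In the rolling case every in-sample window is 3 periods long, while in the anchored case the in-sample length grows to 3, 4, 5 and so on, matching the example above. The out-of-sample windows are identical in both.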
Walk forward optimization tools
Some trading platforms, such as TradeStation, have special tools for walk forward optimization. For example, you can use TradeStation Walk-Forward Optimizer to perform detailed walk-forward analysis. The tool performs the walk-forward analysis on the results generated by the TradeStation Strategy Optimization process.
Here is how it works: When optimizing a strategy in TradeStation, the test results are stored in a file that will be used by the Walk-Forward Optimizer. The optimized test results file is then opened in the optimizer tool where the WFA is performed. Also, the tool enhances the walk-forward process by performing a cluster analysis, which is a number of incremental walk-forward scenario tests using varying sections of in- and out-of-sample data to help gauge the stability of the tested input parameters. These test results are then displayed in a number of reports, tables, and graphs.
The optimizer also lets you customize the WFA factors for maximum flexibility in testing and analysis so that you can easily ascertain the robustness of your trading strategy. You can also perform a sensitivity analysis, which enables you to study how individual parameters impact the performance of your strategy.
When combined with TradeStation’s Genetic optimization feature, you can effectively test large numbers of input combinations in much less time than with a brute force (all combinations) approach.
Step-by-step method of doing a walk forward optimization
These are the steps to take when performing a rolling WFO as in the picture above.
Step 1: Gather all the relevant data: To run a WFO, you will need at least the price data of the financial product you want to trade. If your strategy requires any other data, such as volume, you will need to get that too. You need enough data for your analysis, which can mean years or decades depending on your timeframe.
Step 2: Split the data into multiple portions: Let’s assume that we are running our walk forward optimization from 1998 to 2008. You may decide to break your data into 10 portions (years) as seen in the chart above. Reserve the portions that will be used for out-of-sample validation.
Step 3: Optimize your strategy parameters: Let’s say you have chosen your in-sample data size to cover three years. You would run an optimization process on the first three windows of data (1998 to 2001) to find the best parameters. These will be your optimized parameters.
Step 4: Validate those parameters on the next window of data (2001-2002): Assuming you have chosen your out-of-sample data size to cover one year, the 2001-2002 window will be your first window of validation data. You apply the optimized parameters and check the performance to know whether they have any predictive value in the out-of-sample period.
Step 5: Run an optimization to find the best parameters on the next in-sample data (1999-2002): Here, you repeat step 3 on your second in-sample data to find the best parameters. Your second in-sample data is the three years from 1999 to 2002.
Step 6: Validate those parameters on the next out-of-sample data (2002-2003): You simply repeat step 4 on your second out-of-sample data window (2002-2003).
Step 7: Repeat this process until you’ve covered all the portions of the data: Your last in-sample data would cover 2004-2007, while the last out-of-sample data would be the 2007-2008 window.
Step 8: Collate the results of all the out-of-sample data testing: Collate the results of your strategy in all the out-of-sample data and put them together to get an out-of-sample equity curve.
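The window schedule from the steps above (1998 to 2008, three years in-sample, one year out-of-sample) can be generated and checked with a short sketch. The function names are hypothetical, and the per-window out-of-sample returns passed to `stitch` are made-up numbers used only to show how step 8 joins the results.

```python
from typing import Dict, List, Sequence, Tuple


def rolling_schedule(first_year: int, last_year: int, in_years: int = 3,
                     out_years: int = 1) -> List[Dict[str, Tuple[int, int]]]:
    """List the (start, end) years of each in-sample and out-of-sample window."""
    schedule = []
    is_start = first_year
    while is_start + in_years + out_years <= last_year:
        is_end = is_start + in_years
        schedule.append({"in_sample": (is_start, is_end),
                         "out_of_sample": (is_end, is_end + out_years)})
        is_start += out_years  # roll forward by one out-of-sample window
    return schedule


def stitch(oos_results: Sequence[Sequence[float]], initial: float = 1.0) -> List[float]:
    """Join per-window out-of-sample returns in order and compound them into one equity curve."""
    equity, curve = initial, []
    for window_returns in oos_results:
        for r in window_returns:
            equity *= 1.0 + r
            curve.append(equity)
    return curve


schedule = rolling_schedule(1998, 2008)
for step in schedule:
    print("optimize on", step["in_sample"], "then validate on", step["out_of_sample"])

# Made-up yearly out-of-sample returns, one per window, for illustration only.
curve = stitch([[0.08], [-0.03], [0.05], [0.02], [-0.01], [0.04], [0.06]])
print("points on the out-of-sample equity curve:", len(curve))
```

The schedule contains seven optimize-and-validate cycles, and the last one uses 2004-2007 as in-sample and 2007-2008 as out-of-sample, as described in step 7.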
Walk forward optimization vs. robustness: Does walk forward optimization eliminate overfitting?
Well, a WFO does not entirely remove curve fitting, but it reduces it to a great extent. One of the reasons why a WFO does not completely remove overfitting is the look-ahead bias. What this bias means is that when we observe a market inefficiency that occurred in the past, we are already biased towards it. So we are simply fitting a system to capture such inefficiency.
For example, if we see that Tesla stock is trending upward, we create a trend-following system to trade it and run a walk forward optimization that supports the strategy. So, no matter how great our walk forward optimization results are, they are simply telling us that a trending strategy worked in the past when there was a strong trend in the stock.
Another issue with WFO is p-hacking or data snooping, which means running test after test just to find parameters that perform well, without having any specific hypothesis. P-hacking ultimately leads to fitting a strategy to market noise.
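A small simulation can illustrate why data snooping fits noise. In this sketch, every "strategy" is pure random noise with zero expected return, and the one with the best in-sample average is selected, just as a hypothesis-free parameter search would do. The function name, the number of strategies, the volatility, and the seed are all arbitrary assumptions.

```python
import random
from statistics import mean
from typing import Tuple


def best_of_random_strategies(n_strategies: int = 200, n_days: int = 250,
                              seed: int = 0) -> Tuple[float, float]:
    """Pick the random strategy with the best in-sample mean; return its (in-sample, out-of-sample) means."""
    rng = random.Random(seed)
    best_is, best_oos = float("-inf"), 0.0
    for _ in range(n_strategies):
        # Both periods are zero-mean noise: no strategy has any real edge.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        is_mean = mean(in_sample)
        if is_mean > best_is:
            best_is, best_oos = is_mean, mean(out_sample)
    return best_is, best_oos


is_mean, oos_mean = best_of_random_strategies()
print(f"best in-sample mean daily return: {is_mean:.5f}")
print(f"same strategy out-of-sample:      {oos_mean:.5f}")
```

Because the winner was chosen for its in-sample luck, its in-sample average looks attractive, while its out-of-sample average typically falls back toward zero. The more variations are tried, the better the best in-sample result looks, even though nothing real has been found.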
Out-Of-Sample – What does it mean?