Consider the following situation. You receive an email from a fund manager boasting of his uncannily accurate stock-picking system, one that he promises to verify through stock recommendations over the next 10 weeks.
In the first week, he predicts whether a particular stock will rise or fall. He is right, but you are sceptical of what was, after all, a simple 50-50 bet. However, the accurate predictions keep coming: for 10 straight weeks, the fund manager is right every single time.
Given that the odds of being right every time are tiny – less than 1 in 1,000 – you might well decide to invest your money with someone who appears to be a truly skilled stock-picker.
You might think differently if you knew the underlying strategy. The manager picks a stock at random and sends out 100,000 emails: half predict the stock will rise, half that it will fall. In the second week, he writes only to the 50,000 people who received an accurate prediction, again splitting them between "rise" and "fall". He continues to halve his mailing list every week.
After 10 weeks, he will be left with 97 people who will, presumably, be dying to invest with someone able to deliver such a perfect track record.
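The arithmetic checks out. A quick back-of-the-envelope script (using only the figures from the example above) confirms both the odds and the 97 survivors:

```python
# Back-of-the-envelope check of the email scam's arithmetic.
recipients = 100_000

# Odds of calling 10 consecutive 50-50 bets correctly by chance:
p_perfect = 0.5 ** 10
print(f"A perfect 10-week record by luck: 1 in {1 / p_perfect:.0f}")  # 1 in 1024

# The scammer halves his mailing list after each weekly prediction:
for week in range(10):
    recipients //= 2
print(f"Recipients left with a perfect record: {recipients}")  # 97
```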
According to Campbell Harvey, a finance professor at North Carolina-based Duke University, a different version of the well-known email scam is being unwittingly practised by many academics and money managers. The result is that seemingly robust investing strategies may be resting on foundations of sand.
"Most of the empirical research in finance, whether published in academic journals or put into production as an active trading strategy by an investment manager, is likely false," says Campbell R Harvey, co-author of the award-winning Evaluating Trading Strategies paper. "This implies that half the financial products [promising outperformance] that companies are selling to clients are false."
Testing methods
The problem, says Harvey, is that the testing methods used to assess potential trading strategies are not rigorous enough, and random flukes are being mistaken for genuine discoveries.
Traditionally, finance researchers have looked for confidence levels of 95 per cent when testing strategies – that is, results strong enough that they would show up by chance only 5 per cent of the time if the strategy had no real edge. However, computers now allow innumerable tests to be run on mountains of data, and results that appear to be statistically significant may be simply the product of chance.
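A minimal simulation makes the point. In the sketch below, the strategy count, return distribution and random seed are arbitrary illustrative choices, not figures from Harvey's research; the "strategies" are pure noise, yet roughly 5 per cent of them still pass the conventional significance test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
n_strategies = 200  # hypothetical number of strategies back-tested
n_periods = 250     # roughly one trading year of daily returns

# Every "strategy" here is pure noise: the true mean return is zero.
returns = rng.normal(loc=0.0, scale=0.01, size=(n_strategies, n_periods))

# Test each strategy's mean return against zero at 95% confidence.
result = stats.ttest_1samp(returns, popmean=0.0, axis=1)
false_discoveries = (result.pvalue < 0.05).sum()

print(f"{false_discoveries} of {n_strategies} pure-noise strategies "
      f"look 'significant' at the 5% level")
# Expect about 10 (5% of 200) to clear the bar purely by chance.
```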
Harvey offers an analogy. Suppose there is a hypothesis that jelly beans cause acne. Researchers test the proposition by giving jelly beans to one group of people and none to another group, only to find no statistically significant difference between the two groups.
Someone then suggests that the colour of the jelly bean may be a factor. Twenty new tests are conducted, testing for any ill effects caused by differently coloured jelly beans. No effects are found in the first 19 tests.
On the 20th test, it emerges there is a statistically significant correlation between green jelly beans and acne. The next day’s headline reads: “Green jelly beans linked to acne!”
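The arithmetic behind the joke: if the 20 colour tests are independent and each is run at the conventional 5 per cent significance level, the chance that at least one comes up "significant" by luck alone is close to two in three.

```python
# Chance of at least one false positive across 20 independent tests,
# each run at the conventional 5% significance level:
n_tests = 20
alpha = 0.05
print(f"{1 - (1 - alpha) ** n_tests:.0%}")  # about 64%
```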
Investment factors
The jelly bean joke shows that if you run enough tests, something is bound to look like it works by chance. This data-mining is a real problem in the world of finance, where researchers are always on the lookout for investment “factors” that may have outperformed the market in the past.
All kinds of factors have been tested, ranging from the well-known (the tendency for cheap stocks to outperform expensive stocks) to more obscure potential catalysts of stock returns, such as the effect of advertising or media attention on stock prices.
Each year, more and more market-beating “factors” are being discovered; Harvey found 316 factors documented in the academic literature. Most have been “discovered” over the last decade.
The majority are likely to be false, he says. Harvey notes that one recent academic paper examined whether the first letter of a stock’s name influences its stock price; of the 26 letters of the alphabet, one was deemed to be significant.
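That result is exactly what chance predicts: testing 26 letters at the 5 per cent level should, on average, throw up about 1.3 false positives even if letters have no effect at all.

```python
# Expected number of falsely "significant" letters when all 26 are
# tested at the 5% level and none has any real effect:
print(26 * 0.05)  # 1.3 -- so one "significant" letter is no surprise
```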
This might not matter if the problem were confined to academia. It isn’t: according to Bank of America Merrill Lynch, the number of investing factors used by quantitative investors hit an all-time high last year. Real money is being wasted on strategies whose past success is due purely to chance.
Researchers need to copy the scientific community and become more rigorous in their testing, says Harvey. He refers to the Higgs boson, which was first theorised as far back as 1964. It could only be considered proven, scientists agreed, when the odds of a false discovery were 3.5 million to one. Multiple tests costing some $5 billion were conducted over the following decades, with the Higgs boson only declared officially “discovered” in 2012.
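That 3.5-million-to-one figure is the physicists' "five sigma" discovery standard. A quick conversion (assuming a one-tailed normal test, the particle-physics convention) shows how far it sits above finance's usual habit:

```python
from scipy import stats

# Convert sigma thresholds to the odds of a false discovery
# (one-tailed normal test, the particle-physics convention):
for sigma in (2, 5):
    p = stats.norm.sf(sigma)
    print(f"{sigma} sigma -> 1 in {1 / p:,.0f}")

# 2 sigma -> 1 in 44 (roughly finance's 95% convention)
# 5 sigma -> 1 in 3,488,556 (about 3.5 million to one)
```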
Suspicious
For investors, there are implications arising from Harvey’s research. The obvious one is to be suspicious of money managers promising market-beating returns on the basis of an investment factor that may have outperformed in the past.
It is not that such fund managers are “knowingly selling false products”; rather, many are simply relying on “inappropriate” statistical tools that prevent them from distinguishing sound investing strategies from flukes. (Harvey admits to making the same statistical mistakes in some of his own published work.)
Secondly, investors should be wary when it comes to evaluating fund managers in general. With thousands of managers plying their trade, some are bound to randomly outperform.
Beating the market four or five years in a row may seem like a conclusive demonstration of investment skill, but it may well be all down to luck. (Similarly, a skilled manager may well underperform due to bad luck.)
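A simple thought experiment shows why. In the hypothetical simulation below (the 2,000-manager industry and the coin-flip odds are illustrative assumptions, not market data), dozens of skill-free managers still produce five-year winning streaks:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_managers = 2_000  # hypothetical industry of managers with zero skill
n_years = 5

# Each manager has a coin-flip chance of beating the market each year.
beat_market = rng.random((n_managers, n_years)) < 0.5
streaks = beat_market.all(axis=1).sum()

print(f"{streaks} of {n_managers} no-skill managers beat the market "
      f"{n_years} years running")
# Expected value: 2000 / 2**5 = 62.5 -- dozens of streaks from luck alone.
```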
False discoveries
Thirdly, while Harvey’s research indicates that most investing strategies are based on “false” discoveries, some investing factors have been verified by stronger statistical testing. Value investing – essentially, buying a basket of cheap companies – works. So does momentum investing: rising stocks tend to continue outperforming. Note that both value and momentum strategies are underpinned by plausible fundamental or behavioural explanations.
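What does "stronger statistical testing" look like in practice? One textbook approach is to shrink the significance threshold in proportion to the number of tests run. The Bonferroni correction sketched below is a standard, if conservative, fix – Harvey's own papers discuss more refined adjustments – but it conveys the idea:

```python
from scipy import stats

# Bonferroni correction: with 316 factors tested, the 5% hurdle
# per individual factor shrinks accordingly.
n_factors = 316
alpha = 0.05
per_test = alpha / n_factors

# Equivalent two-tailed t-statistic hurdle (normal approximation):
hurdle = stats.norm.isf(per_test / 2)
print(f"p-value hurdle: {per_test:.1e}, t-stat hurdle: {hurdle:.1f}")
# Roughly t > 3.8, versus the t > 2 implied by the usual 95% convention.
```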
One way of assessing whether a strategy’s past success is based on chance is to look for a fundamental or behavioural catalyst. Seasonal investors, for example, might say stocks have performed strongly or poorly in a particular month, but is there some fundamental reason why this might continue to be the case in coming years? Does anyone really believe that a stock’s first letter might provide clues as to its future performance?
Investors, Harvey says, “need to realise that they will find seemingly successful trading strategies by chance”.
If you torture the data long enough, as the old joke goes, you can make it confess to anything.