Providence, RI---Your financial advisor calls you up to suggest a new investment scheme. Drawing on 20 years of data, he has set his computer to work on this question: If you had invested according to this scheme in the past, which portfolio would have been the best? His computer assembled thousands of such simulated portfolios and calculated for each one an industry-standard measure of return on risk. Out of this gargantuan calculation, your advisor has chosen the optimal portfolio. After briefly reminding you of the oft-repeated slogan that "past performance is not an indicator of future results", the advisor enthusiastically recommends the portfolio, noting that it is based on sound mathematical methods. Should you invest?
The somewhat surprising answer is: probably not. Examining a huge number of sample past portfolios---known as "backtesting"---might seem like a good way to zero in on the best future portfolio. But if the number of portfolios in the backtest is so large as to be out of balance with the number of years of data in the backtest, the portfolios that look best are actually just those that target extremes in the dataset. When an investment strategy "overfits" a backtest in this way, the strategy is not capitalizing on any general financial structure but is simply highlighting vagaries in the data.
The perils of backtest overfitting are dissected in the article "Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance", which will appear in the May 2014 issue of the NOTICES OF THE AMERICAN MATHEMATICAL SOCIETY. The authors are David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, and Qiji Jim Zhu.
"Recent computational advances allow investment managers tomethodically search through thousands or even millions of potentialoptions for a profitable investment strategy," the authors write. "Inmany instances, that search involves a pseudo-mathematical argumentwhich is spuriously validated through a backtest."
Unfortunately, the overfitting of backtests is commonplace not only in the offerings of financial advisors but also in research papers in mathematical finance. One way to lessen the problems of backtest overfitting is to test how well the investment strategy performs on data outside of the original dataset on which the strategy is based; this is called "out-of-sample" testing. However, few investment companies and researchers do out-of-sample testing.
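To make the idea of out-of-sample testing concrete, here is a minimal sketch in Python. The function name, the simulated returns, and the 30 percent holdout are illustrative assumptions, not the authors' procedure: a strategy is selected using only the early portion of the data, and its performance is then measured on a later holdout period that played no role in the selection.

    import numpy as np

    def in_and_out_of_sample_means(candidate_returns, holdout_fraction=0.3):
        """Select the best candidate on early data only, then score it on the holdout.

        candidate_returns: 2-D array, one row of periodic returns per candidate strategy
        holdout_fraction: share of the most recent data reserved for out-of-sample testing
        """
        candidate_returns = np.asarray(candidate_returns)
        split = int(candidate_returns.shape[1] * (1 - holdout_fraction))
        early, holdout = candidate_returns[:, :split], candidate_returns[:, split:]

        best = np.argmax(early.mean(axis=1))  # selection sees only the early data
        return early[best].mean(), holdout[best].mean()

    # With pure-noise candidates (no real edge), the winner's apparent edge
    # tends to evaporate on the holdout period.
    rng = np.random.default_rng(0)
    in_mean, out_mean = in_and_out_of_sample_means(rng.normal(0.0, 0.01, size=(200, 1000)))
    print(f"in-sample mean return: {in_mean:.5f}   out-of-sample: {out_mean:.5f}")

In a genuine out-of-sample test the holdout would be new market data that the strategy designer never examined, not merely another slice of the same simulated series.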
The design of an investment strategy usually starts with identifying a pattern that one believes will help to predict the future value of a financial variable. The next step is to construct a mathematical model of how that variable could change over time. The number of ways of configuring the model is enormous, and the aim is to identify the model configuration that maximizes the performance of the investment strategy. To do this, practitioners often backtest the model using historical data on the financial variable in question. They also rely on measures such as the "Sharpe ratio", which evaluates the performance of a strategy on the basis of a sample of past returns.
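The Sharpe ratio is, roughly, the average excess return of a strategy divided by the volatility of its returns, usually annualized. As a point of reference, here is one common way to compute it from daily returns; the zero risk-free rate and the 252-day annualization factor are conventional assumptions, and practitioners vary in the exact convention they use.

    import numpy as np

    def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
        """Annualized Sharpe ratio of a series of per-period returns.

        returns: array of per-period (e.g. daily) strategy returns
        risk_free_rate: per-period risk-free return subtracted from each observation
        periods_per_year: annualization factor (252 trading days assumed here)
        """
        excess = np.asarray(returns) - risk_free_rate
        return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

    # Example: five years of simulated daily returns averaging 0.05% with 1% volatility.
    # The expected annualized Sharpe ratio is about 0.8; any single estimate is noisy.
    rng = np.random.default_rng(0)
    daily = rng.normal(0.0005, 0.01, size=252 * 5)
    print(round(sharpe_ratio(daily), 2))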
But if a large number of backtests are performed, one can end up zeroing in on a model configuration that has a misleadingly good Sharpe ratio. As an example, the authors note that, for a model based on 5 years of data, one can be misled by looking at even as few as 45 sample configurations. Within that set of 45 configurations, at least one can be expected to stand out with a good Sharpe ratio on the 5-year dataset while having a dismal Sharpe ratio on out-of-sample data.
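A small simulation makes the point concrete. In the sketch below, which is an illustration under assumed numbers rather than a reproduction of the authors' calculation, all 45 candidate configurations have no genuine edge at all: their daily returns are pure noise over 5 years. Picking the configuration with the best in-sample Sharpe ratio still tends to produce an attractive-looking number, while fresh data drawn from the same edge-free process shows a Sharpe ratio near zero.

    import numpy as np

    rng = np.random.default_rng(1)
    n_configs, n_days = 45, 252 * 5        # 45 trial configurations, 5 years of daily data

    def sharpe(r):
        """Annualized Sharpe ratio of daily returns."""
        return np.sqrt(252) * r.mean() / r.std(ddof=1)

    # Every configuration is pure noise: zero true mean, so the true Sharpe ratio is 0.
    in_sample = rng.normal(0.0, 0.01, size=(n_configs, n_days))
    best = max(range(n_configs), key=lambda i: sharpe(in_sample[i]))

    # Fresh data from the same (edge-free) process stands in for the future.
    out_of_sample = rng.normal(0.0, 0.01, size=n_days)
    print("best in-sample Sharpe:   ", round(sharpe(in_sample[best]), 2))  # typically near 1
    print("its out-of-sample Sharpe:", round(sharpe(out_of_sample), 2))    # scattered around 0

In repeated runs of this kind, the maximum in-sample Sharpe ratio hovers around 1 even though every candidate is worthless, which is the effect the authors quantify.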
The authors note that, when a backtest does not report the number of configurations that were computed in order to identify the selected configuration, it is impossible to assess the risk of overfitting the backtest. And yet, the number of model configurations used in a backtest is very often not revealed---neither in academic papers on finance, nor by companies selling financial products. "[W]e suspect that a large proportion of backtests published in academic journals may be misleading," the authors write. "The situation is not likely to be better among practitioners. In our experience, overfitting is pathological within the financial industry." Later in the article they state: "We strongly suspect that such backtest overfitting is a large part of the reason why so many algorithmic or systematic hedge funds do not live up to the elevated expectations generated by their managers."
Probably many fund managers unwittingly engage in backtest overfitting without understanding what they are doing, and their lack of knowledge leads them to overstate the promise of their offerings. Whether this is fraudulent is not so clear. What is clear is that mathematical scientists can do much to expose these problematic practices---and this is why the authors wrote their article. "[M]athematicians in the twenty-first century have remained disappointingly silent with regard to those in the investment community who, knowingly or not, misuse mathematical techniques such as probability theory, statistics, and stochastic calculus," they write. "Our silence is consent, making us accomplices in these abuses."
Source: American Mathematical Society