Flash Boys, Michael Lewis's book on high-frequency trading, provides a fascinating insight into how financial markets have been colonised by algorithms.
Nearly gone are the iconic floor traders with their preppy jackets and manic hand signalling, replaced by virtual markets, some visible only to bankers themselves – the dark pools. Quants are the new brokers, and algorithms the tools of their trade.
Political opinion polling is following a similar trajectory. Psephologists, who rigorously analyse election trends, are the new pollsters, with Nate Silver, founder of FiveThirtyEight, the most exotic and high-profile of this new breed. His success in predicting recent US presidential elections has upped the ante for pollsters, challenging them to enhance the predictive capabilities of their polls by modelling the data in sophisticated ways. Using more intricate modelling, however, is not without risk.
Polling is in the spotlight again, after the failure of all UK polling companies to anticipate a clear Conservative win in the May election. Much of the comment has been typically sensationalist and unfair. That said, it has helped shine a light on how polling is conducted, and the British Polling Council is conducting a welcome review.
The modelling currently applied to opinion polls, in Ireland at least, is relatively uncomplicated and reasonably transparent – more adjustments than algorithms. Modelling is used, for example, to adjust for those voters who say they will vote but will not actually vote.
Layer of adjustment
Adjusting for sample bias is another form of modelling, increasingly a source of potential error as the range of data collection approaches expands. Nearly everyone in Ireland has a front door they will open or a phone they will answer, so adjustments for sample bias tend to be minor. Globally, however, the move is to online polling where significant adjustments need to be made to compensate for the fact that only a fraction of the population are members of online panels. It is possible to compensate for this bias, but it introduces another layer of adjustment and adds to the complexity of the model.
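To illustrate the kind of adjustment involved, here is a minimal post-stratification sketch in Python. All figures are invented for illustration; real polling weights cover several demographics at once.

```python
# Minimal post-stratification sketch: reweight an online sample so its
# age profile matches the population. All figures are invented.

population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_share     = {"18-34": 0.50, "35-54": 0.30, "55+": 0.20}  # online skews young

# Support for a candidate within each age group of the sample (invented)
support = {"18-34": 0.40, "35-54": 0.50, "55+": 0.60}

# The raw (unweighted) estimate simply mirrors the skewed sample
raw = sum(sample_share[g] * support[g] for g in support)

# Weight each group by population share / sample share, then re-estimate
weights = {g: population_share[g] / sample_share[g] for g in support}
adjusted = sum(sample_share[g] * weights[g] * support[g] for g in support)

print(f"unweighted: {raw:.1%}, weighted: {adjusted:.1%}")
```

Every extra weighting dimension is another modelling choice of exactly the kind the article describes: defensible, but a further layer between the raw answers and the published figure.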
And it can get really complicated. Clifford Young, the Ipsos polling expert in the US, uses river sampling (drawing in respondents from numerous online feeds) and Bayesian statistics to build his predictions. And his model works.
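Young's actual model is far more elaborate than anything shown here, but the core Bayesian idea – updating a prior belief about vote share as new poll data arrives – can be sketched with a simple beta-binomial update (all figures invented):

```python
# Toy beta-binomial update: the basic Bayesian mechanism behind
# poll-prediction models (real models are far more elaborate).

# Prior belief about a party's support: Beta(a, b), with mean a / (a + b)
a, b = 40, 60          # prior centred on 40% support

# A new poll arrives: 450 of 1,000 respondents back the party
successes, n = 450, 1000

# Conjugate update: add successes and failures to the prior counts
a_post = a + successes
b_post = b + (n - successes)

posterior_mean = a_post / (a_post + b_post)
print(f"posterior mean support: {posterior_mean:.1%}")
```

The prior pulls the estimate slightly away from the raw poll figure, which is both the strength of the approach and, if the prior is wrong, a source of exactly the analyst error discussed below.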
So modelling can work and more complex modelling can be justified, but when we complicate things we increase the chances of it going horribly wrong. It may already have.
Harry Enten of FiveThirtyEight has written about poll herding in the US – herding is where polls conducted by different organisations tend to converge over a campaign. Interestingly, the same data also suggests that herding can improve accuracy as each poll almost becomes its own poll of polls. Herding may have been a factor in the recent UK election, but this time polls may have stampeded in the wrong direction. How could this have happened?
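The "poll of polls" effect is visible in the arithmetic: sampling error shrinks with the square root of the sample size, so pooling several independent polls behaves like one much larger poll. A sketch, using the standard 95 per cent margin-of-error formula:

```python
import math

# Margin of error for a simple random sample at 95% confidence,
# evaluated at the worst case p = 0.5.
def moe(n, p=0.5, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

single = moe(1000)        # one poll of 1,000
pooled = moe(5 * 1000)    # five such polls pooled together
print(f"single poll: ±{single:.1%}, pooled: ±{pooled:.1%}")
```

This is why convergence can look like accuracy – provided, of course, that the polls being pooled are genuinely independent rather than herding.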
Natural temptation
Simply put, herding can happen when the opinion of one analyst, encapsulated in a model, has as much influence on the result as the opinions of a thousand voters. In polling we understand that polling the opinions of a thousand voters is not the same as polling the opinions of the entire population, hence the need for a margin of error. No allowance, however, is made for analyst error, nor for the very natural temptation for an analyst to lose confidence in a model that produces results standing apart from the herd.
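The margin of error for a thousand voters has a standard formula; for a sample of 1,000 it works out at roughly plus or minus three points at 95 per cent confidence:

```python
import math

# 95% margin of error for a simple random sample, at the worst case p = 0.5
def margin_of_error(n, p=0.5, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(1000)
print(f"n=1000: ±{moe:.1%}")   # roughly ±3.1 points
```

There is no comparable published figure quantifying the analyst's own uncertainty, which is the asymmetry the paragraph above describes.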
Herding may not have been a factor in the recent UK election – we await the report of the British Polling Council – but at the very least it alerts us to the potential for analyst error and highlights the vital importance of transparency when data modelling is used.
It is worth repeating that polling data in Ireland is not extensively modelled and there are not enough published polls to form a strong consensus (and hence pressure to conform), so herding will remain something cattle and sheep do. We can, though, expect further modelling of polling data in line with international trends.
In the case of Irish Times/Ipsos MRBI polls, only demographic weights are used, to align the sample fully with the national population according to known demographics, which is what polling was designed to do. Irish Times/Ipsos MRBI polls do not include a likelihood to vote filter, but the need for such a filter in view of some relatively low turnout levels in recent elections is being explored.
The world around us is unavoidably complex, but it need not be unnecessarily complicated. Transparency can unlock this apparent contradiction, so we will keep to the practice of publishing both unadjusted and adjusted polling data in the event any filtering or modelling is ever used.
Damian Loscher is managing director of Ipsos MRBI