Using Twitter to Predict the Markets

Guest article written for AllAboutAlpha.com – the official publication of the Chartered Alternative Investment Analyst (CAIA) Association

originally posted at: http://allaboutalpha.com/blog/2011/10/20/alpha-hunter-using-twitter-to-predict-the-markets/

Our understanding of financial markets has shifted over the last few decades. Professor Johan Bollen (Indiana University School of Informatics and Computing) states, “I think behavioural economics has now become an accepted part of the thinking on markets. It’s generally accepted that people’s decision making is heavily influenced by emotional state and various other behavioural biases, in contrast to previous models that assumed rational decision making in the markets. The markets and individual investors are driven by what you could call ‘irrational considerations’, and emotions play an important role in that.”

In aggregate, markets are an example of an ‘emergent process’, which can be defined as “...global patterns of behavior by agents in a complex system interacting according to their own local rules of behavior, without intending the global patterns of behavior that come about. In emergence, global patterns cannot be predicted from the local rules of behavior that produce them. To put it another way, global patterns cannot be reduced to individual behavior...” (Stacey 1996:287). In an economic context, we see agents (all the participants in the market) acting according to their own rules (their own portfolio, strategy, purpose, etc.) and, in aggregate, this can create unexpected behaviours in the overall global market. Professor Bollen and his team, understanding that the market operates in this way, have published research which shows that systems such as Twitter, which aggregate human emotion in a unique fashion, can have predictive properties concerning (amongst other things) financial markets. In this interview, we talk to Professor Bollen about behavioural computational finance.

Johan Bollen is an associate professor at the Indiana University School of Informatics and Computing. He was formerly a staff scientist at the Los Alamos National Laboratory from 2005 to 2009, and an Assistant Professor at the Department of Computer Science of Old Dominion University from 2002 to 2005. He obtained his PhD in Experimental Psychology from the University of Brussels in 2001 on the subject of cognitive models of human hypertext navigation.

Q: Why did you choose Twitter?

[Prof. Bollen] This was a bit of a marriage of convenience. Twitter data is more ‘available’ than Facebook and other sources, but there are other fundamental reasons. Twitter as a service is very oriented towards people providing short, succinct and relatively immediate ‘in the moment’ updates on what they’re interested in and how they feel, and at a very large scale. At the moment we have around 250 million Twitter users! We felt it was the ideal environment for us to test our hypothesis that we could, in fact, gauge the public’s mood state from this type of data, and use that to study and even predict socioeconomic phenomena, of which the market is just one.

Q: How do you determine the ‘mood’ of the market from Twitter? Is there a benchmark against which you determine if something is unusual? And how do you determine the word choices?

[Prof. Bollen] The fact is, we don’t really have a ‘ground truth’. There’s no alternate measurement that tells us how the public is feeling at any given moment in time. There are opinion polls, which people have been doing for years, but there have been very few ‘benchmarks’ available for this kind of public mood state and online sentiment. It was, therefore, very difficult for us to validate our results. This was one of the reasons we decided to correlate against movements in the market. Originally we were very interested in determining whether fluctuations of the public’s mood state could be gauged from online social media feeds. We had difficulty getting our results published and the reviewers (rightly) criticised us by saying, “you’ve got measurements but you can’t prove their validity and reliability because you’ve got nothing to compare them to…” and that’s where we started to compare the fluctuations against the Dow Jones. What we focus on is measuring fluctuations rather than absolute levels, and then we cross-validate them against a variety of other indicators.

In terms of how we defined and validated our algorithm to determine the public’s mood from Twitter feeds, we based that on an existing psychometric model that has been in use for over 34 years and that has proven its validity in the psychometric analysis of individual mood states. We hoped that would ground our algorithm in existing psychometric practice. I think that approach was largely validated by our results.

It’s important to note that most psychometric models in this field make similar assumptions to the ones we made about how human mood fluctuates along a number of dimensions.

Q: How did you determine the causality of what you were observing?

[Prof. Bollen] In science, one never shows causality. Causality is something philosophers are concerned with, not scientists. I cannot stress this strongly enough: we have not shown a causal link between the public’s mood state, as we measured it from Twitter data feeds, and the market. What we’ve shown is ‘Granger causality’, which means that one time series seems to be correlated with another, but at a non-zero lag. This means that one of the time series must be shifted forward or backward by three or four days for that significant correlation to be observed. That means that one time series may have predictive information with regard to the other.

Why we observe these correlations is undetermined; we simply don’t know! We also don’t know whether there’s any causal mechanism that causes that correlation to happen. My best hypothesis in that regard is that there’s a third variable that causes both public mood states AND the market to fluctuate at different delays. Our measurements of the public mood states and the markets are correlated at different times only because they are subject to the same factor, which acts on one with a greater delay than on the other. In simpler terms, it could be that our Twitter mood measurement is just a faster measurement of another factor that also affects the markets, but more slowly.
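As an illustration only (this is not part of the interview, and not the method used in Professor Bollen's paper), the lagged correlation and the third-variable hypothesis described in this answer can be sketched with a toy simulation. A hidden `driver` series feeds into both a `mood` series and a `market` series at different delays, and the script then scans for the lag at which the two are most strongly correlated. The names `driver`, `MOOD_DELAY` and `MARKET_DELAY`, the noise levels and all other numbers are assumptions made up for the example. Note also that a proper Granger test goes further than this: it uses regression to check whether past values of one series improve predictions of the other beyond what that series' own past already explains.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


random.seed(0)  # fixed seed so the toy run is repeatable

# Hypothetical setup: all values below are invented for illustration.
DAYS = 500
MOOD_DELAY = 1    # days before the hidden factor shows up in mood
MARKET_DELAY = 4  # days before the same factor shows up in the market

# The unobserved third variable that drives both series.
driver = [random.gauss(0, 1) for _ in range(DAYS)]

# Each observed series is the driver, delayed, plus independent noise.
# Mood never feeds into market anywhere in this code.
days = range(MARKET_DELAY, DAYS)
mood = [driver[t - MOOD_DELAY] + random.gauss(0, 0.5) for t in days]
market = [driver[t - MARKET_DELAY] + random.gauss(0, 0.5) for t in days]

# Scan lags of 0 to 6 days and find where mood best lines up with market.
scores = {lag: lagged_correlation(mood, market, lag) for lag in range(7)}
best_lag = max(scores, key=scores.get)

for lag, r in scores.items():
    print(f"lag {lag} days: r = {r:+.2f}")
print("strongest correlation at lag", best_lag)
```

In this toy setup the strongest correlation should appear at a lag of MARKET_DELAY - MOOD_DELAY = 3 days, even though mood has no effect on the market in the simulation. That is the sense in which a lagged correlation can carry predictive information without establishing a causal link.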

Q: Do you think analysing sentiment in this way reveals anything ‘new’ about the psychology of markets, or is it analogous to other quantitative analysis?

[Prof. Bollen] The measurement that we have is not derived from market or economic data. It is derived from the Twitterverse, from all these individual users acting as social sensors. When I have a bad day, that has nothing to do with the market! But how I respond to that bad day may be a reflection of a general level of discomfort about how the economy is doing and so forth. All these people’s judgements, not just about the economy but about the news, weather, and so on, all of the social responses are aggregated and provide a measurement of public mood state that does not necessarily tie into the markets. It’s an out-of-bounds signal. In that sense, it’s pretty unique!

Q: What opportunities does this research provide investors?

[Prof. Bollen] We do show that there is a predictive effect which is quite consistent over time! You would naturally think that could be exploited by people making investment decisions. In our paper we report an 86% accuracy in predicting the up and down movements in the Dow Jones three or four days out. The question is how you turn that into a money-making strategy. It could be, for example, that you lose ALL your money in that other 14%! This is something we have been thinking about quite a bit. We don’t have a ‘ready-made’ strategy, but it’s obvious that if you have this kind of predictive information it could be leveraged. If, for example, I tell you that the public on any given day was ‘hostile’, how you turn that into an investment decision is not a trivial question at all.
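The point about the “other 14%” can be made concrete with a little arithmetic. The sketch below is an illustration, not a strategy from the paper: it takes the 86% directional accuracy quoted above and combines it with hypothetical average gains and losses per trade (`AVG_GAIN` and the loss sizes are invented numbers) to show that a high hit rate alone does not guarantee a profit.

```python
def expected_return(hit_rate, avg_gain, avg_loss):
    """Expected return per trade, in percent.

    hit_rate: fraction of trades where the direction call is right.
    avg_gain: average percent gained on a correct call.
    avg_loss: average percent lost on a wrong call (positive number).
    """
    return hit_rate * avg_gain - (1 - hit_rate) * avg_loss


def breakeven_loss(hit_rate, avg_gain):
    """Average loss on wrong calls at which the expected return is zero."""
    return hit_rate * avg_gain / (1 - hit_rate)


HIT_RATE = 0.86  # directional accuracy quoted in the interview
AVG_GAIN = 0.5   # hypothetical: average % gain when the call is right

# Hypothetical loss sizes on the 14% of calls that are wrong.
for avg_loss in (0.5, 3.0, 5.0):
    ev = expected_return(HIT_RATE, AVG_GAIN, avg_loss)
    print(f"avg loss {avg_loss:.1f}% -> expected return {ev:+.2f}% per trade")

print(f"break-even average loss: {breakeven_loss(HIT_RATE, AVG_GAIN):.2f}%")
```

Under these assumed numbers, the break-even point is an average loss of roughly 3.07% on wrong calls: if the misses tend to fall on days with larger moves than that, the expected return turns negative despite the 86% hit rate.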

What does this mean for investors and risk managers?

In any field of human endeavour, we see that multi-disciplinary approaches (combining experts from many fields) can yield spectacular results, and economics is no exception. Professor Bollen and his team are combining thinking from physics, psychology, biology, economics and mathematics to create models which more accurately show how the actions of the many can manifest in the ‘aggregate’ behaviour of the market.

For investors, understanding this opens up opportunities to make trades which are fundamentally based on the emotion of the market. For risk managers, understanding how emotions can manifest in market moves (particularly left-tail events) means models can be implemented to show the real human risk in portfolios or instruments.
