Predicting Box Office Returns using Twitter: Bernardo Huberman from HP Labs

Last week, Technology Voice spoke to Bernardo Huberman, Senior HP Fellow and Director of the Social Computing Lab at HP Labs. For the last two years, he and his colleagues have been focusing on the phenomenon of, in his words, “social attention”. Our world has exploded in terms of the amount of information that we have available to us. But like so many plentiful things, all this information has very little value. As Bernardo puts it, “The only things in life that have value are the things that are scarce.”

One commodity that is very valuable because it is limited is our ability to give attention to the myriad potential distractions we face each day. Only so much time and energy is available to us to focus on anything. It is the competitive vying for our attention that has led to the constant bombardment from providers of goods and ideas with messages such as “look at me, consume me, buy me, watch me.” Spam is a great example.

Social attention is not measured from a perceptual or psychological point of view, as fascinating as that is, but by how certain things draw our attention: the number of downloads of a given video, the number of times an article is read, the number of things people discuss, and so on.

Bernardo says, “One interesting follow up of this was my conjecture that where attention goes usually it portends something and the money follows. So that’s why providers want a lot of attention to their content.

“So one thing we decided to do was to see whether or not we could actually predict something about the future by looking at how attention is allocated to certain things that are about to happen… How do people discuss forthcoming movies that are about to open and the rate which they tweet about them on Twitter? Could we come up with predictions about how well those movies were going to do at the box office?

“Now the reason I chose movies is because that is a very clean-cut example of something that you can measure. If I say to you I will predict certain things will happen and they are diffuse, then it is very hard. But [we can] predict the box office revenue at the end of the day or the end of the weekend… We know a number, that number’s public, how much that movie made.

“So, what we did, basically, we chose about 24 movies that were about to open. We followed the rate at which people were tweeting and we were able to – by calibrating to movies that had already opened and [we] saw the attention they were already getting on Twitter – we were able to come up with a fairly accurate prediction about how much money the movie was going to make.”
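The calibration step Bernardo describes can be pictured as a simple curve fit. The sketch below is only an illustration, not the HP Labs method: it assumes a straight-line relationship between tweet rate and opening-weekend revenue, and the function names and every number in it are made up for the example.

```python
# Minimal sketch: calibrate on movies that have already opened, then
# predict revenue for a forthcoming movie from its tweet rate.
# Assumption: a plain least-squares line; the real model may differ.

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted revenue for a movie with tweet rate x."""
    return slope * x + intercept


# Hypothetical calibration data (NOT from the study):
# tweets per hour before release, opening-weekend revenue in $ millions.
tweet_rates = [120.0, 340.0, 560.0, 900.0]
revenues = [8.0, 21.0, 37.0, 58.0]

slope, intercept = fit_line(tweet_rates, revenues)

# Forecast for a hypothetical forthcoming movie tweeted about 450 times an hour.
forecast = predict(slope, intercept, 450.0)
print(f"Forecast opening weekend: ${forecast:.1f}M")
```

The appeal of box office revenue as a target shows up here: each calibration point pairs an attention measurement with a single public number, so the fit can be checked against reality as soon as the weekend is over.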

Bernardo and his team then compared their findings with the Hollywood Stock Exchange (HSX). This is a site where people can buy stock in actors, movies, directors and other aspects of the movie industry. The HSX has a reputation for being a good predictor of box office revenue. The predictions of the HP Labs team worked so well that patents have been applied for and a business mechanism has been created.

The HP Labs team was also able to improve the quality of its forecast for the following week: once people had seen the movie, their tweets expressed whether they liked or disliked it. With this feedback, the team was able to fine-tune and calibrate its predictive methods.

“It is obvious you can use this for other things. We know that it works for movies, you could do that for new products, new trends, and so on. So if you have a way of tapping a social medium, and that’s our idea, you [can] look at the way that attention is allocated to certain things – sometimes you can predict fairly accurately how certain things will happen.”

This predictive technology has very important implications for political campaigns as well as all sorts of marketing campaigns. The work of Bernardo and his colleagues, combined with the techniques of sentiment analysis, offers a solution to the Provider’s Dilemma.

But there is still important research to be done using these techniques to inquire into other modern-day social phenomena. The public agenda is no longer being set by the ‘legacy media’. The power of old-school editors lay in dictating the news agenda through their choices of what featured on the front pages of their newspapers or what the lead story of the main nightly newscast would be. But that ability to influence public debate is fading. The power to set the terms of public discussion is now dispersing from the few to the many.

Increasingly, the topics that concern us as a society are those that rise up from and through the various social media, but as yet no one has identified the mechanism by which this works.

As Bernardo says, “It’s truly something we don’t understand. How is it out of all these chaotic conversations in Twitter, in Facebook, in blogs, in e-mail, and so on, [does] stuff bubble all the way to the top?”

The answer to that question may have more profound implications than being able to create a more accurately targeted advertising campaign.

2 thoughts on “Predicting Box Office Returns using Twitter: Bernardo Huberman from HP Labs”

  1. Very interesting. One tool to find out more about how things ‘bubble up’ to be noticed would be to utilise data sets from Digg, Reddit and other news aggregators perhaps. Also, is there some kind of ‘tipping point’ or sudden phase-transition where something goes from being non-viral to suddenly viral? And could this be tied in with the theory of networks? Must read the paper now…


  2. Great read. Pretty soon the internet will take all of our blurbs, tweets, and posts to form a hive mind representation of the human race. Sounds crazy but give it a couple years.

