Agent Based Models Offer Better Predictions

Statistical forecasting is a familiar technique where lots of data points are gathered and assessed. An aggregation is formed and the analysis of that aggregated information is used to predict everything from elections to global warming.

This method, as pioneered by W. Edwards Deming, was used as an industrial tool to help create a viable manufacturing industry, virtually from scratch, amid the burnt-out wreckage of post-war Japan. Now, nearly all modern industry uses statistical probability in its manufacturing and engineering processes to great effect, as long as precision is not confused with accuracy.

However, statistical approaches that are ideal for maximizing the performance of machines encounter serious issues when it comes to working with humans. That is because even when we are given a highly restricted number of options from which to choose, it doesn’t follow that the optimal decision gets made. This source of unpredictability is by itself not such a big deal when properly trained and competent people are working in a team. But even within clearly defined parameters the possibility always exists that, even with good intentions and sound thinking, decisions can get made which could have disastrous consequences.

In any system where humans are required to make a decision, options and methodologies can vary from person to person. They can vary because of influences from environmental conditions and also in how the information presents itself.

Technology, through the use of robots, automated systems, and computerization, has at every stage of its growing capability been used to take humans out of the industrial process, but human involvement can never be eliminated.

Even in well-resourced and vast global systems (commercial, political, military and environmental), it is the individual doing the unpredictable thing that can invalidate complex, deeply wrought models of statistical forecasting based on petabytes or more of data.

We know humans are unreliable. We all think it is wise to get a second opinion on important medical matters even if we have faith in the training, ability and competence of the doctor that we are dealing with. So, the question becomes, can we account for our unpredictability in predictable ways?

Michal Kosinski, Operations Director at the Cambridge University Psychometrics Centre, thinks it is possible. In an article soon to be published by the Proceedings of the National Academy of Sciences he suggests that the minor or incomplete digital information that exists about each of us, when collated and aggregated with that of hundreds of thousands of people roughly like us, can serve to make accurate predictions about what we will do next.

The testbed used for his theory was Facebook. Using a sample of 58,000 people and measuring only their use of the Like button, the researchers were able to make predictions of such accuracy that the information, according to them, could be “worthwhile for advertisers.”

[NB: When the PNAS article is published Technology Voice will be covering the findings in greater detail.]

But the application of this work goes beyond selling stuff on a social network site. Being able to make predictions in the very area where predictive power is weakest could save us from all sorts of disasters. Apart from natural disasters, nearly all catastrophes can be traced to human error of some sort.

The classic failure of statistical forecasting as a predictive tool was seen in the recent financial meltdown. Financial Services was, and still is, a massively data-ized industry. Yet the gazillion bits of intensely analyzed data and the resulting highly thought-through prognostications were rendered illusory when the economy disappeared.

The idea of econometrics as a trustworthy and reliable tool for policy-makers evaporated along with it. But this attitude only shows a lack of understanding of the limits on what can be gleaned from an approach that is based purely on statistical forecasting. The failure was not in the data but in how people used and interacted with (or ignored) the data.

However, by focusing on what people are likely to do rather than what information on its own does, we can perhaps model intensely complex systems such as economies in a far more realistic and useful manner.

Approaching the same issue from another angle is the research taking place into what are called agent-based models, which in turn have been developed from complexity theory (the theory that enables companies to move things from A to B around the world but often, and counter-intuitively, not by the shortest route). It looks very promising, and the work being done by projects such as CRISIS – Complexity Research Initiative for Systemic InstabilitieS – is beginning to attract wider attention and is receiving major funding.

The idea of developing agent-based models is to predict what human decision-makers in a given system will do. That agent could be a pilot in a plane, a manager tasked with a project or a customer picking up something from the supermarket on the way home from work. All humans in a system – all bringing their own rationales and temperament to the process.
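
To make the idea concrete, here is a minimal sketch of an agent-based model in Python. It is not the CRISIS project's model or anything taken from the research described here; it is a toy "threshold" model, in the spirit of Granovetter's classic crowd-behaviour example, in which each agent acts only once enough other agents are already acting. The function name and the threshold values are assumptions chosen purely for illustration.

```python
def run_threshold_model(thresholds):
    """Return how many agents end up acting.

    Each agent has a personal threshold: it acts once the number of
    agents already acting is at least that threshold. The loop repeats
    until the crowd size stops changing.
    """
    acting = 0
    while True:
        # Count the agents whose threshold is met by the current crowd size.
        now_acting = sum(1 for t in thresholds if t <= acting)
        if now_acting == acting:
            return acting
        acting = now_acting


# Hypothetical population: thresholds 0, 1, 2, ..., 99.
uniform = list(range(100))

# Same population, except one agent's threshold moves from 1 to 2.
tweaked = [0, 2] + list(range(2, 100))

for name, population in (("uniform", uniform), ("tweaked", tweaked)):
    average = sum(population) / len(population)
    print(name, "average threshold:", average,
          "agents acting:", run_threshold_model(population))
```

The two populations have almost identical average thresholds (49.5 and 49.51), so a purely aggregate view would treat them as the same crowd. Simulating the individuals tells a different story: in the first case all 100 agents end up acting, in the second only one does, because a single person behaving slightly differently breaks the chain.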

Apart from the obvious benefit to marketers, being able to predict how people or individuals will make decisions in a given situation while working within the confines of a process or a system would help to reduce randomness and increase reliability.

However, it seems we now have the very odd situation where the abstracted you, which has been formulated by the aggregation of data derived in part or in whole from vast numbers of people who are somewhat like you, is more likely to behave like you than the real you. Where’s Douglas Adams when you need him?

Social Media: A New Frontier for Researchers

Following a presentation I gave at the European Intersectoral Summit on Research and Innovation last month entitled ‘Engaging Citizens in Research and Innovation: Opportunities and Challenges Afforded by Social Media’, I was interviewed by Orlaith Finnegan from Digimind on this topic. A copy of the interview is below.

1) Can you start by telling us about your background and where your interest in social media comes from?

My background in social media started in 1998 when I set up a discussion forum to talk about a multiplayer computer game called Quake. This evolved into, a company I co-founded with some fellow computer gamers in 2000, which is now Ireland’s largest general discussion forum and is part of Distilled Media.

Forums are the original social media on the Web, predating the term social media, but exhibiting the main feature of people connecting through shared “social objects” of interest – in our case, the original social object was the game we all played together and discussed on the forum.

Social media was pretty much just a hobby interest for me up until became a major service, and also my connection grew through social software and social media becoming one of my main research areas when I joined DERI, a large Semantic Web research institute based in NUI Galway, Ireland. I also chaired the premier social media research conference, ICWSM, in Dublin last year.

2) You recently spoke at The European Intersectoral Summit on Research and Innovation 2013, on the importance of using social media to promote research and innovation to a wider audience. How has social media opened up opportunities to scientific researchers?

Social media is providing new opportunities for researchers to disseminate their research, but also for them to become aware of peers’ research, and to find interdisciplinary topics (through status updates from people in other disciplines) so as to explore possibilities for cross-domain collaboration. It doesn’t have to be just a paper that a researcher shares through social media – it could be a research presentation, an explanatory video, or a blog post about one’s work.

An interesting emerging topic in this space is something called Altmetrics which looks at how researchers are being “cited” and referred to through social media rather than through traditional paper references and citations. In fact, these are related, because if someone retweets your post about a slideset that in turn is a presentation you have given about an academic paper, that person or someone else who sees the retweet from that person may well cite your paper in their own academic work.

3) What advice would you give to those in the research community who are looking to reap the benefits of social media and engage with a wider audience?

For finding things, I would use a tool like TweetDeck or StreamGlider [I’m also a co-founder] to set up searches for keywords of interest, so you can keep abreast of both research and industry news on your research topics. I have a Twitter search set up for my research areas, e.g. “semantic web” OR “sem web” OR semanticweb OR semweb, and from that I can have a fair idea of what is going on in this space. It’s easier to keep track of a topic-specific search rather than dipping in and out of the streams of content coming from social sites. Also, join relevant groups on LinkedIn for your topic of interest, as again they are another great source of information from various perspectives, but also a place to talk about your own research.
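
As a small illustration of the keyword search described above, the sketch below builds an OR query string from a list of terms, quoting any term that contains a space. The helper name `build_or_query` is hypothetical, and the code only assembles a string that could be pasted into a search box; it does not call Twitter or any other service.

```python
def build_or_query(terms):
    """Join search terms with OR, quoting multi-word phrases."""
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip empty entries
        # Phrases need quotes so they are matched as a unit.
        parts.append(f'"{term}"' if " " in term else term)
    return " OR ".join(parts)


query = build_or_query(["semantic web", "sem web", "semanticweb", "semweb"])
print(query)
```

Keeping the terms in a list makes it easy to add or drop spellings of a topic as the vocabulary in your field shifts.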

But for engaging, there is a challenge to get noticed. Huberman, director of HP Social Labs, says that almost anything except attention can be manufactured as a commodity. So to get attention in the first place, the researcher has to think carefully about how they should phrase their message and whose attention they would most like to get. Rather than regurgitating the title of an academic paper in a status update, how about phrasing a question that will draw (balanced) commentary if possible? Also, can you find some hubs or connectors in your topic who would be interested in sharing your message with their contacts? If it’s a blog post about your research, make sure to mention the leaders in this space (as you would do in an academic paper), and tell them about it via @mentions or even email when sending out your messages.

Timing of your message is also important. Just like any social media marketing, choosing the optimal time is key. I haven’t seen any studies that look at the best times to send out scientific research updates, but in general midweek after midday is the best time for getting clicks. If you’re in the middle of meetings, you can use a scheduling tool like HootSuite to send it out at a pre-ordained time.

4) Many researchers are starting to use social media to surface important insights, to crowd source information and to analyse data. What do you think are the challenges presented by this new form of information retrieval and analysis?

There are some obvious challenges. The first is that most researchers tend to use data from Twitter, as they can’t get data from Facebook where there are more people. This is mainly because the Twitter API is more open, allowing you to access updates and networks for most users with public profiles, whereas on Facebook you only have access to your direct friends’ data via the API. The second issue is that it has recently been posited by Pew Research that Twitter is not as representative as previously thought, with opinion on Twitter sometimes differing from general public opinion, so some assumptions may be slightly off with respect to political analysis or policy opinions.

Having said that, for getting answers, social media is definitely an extremely valuable and rapid source of information. There’s no better way to crowd source a research idea, or to get insights into a problem you are having. For analysing data, again, being able to quickly draw on a swathe of people who can assist with a task is really useful for a researcher.

5) Can you give an example of who you think is using social media effectively to promote research and innovation?

If I was to pick out a few, I’d think of: MIT Technology Review, who do a great job covering emerging topics and whose social media stream is a combination of stories, staff updates and community-directed messages; William Gunn, head of academic outreach at Mendeley (@mrgunn on Twitter); Zeynep Tufekci (@techsoc), covering social science and technology; the Irish Ben Goldacre, my colleague Brian Hughes, who writes “The Science Bit”, a blog that debunks pseudoscience with actual science.

Web Behaving Badly: Is the Internet Having a Negative Effect on Our Personalities?

Visit the comments section of any online publication or scroll down to the activity beneath a popular YouTube video and you’ll see the full spectrum of human emotion, often expressions of anger, aggression and vitriol. Then there is the dark side of the dark side of the Web: cyberbullying incidents that have, in the worst cases, contributed to the suicide of the victims involved, such as Irish teenagers Erin Gallagher and Ciara Pugsley.

While anti-bullying campaigns are useful for raising awareness and encouraging the reporting of cyberbullying incidents, there is also research being done into why certain deviant behaviour appears to be intensified by virtue of taking place online. Dr John Suler is a professor of psychology at Rider University, New Jersey and specializes in cyberpsychology, which is the study of individual and group behaviour on the Web.

Suler’s paper titled ‘The Online Disinhibition Effect’ – published in 2004 in the journal Cyberpsychology and Behaviour – approaches deviant or anti-social behaviour from the point of view of how safe or free an individual feels in an online setting to express opinions or engage with others in a way that they mightn’t necessarily do in a face-to-face situation. Essentially, this paper is the theory behind the feeling many of us have that some people are “brave” enough to be nasty on an online forum but would never say the same thing to a person standing in front of them.

Do all of us behave differently online? This depends on the person, says Suler: “Some people online behave very similarly to the way they do in person. Some people may act quite differently.

“People who have underlying needs and emotions that need to be expressed, but cannot be expressed in their “real” life, will be especially tempted to do so online.”

Two of the concepts Suler uses to explain deviant behaviour online are dissociative anonymity and invisibility; it is much easier to be less inhibited when we feel that we can slip away unnoticed and unidentified. It sounds similar to when people get caught up in riots or become part of a group of football hooligans, I suggest to Suler.

“Sure, any situation in the ‘real’ world that involves anonymity and deindividuation (blending with the crowd) is one in which a person might lose their sense of individual responsibility and do things that they wouldn’t do otherwise,” he explains.

One of the most damaging forms of online behaviour is cyberbullying. Can the online disinhibition effect shed some light on what kinds of people engage in cyberbullying?

“Bullying, online or off, is almost always a displacement or acting out of underlying feelings of anger and helplessness.

“All of us might do some of this under the right conditions, but people with a history of abuse and impulsiveness are more likely to engage in this kind of acting out.”

Cyberbullying has an added element of audience. On social networks like Facebook, the victim’s network of friends is also exposed to this. “Unfortunately, there can be a ‘performance’ aspect to bullying in general. Bullies often like to impress people with their supposed strength, especially their minions,” explains Suler.

Generally speaking, he says that it is easier and more tempting for an individual to be anti-social online, but it does come down to how restricted or inhibited that individual feels in an offline setting to begin with. There are, however, positive effects of feeling less socially restricted online; Suler says that some researchers believe it may result in acts of generosity and altruism.

“One factor contributing to online disinhibition is the tendency to project one’s own thoughts and feelings into the somewhat ambiguous interactions we have online – ambiguous because, especially in text communication, we can’t see or hear other people. As a result, some people might project feelings of sympathy into how they experience others online.”

Check out for more information, short videos and lots of tips about cyberbullying.

Is the European Economy Ready and “Open” for Big Data? Dublin Event Aims to Link Stakeholders

Coinciding with the Irish Presidency of the Council of the European Union, the European Data Forum (EDF) for industry leaders, academics, policy makers, and community organisers will hold its annual meeting in Dublin next month to discuss the challenges and opportunities of (big) data in Europe. The forum hopes to answer important questions such as: is the European economy ready for big data, are relevant stakeholders being reached by the EU-wide data community, and what is Europe providing in terms of innovative business intelligence (BI) solutions?

In an age of exploding data, the management of this data has become a key factor for an organisation’s success, as well as providing a significant advantage over competitors. Teams of data scientists are being employed by large corporations to examine trends in sets of big data, and linked data engineers are gainfully employed in interlinking disconnected datasets that become more meaningful when they are combined together in an interoperable form – including open data published by government bodies and other organisations.

“Data has swept into every industry and business function and is now an important factor of production, alongside labour and capital”, according to a study by the McKinsey Global Institute. “The amount of data in our world has been exploding, and analysing large datasets – so-called big data – will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus.”

Whether you are an oil and gas company trying to deal with data from complex engineering techniques, a publisher trying to stay ahead of social media streams and new content creation models, or a non-government organisation gathering data from hundreds or thousands of individuals in the field, dealing with intense amounts of data is both a challenge and an opportunity. Therefore, a meeting like the European Data Forum should be of interest to a wide range of players.

Presenters at the event include Knut Sebastian Tungland, Chief Engineer for IT at Statoil, and Fiona Williams, Research Director at Ericsson. The program will also feature a panel of Chief Information Officers from Daimler, Telefonica Digital and ATOS. These speakers will discuss topics including research and technology development, training and knowledge transfer, and commercialisation.

“There is a strong industry presence, as can be seen from the program, with some Irish speakers including Gary Evans from EMC, Prof. Mark Ferguson from Science Foundation Ireland, Daragh O’Brien from Castlebridge Associates and Brid Dooley from RTÉ”, according to DERI, NUI Galway’s Deirdre Lee, local dissemination chair for the event. “We would also like to see a lot of Irish contributions to our Big Data Exhibition Space.”

The event aims to answer three main strategic questions:

  • Is the European economy ready for big data?
    What are the novel innovation and business models for open and linked data? What platforms, marketplaces, policies, and strategies for the development of data ecosystems should be targeted?
  • Is the EU-wide data community reaching out to relevant stakeholders?
    Do the application scenarios and specific technical and non-technical concepts of big data providers and consumers meet demands for themes like smart cities, environmental research, geospatial information, e-science, and social media?
  • What does Europe provide in terms of innovative business intelligence solutions?
    How can the sector best align with European research agendas and roadmaps for science, technology and innovation, in particular in relation to research framework programmes such as Horizon 2020? Which framework conditions are essential for meeting Europe’s ambitions towards a profitable data economy, such as forthcoming standardisation needs and commercialisation opportunities?

Some other topics for discussion at the forum include the challenges and opportunities around big data management, the single steps of the data value chain, technological innovations and innovative business models, as well as the legal framework for a sustainable European data economy.

Registration is free, but space is limited, so you are advised to register early.

Copyright, Fair Use, and the Need for Revenue

The laws of copyright have been abused and misused of late in a desperate attempt by European newspapers and music industry bodies to monetise the inclusion of articles and songs into online content.

In Ireland, a representative body for a cross-section of Irish newspapers attempted to extract royalty payments from individuals who included any link on their own sites to articles published on the web by its newspapers. It even targeted a charity that had included links to articles about it on its website.

The newspaper body quickly reassessed its position after coming under intense media pressure. It decided it would not seek royalty payments where an individual merely linked to a newspaper article and there was no reproduction of images or content from the newspaper by the individual for commercial use.

Ireland is not the only country with copyright problems. The US issues with copyright and intellectual property rights have emerged again with the recent ban on unlocking mobile phones. Since the defeat of SOPA the uncertainty surrounding the use of copyright to stifle freedom of expression and the sharing of ideas has somewhat disappeared.

Under US copyright law, the ‘fair use’ doctrine allows the use of a copyrighted piece of work by a third party in certain circumstances. It takes a practical approach to the third party use and assesses whether it is of a commercial nature; how much of the copyrighted work it reproduces and, most importantly, the effect of the use on the potential value or market for the copyrighted work. While this may not save cell phone users who wish to unlock their phones, it is a useful exception.

Ireland also has a similar exemption under its laws. A copyrighted work is not infringed where it is added to a new work in a secondary manner that does not ‘unreasonably prejudice’ the interests of the copyright owner. In the case of individuals linking to articles published online by newspapers, where the link is just included as part of the article, it is difficult to see how it could unreasonably prejudice the rights of the copyright holder. The request for payment by the newspaper organisation noted above would more than likely have fallen under this exclusion.

Germany unfortunately has no such exemption under its laws. GEMA, a German music industry body, has won a court case to allow it to block the publication of videos on YouTube that include music in the background, many times only as a result of where the video was filmed and not because the filmmaker has intentionally added it. GEMA has requested a per-stream payment of €0.00375 to allow the videos to be shown.

Germany is just another example of the manic attempts by industry to apply copyright in the wrong way. In the Irish context, consider this. Have I just infringed on the rights of the paper I just linked to, or have I driven readers to their site who otherwise would never have clicked into that article? You decide. That link is merely a tiny section of this overall article and, as such, could it really be said to ‘unreasonably prejudice’ the rights of the copyright holder?

Another consideration, particularly with regard to music, is the potential benefit of including a song in another work. Bear with me as I draw a comparison with a recent pop culture phenomenon. How many of us have been subjected to countless videos of individuals dancing ‘Gangnam Style’ to the music of South Korean musician PSY? A search of YouTube returns over half a million videos of people jumping around like show ponies.

Had PSY chosen to use copyright to block individuals from publishing videos featuring his music or, even worse, attempted to charge them for including his song in the videos, would he have any measure of the international fame he now enjoys?

Obviously I’m not suggesting that internet users be given free rein to use copyrighted works online. It is recommended that owners and defenders of copyrightable works become smarter about how and when they invoke the protections available under the law.

Where an individual copies an article, or an extract from it, directly into an online work without acknowledging the original author then he should be liable for infringement of the author’s rights. On the flip side, where a website links to an article on a newspaper’s online offering and this helps to drive traffic to that newspaper’s site and promote awareness of its brand, this should be encouraged.

Similarly, where a section of a music video is used by an individual within a video that is posted online, the creator or owner of that piece of music must consider whether this will be of potential benefit to them in driving sales of their song or if it is a genuine infringement of their rights. Where the use of the song is for a non-commercial purpose or is merely incidental to the video, the song’s owner should welcome this rather than try to stamp it out.

It is obvious that the newspaper and music industries are under pressure (in the case of the US mobile industry, their attempt to misuse intellectual property rights may just be greed). What does not seem to be obvious to them are the enormous benefits that can accrue from allowing their products to be promoted by individuals across the web. Sure, they do not get an immediate swell in their bank accounts by charging a royalty, but what they get is something greater – free global advertising and increased brand recognition.

Lisa is a solicitor with Leman Solicitors, Dublin with a fondness for IP and data protection. She is also a journalist and wrote for New Tech Post.

You can follow Lisa on Twitter.

Would You Pay an Ongoing ‘Entertainment’ Tax to use Your Mobile Device?

An Irish Government minister has recently announced that every home in the country will have to pay a TV license fee regardless of whether they have a TV or not. He is quoted as saying:

“In short, everyone benefits from the availability of these services, regardless of how content is accessed or relayed to the public and, therefore, it is my view that the cost should be borne by society as a whole.”

There are two implied premises to his argument. The first is that people are watching more programming on their mobile devices or computers to the extent that the relationship between broadcaster and viewer has changed in some fundamental way.

Second, that watching television is not only beneficial in itself but is of benefit to society as a whole.

Dealing with the second point first: for the most part, television isn’t even a benefit to the immediate viewer, let alone the public at large.

In particular, no child should be left unattended watching television. Once they are away from “Thomas the Tank Engine” and highly specific programmes of a kind especially produced to aid child development, television is nothing but an outright danger. Indiscriminate viewing does them nothing but harm.

Television is also a major health hazard to those old enough to know where the off button is. It is a clearly identifiable contributor to the obesity epidemic which in turn is the leading cause of heart failure and diabetes in the western world. Never mind a whole host of other nasty side effects.

The far more interesting assertion to discuss is the notion that content is being viewed on mobile devices and computers rather than a conventional TV set.

The evidence from the Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2012–2017 white paper would suggest that the Minister’s assertions may have some basis. (The executive summary is well worth reading.)

Relevant to this discussion are two findings from the Cisco research:

1.) Smartphones represented only 18 percent of total global handsets in use in 2012, but represented 92 percent of total global handset traffic. In 2012, the typical smartphone generated 50 times more mobile data traffic (342 MB per month) than the typical basic-feature cell phone (which generated only 6.8 MB per month of mobile data traffic).

2.) Two-thirds of the world’s mobile data traffic will be video by 2017. Mobile video will increase 16-fold between 2012 and 2017, accounting for over 66 percent of total mobile data traffic by the end of the forecast period.
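
The arithmetic behind these two findings is easy to check. The short Python sketch below uses only the figures quoted above; the one assumption it adds is that the 16-fold growth compounds at a steady annual rate over the five years, which is a simplification for illustration and not something the findings as quoted here state.

```python
# Figures as quoted from the Cisco findings above.
smartphone_mb_per_month = 342.0
basic_phone_mb_per_month = 6.8
video_growth_multiple = 16.0   # mobile video growth, 2012 to 2017
years = 5

# How many times more traffic a typical smartphone generated.
traffic_ratio = smartphone_mb_per_month / basic_phone_mb_per_month

# Assumption: growth compounds evenly each year.
annual_growth = video_growth_multiple ** (1 / years) - 1

print(f"Smartphone vs basic phone traffic: {traffic_ratio:.1f}x")
print(f"Implied annual growth in mobile video: {annual_growth:.0%}")
```

The first figure comes out at just over 50, matching the "50 times" claim. Under the steady-compounding assumption, a 16-fold rise over five years works out to mobile video traffic growing by roughly 74 percent every year.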

The most recent figures from TV Licensing in the UK show that 39% of homes watched TV content on a smartphone while another 14% used a tablet.

However, while there is a certain handiness in being able to view content on mobiles, computers, etc., it is hard to believe that it is remotely as satisfactory an experience as watching content on a proper screen accompanied by proper sound.

Just because you can do something doesn’t necessarily mean you have to or that you will.

Not to be a complete killjoy, I do think the world would be a sorrier place without programmes such as The Wire, Breaking Bad, BSG, Firefly and so on. The rest, however, is junk.

So why this move? As is usual, one has to follow the money. Television is a ubiquitous service and most households in Ireland have a viewing set for which they already pay a TV license. So the additional money raised from properties that don’t have televisions will be trivial in comparison to the amount that is already being raised.

Technically, it is not a poll tax, but it is in effect a tax on just living in a house, which all of us need to do. It contains within it the pernicious idea that we now have to pay a tax to the government for no other reason than that we happen to exist. This has nothing to do with income or property rights.

More pernicious still for those of us who actually enjoy and benefit from technology is the identification (if only by approximation at this point) of everyday gadgetry such as mobile phones, tablets and personal computers as being liable for periodic taxation as opposed to taxes paid at the point of sale.

The dubious and unsupported argument for such an imposition is based on the idea that these devices can carry so-called entertainment from publicly funded broadcasters and are the practical objects for conveying the wider benefit that the minister refers to. (It is equally likely that he could have been referring to some sort of magical thinking that conveyed this benefit – hard to know with politicians.)

However, with the sort of figures being talked about in the Cisco report it is going to be hard for any government to turn away from such a potentially abundant source of income. It is just a matter of them figuring out precisely how.