Interview (Part 2): Nova Spivack On The Fragmentation Of The Semantic Web

In the first part of our interview with Nova Spivack, we talked about the struggle for dominance between Facebook, Google and Microsoft. In the second part, we discuss the current state of Semantic Web technology. Nova started Twine with the intention of it being the first consumer Semantic Web application.

Semantic Web technologies hold the promise of delivering a web of Linked Data where information is understandable to computers. So rather than simply moving data around, machines can derive meaning from the requests made of them, and return search results and the like that are more relevant to us than those we are used to at present.

Nova claims that Twine had potential: “The next version, had we been allowed to finish it, would have been a candidate for a killer app. It would have provided a social plus semantic search engine. Basically, the kind of thing that Facebook will probably build in the future.

“We had a system where people could say what they liked. They could share things, they could join communities around interests. We were also indexing the Web, pulling out the semantics, building a faceted semantic index, and combining all that data so that people could say – ‘here’s what I like’ – and basically build their own semantic search engines with each other.”

But that was not to be. Unfortunately, when the time came to raise the next series of funding that would have enabled Nova and his team to really build out the project, the recession hit and Twine subsequently ended up being sold to Evri.

But the vagaries of economic cycles are not the only forces at work that hinder the widespread adoption of Semantic Web technologies. There is an issue with the technologies themselves.

Nova states, “They are technically difficult to understand and use. It’s hard for mainstream developers to really adopt them. There are still a lot of challenges to building applications that use those technologies because of the relative lack of commercial grade tools, particularly around storage. Triple stores have largely been R&D technologies, not really mainstream, primetime-ready products. Traditional databases don’t really handle triples in a performant way. The semantic data storage issue continues to be a huge bottleneck for the whole field.

“You can’t build big semantic applications if you can’t store and manipulate large graphs in an efficient way. It’s one thing if you can wait a week to load your triples into your triple base and then process them. That’s fine, but most real applications need to do that every hour.”
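To make the storage problem concrete, here is a minimal sketch of what a triple store does, assuming the simplest possible design: every fact is a (subject, predicate, object) triple, and a query is a pattern with some positions left open. The class name `TripleStore`, its methods, and the sample data are hypothetical illustrations, not the design of Twine or of any real product. The sketch scans every triple on each query, which is exactly the kind of approach that stops working at the scale Nova describes.

```python
class TripleStore:
    """Toy in-memory triple store (illustrative only, not production-grade)."""

    def __init__(self):
        # Each fact is stored as a (subject, predicate, object) tuple.
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        # Full scan of every triple: simple, but linear in the size of the graph.
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


# Hypothetical sample data: who likes what, and who belongs to which group.
store = TripleStore()
store.add("alice", "likes", "jazz")
store.add("bob", "likes", "jazz")
store.add("alice", "memberOf", "jazz-fans")

# "Who likes jazz?" leaves the subject position open.
jazz_likers = [s for s, _, _ in store.match(predicate="likes", obj="jazz")]
print(jazz_likers)
```

A real store would keep several indexes over the same triples so that each pattern can be answered without a full scan; keeping those indexes up to date while loading large graphs is a big part of what makes the problem hard.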

Such unwieldy technologies seem to be collapsing under their own weight, and the only practical and expedient way to move forward is by fragmenting the space and using Semantic Web developments in an ad-hoc, piecemeal, and proprietary form.

In a recent article on this site, Open Graph: A Cosy Corner Of The Web?, we indicated that this would be a very likely outcome. We pointed out that the Open Graph Protocol which Facebook uses is very much a subset of the technological power available.

Nova remarks, “I think semantic [the Semantic Web] is no longer a frontier. I think, possibly, it is a missed opportunity, but I think it will surface in various ways, probably non-ideal ways, on Google, Facebook and Bing.

“I think that the big search engines could still do it, but they’ve made their bets. Google with Metaweb, which is not compliant with the Semantic Web. Microsoft with Powerset, which is not compliant with the Semantic Web. And Facebook, at the moment, is doing non-compliant things as well. Facebook is probably not storing things in a triple form. They’re probably using some other mechanism. It may not have the complexity of a true graph.”

It may be that the vision of a Semantic Web of Linked Data is fading somewhat. So what can be done to resurrect the dream?

Nova goes on to say, “They’ve [Facebook, Google, Microsoft, and so on] built or bought themselves into positions which don’t indicate that they’re going to be using W3C standards. I think that may be deliberate because by doing proprietary things you get advantages over competitors. But it’s old-fashioned thinking. I think the benefits of being able to leverage a truly open semantic graph – based on open standards and well thought-out semantics that can eventually support reasoning – would enable a generation of smarter applications across the web, and everyone would benefit from that.

“I don’t think semantics can be hoarded. I don’t think they should be hoarded. I think semantics only work when they are shared.

“If your e-mail protocols don’t work with anyone else’s, you really can’t win the e-mail wars. Similarly, with semantics. If your semantics don’t work with anyone else’s semantics, you can’t win the semantic wars.”

Nova Spivack also spoke at the BlogTalk event held in Cork in 2008. BlogTalk is returning to Ireland on 26-27 August; check out the speaker list.

Interview: Nova Spivack On Facebook, Google And Microsoft – Who Will Triumph?

Nova Spivack was an initial angel investor in Klout, a social network analytics tool that quantifies people’s interests by measuring their influence on others and also who in turn influences them.

Nova is now working on Live Matrix, which is being designed to navigate the Web by time instead of space.

“Basically all these things at the moment are happening in a perpetual present. There’s no sense of time on the Web. What Live Matrix is doing is trying to index what’s happening when at different times.”

With the increasing amount of video on the Web and expansion of such services as Ustream, more and more scheduled events, lectures, sports, and so on are being fed out onto the Web. Therefore it is becoming increasingly important to find out when these items start and finish. As more content appears with a temporal dimension, there is more of a need to find that material.

Nova adds, “The past two decades have been spent on the space dimension of the Web. That is, what’s where? What keywords are on what pages. What people are at what sites. What content is where. But now we’re looking at what’s when. I think that’s a huge, open, uncharted piece of the Web. It’s a big opportunity.”
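The idea of indexing "what's when" can be sketched in a few lines, assuming each piece of scheduled content carries a start and an end time. This is not how Live Matrix works (the interview gives no implementation details); the `Event` type, the `happening_at` function, and the sample schedule are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """A piece of scheduled web content with a temporal extent (hypothetical type)."""
    title: str
    start: datetime
    end: datetime


def happening_at(events, moment):
    """Return titles of events in progress at `moment`, ordered by start time."""
    return [
        event.title
        for event in sorted(events, key=lambda e: e.start)
        if event.start <= moment < event.end
    ]


# Hypothetical schedule, invented purely for illustration.
events = [
    Event("Lecture stream", datetime(2010, 8, 26, 9, 0), datetime(2010, 8, 26, 10, 30)),
    Event("Football match", datetime(2010, 8, 26, 10, 0), datetime(2010, 8, 26, 12, 0)),
    Event("Evening concert", datetime(2010, 8, 26, 20, 0), datetime(2010, 8, 26, 22, 0)),
]

on_now = happening_at(events, datetime(2010, 8, 26, 10, 15))
print(on_now)
```

The query here is "what is on at this moment?" rather than "which page contains this keyword?", which is the shift from the space dimension to the time dimension that Nova describes.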

In the first part of our interview with him, Nova talks about the potential winners and losers as Microsoft, Google and Facebook duke it out for world domination. In the second part of his interview, Nova discusses the fragmentation of Semantic Web technologies and what that means for all of us.

It is clear that Facebook and Google are stepping on one another’s turf and there is no way for them to avoid that happening. Nova thinks that Facebook will eventually triumph, but Google may not be safe in the number two slot.

Facebook will be the dominant player on the Internet because it already has a social networking structure on its side. Nova points out, “Neither Microsoft nor Google has a strong social networking brand that people use. They are both in this difficult position as being viewed by consumers as tools rather than as places. Facebook is a place. It’s a place where people are spending an increasing amount of their time.”

It is a lot easier for Facebook to build a search engine than it is for either Microsoft or Google to build a social network, let alone a social network of 500 million users. Also, Facebook has a social graph, and by using their ‘like’ system, they can gather even more information about what people are interested in and value. If they do it right, they could build a search engine far superior to either Google search or Bing. The only reason Facebook haven’t done it already is that they have been too busy trying to get their house in order.

But coming second is not the same thing as being safe. Google will need to watch themselves. As Nova warns, “If you look at Microsoft… they have people and senior executives who have a lot of experience in looking at a market they want to enter that has another leading player, and going in there and winning from a number two or number three position. In a way that’s what Microsoft is actually best at.”

What happens with these industry giants is very important for all of us. It is more than a drama being played out in the techy news services.

“I think it’s important to everybody who has an internet company and of course every user of the Internet. For those of us in the internet business, I think it’s interesting to see where these are going because it affects who we might partner with or what services we ought to be focusing on for integration of special features. Should we spend a lot of time SEOing into Google or should we spend more time SEOing into Facebook through the likes [system]?”

As it stands now, it is Facebook’s to lose. Google and Microsoft are handicapped by being perceived as tools instead of places. Fortunately, they cannot make real use of the data they have acquired about us through Gmail and Hotmail without our permission. But as long as they stay in the game and can use the information and experience that they do have, they will always have a chance.

Interesting times.

The second part of this interview is also available.