Interview (Part 2): Nova Spivack On The Fragmentation Of The Semantic Web

In the first part of our interview with Nova Spivack, we talked about the struggle for dominance between Facebook, Google and Microsoft. In the second part, we discuss the current state of Semantic Web technology. Nova started Twine with the intention of it being the first consumer Semantic Web application.

Semantic Web technologies hold the promise of delivering a web of Linked Data where information is understandable to computers. So rather than simply moving data around, machines can derive meaning from the requests made of them, and return searches and so on that are more relevant to us than what we are used to at present.

Nova claims that Twine had potential: “The next version, had we been allowed to finish it, would have been a candidate for a killer app. It would have provided a social plus semantic search engine. Basically, the kind of thing that Facebook will probably build in the future.

“We had a system where people could say what they liked. They could share things, they could join communities around interests. We were also indexing the Web, pulling out the semantics, building a faceted semantic index, and combining all that data so that people could say – ‘here’s what I like’ – and basically build their own semantic search engines with each other.”

But that was not to be. When the time came to raise the next round of funding, which would have enabled Nova and his team to build out the project, the recession hit, and Twine was subsequently sold to Evri.

But the vagaries of economic cycles are not the only forces at work that hinder the widespread adoption of Semantic Web technologies. There is an issue with the technologies themselves.

Nova states, “They are technically difficult to understand and use. It’s hard for mainstream developers to really adopt them. There are still a lot of challenges to building applications that use those technologies because of the relative lack of commercial grade tools, particularly around storage. Triple stores have largely been R&D technologies, not really mainstream, primetime-ready products. Traditional databases don’t really handle triples in a performant way. The semantic data storage issue continues to be a huge bottleneck for the whole field.

“You can’t build big semantic applications if you can’t store and manipulate large graphs in an efficient way. It’s one thing if you can wait a week to load your triples into your triple base and then process them. That’s fine, but most real applications need to do that every hour.”
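For readers unfamiliar with the term, a triple is a (subject, predicate, object) statement, and a triple store is a database built around them. The sketch below is a toy in-memory version, not how any production triple store works; the class name `TinyTripleStore` and the sample data are made up for illustration. It keeps one index per position so that a pattern query does not have to scan every triple, which hints at why loading and indexing very large graphs is costly.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory store for (subject, predicate, object) triples.

    Hypothetical illustration only; real triple stores use far more
    elaborate on-disk indexes.
    """

    def __init__(self):
        self.triples = set()
        # One index per position, so a lookup on any bound term avoids a full scan.
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        """Store one triple and update all three indexes."""
        triple = (s, p, o)
        self.triples.add(triple)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)
        self.by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return the set of triples matching the pattern; None is a wildcard."""
        bound = [(self.by_subject, s), (self.by_predicate, p), (self.by_object, o)]
        candidate_sets = [index.get(term, set()) for index, term in bound if term is not None]
        if not candidate_sets:
            return set(self.triples)  # nothing bound: every triple matches
        return set.intersection(*candidate_sets)


store = TinyTripleStore()
store.add("alice", "likes", "semantic-web")
store.add("bob", "likes", "semantic-web")
store.add("alice", "memberOf", "twine-community")

# Who likes the Semantic Web?
fans = sorted(s for s, _, _ in store.match(p="likes", o="semantic-web"))
print(fans)  # ['alice', 'bob']
```

Every insert touches three indexes here, so write cost grows with the number of indexes kept. That trade-off between fast queries and fast loading is one small-scale view of the bottleneck Nova describes.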

Such unwieldy technologies seem to be collapsing under their own weight, and the only practical and expedient way to move forward is by fragmenting the space and using Semantic Web developments in an ad-hoc, piecemeal, and proprietary form.

In a recent article on this site, Open Graph: A Cosy Corner Of The Web?, we indicated that this would be a very likely outcome. We pointed out that the Open Graph Protocol which Facebook uses is very much a subset of the technological power available.
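To make the "subset" point concrete: Open Graph Protocol markup consists of `<meta>` tags with a `property` attribute in the `og:` namespace and a `content` value. The rough sketch below reads those tags with Python's standard `html.parser` and turns each one into a simple (page, property, value) triple. The sample page, the URL, and the class name `OpenGraphExtractor` are invented for illustration, and this is not a full RDFa processor.

```python
from html.parser import HTMLParser


class OpenGraphExtractor(HTMLParser):
    """Collects og:* meta tags as (page_url, property, content) triples.

    Hypothetical helper for illustration; it ignores everything else on the page.
    """

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property")
        content = attributes.get("content")
        # Keep only Open Graph properties that carry a value.
        if prop and prop.startswith("og:") and content is not None:
            self.triples.append((self.page_url, prop, content))


# Made-up sample page; only the og:* tags should be picked up.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://example.com/the-rock" />
<meta name="description" content="not Open Graph" />
</head></html>
"""

extractor = OpenGraphExtractor("http://example.com/the-rock")
extractor.feed(SAMPLE_PAGE)
og_triples = extractor.triples

for triple in og_triples:
    print(triple)
```

The result is a flat list of properties about a single page. Links between objects, shared vocabularies, and reasoning, which are the parts of the Semantic Web stack the article calls the wider "technological power", are not exercised at all.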

Nova remarks, “I think semantic [the Semantic Web] is no longer a frontier. I think, possibly, it is a missed opportunity, but I think it will surface in various ways, probably non-ideal ways, on Google, Facebook and Bing.

“I think that the big search engines could still do it, but they’ve made their bets. Google with Metaweb, which is not compliant with the Semantic Web. Microsoft with Powerset, which is not compliant with the Semantic Web. And Facebook, at the moment, is doing non-compliant things as well. Facebook is probably not storing things in a triple form. They’re probably using some other mechanism. It may not have the complexity of a true graph.”

It may be that the vision of a Semantic Web of Linked Data is fading somewhat. So what can be done to resurrect the dream?

Nova goes on to say, “They’ve [Facebook, Google, Microsoft, and so on] built or bought themselves into positions which don’t indicate that they’re going to be using W3C standards. I think that may be deliberate because by doing proprietary things you get advantages over competitors. But it’s old-fashioned thinking. I think the benefits of being able to leverage a truly open semantic graph – based on open standards and well thought-out semantics that can eventually support reasoning – would enable a generation of smarter applications across the web, and everyone would benefit from that.

“I don’t think semantics can be hoarded. I don’t think they should be hoarded. I think semantics only work when they are shared.

“If your e-mail protocols don’t work with anyone else’s, you really can’t win the e-mail wars. Similarly, with semantics. If your semantics don’t work with anyone else’s semantics, you can’t win the semantic wars.”

Nova Spivack also spoke at the BlogTalk event held in Cork in 2008. BlogTalk is returning to Ireland on 26-27 August; check out the speaker list.


5 thoughts on “Interview (Part 2): Nova Spivack On The Fragmentation Of The Semantic Web”

  1. I guess there are two routes this could take – services take the piecemeal approach and implement bits that suit them, and that’d be it, or they implement subsets of open standards that can then be ramped up to something bigger. At least with Facebook’s OGP, it’s a start in the right direction – if developers get comfortable ‘typing’ pages (events, people, movies) and making stuff more compliant with the ‘like’ system, they can then be encouraged to add more fine-grained metadata about the objects on their pages. Similarly, for rich snippets, more levels of metadata than just ratings or addresses could be added to enhance search results. This isn’t a full semantic graph, but can be a move in the right direction. Start with some properties of an object for enhancing search results, then show how to link to related objects. Nova, you’ve implied elsewhere that Freebase is a means to an end of more rich snippets for Google – should it instead be a nucleus for a Google semantic graph like the Linked Open Data cloud?


  2. I think we are following the same development path as the web. Companies jumped on HTML as they saw business opportunities. We had proprietary extensions all over the place, followed by a realisation that standards were important for continued growth. It is early days in the Semantic Web, there is a land grab, and eventually all players will realise they need to work together – and so standards will become important.


  3. RDF is an interchange and interoperability framework; what you do in the privacy of your own database is your own business! Freebase published RDF views of their data, Google’s Social Graph API consumes FOAF/RDF using a real RDF parser, and Facebook’s RDFa Open Graph Protocol has a schema that maps to popular RDF vocabularies. Things are not so bleak…


  4. Nova, we should catch you up on where things stand wrt triplestore technology, at least for AllegroGraph. I think some of the other triplestore vendors would agree that our respective products have moved beyond R&D technologies. As for usability and understanding, we offer a product, “Gruff”, that includes the AllegroGraph triplestore. With just a few clicks you have a triplestore installed, data loaded, and you are exploring the graph in your data. If you are so inclined you can try a SPARQL query and even learn to build a SPARQL query with the visual SPARQL generator. You can access AllegroGraph from your favorite programming language: Java, Python, Perl, Ruby, C#, Clojure, Lisp. So we make usability as easy as other technologies. Gruff –


  5. Mark, what you are writing is true. Companies jumped on HTML, creating their own proprietary extensions, BUT developed applications that, for those times, were great. MS hasn’t yet done anything with Powerset (maybe because they found out too late that they bought an immature and incomplete technology) and it is not yet obvious what Google will do with Metaweb. FB could be on the path to implementing the first really mainstream Semantic Web-like application. If more great apps follow then it is very possible that there will be a run towards the implementation of standards.

