Nice video shows how hidden structured data from the Drupal content management system can lead to semantic search

(Cross-posted at

Via Drupal creator Dries Buytaert’s post entitled “RDFa and Drupal” and Stéphane Corlosquet’s post about RDFa and Drupal examples and use cases, there is a really cool video demonstrating how the structured data held in many Drupal deployments (but hard to leverage once it is flattened into HTML) can be exposed and put to use as RDFa semantic data. The video shows deep searches of Drupal data using Yahoo! SearchMonkey and also some visual navigation of this linked data. The possibilities are very exciting, as Dries says:

Google and Yahoo! are getting increasingly hungry for structured data. It is no surprise, because if they could build a global, vertical search engine that, say, searches all products online, or one that searches all job applications online, they could disintermediate many existing companies. […] Hundreds of thousands of Drupal sites contain vast amounts of structured data, covering an enormous range of topics [and these structures] can be associated with rich, semantic meta-data that Drupal could output in its XHTML as RDFa. For example, say we have an HTML textfield that captures a number, and that we assign it an RDF property of ‘price’. Semantic search engines then recognize it as a ‘price’ field. Add fields for ‘shipping cost’, ‘weight’, ‘color’ (and/or any number of others) and the possibilities become very exciting.
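To make the ‘price’ example a bit more concrete, here is a minimal Python sketch of what tagging field values with RDFa properties could look like. It is not Drupal code: the render_rdfa_field and render_product helpers, the ex: prefix and the example.org vocabulary URL are all made up for illustration, and the markup only loosely follows the XHTML+RDFa style of declaring a prefix with xmlns and naming the subject with about.

```python
from html import escape


def render_rdfa_field(label, value, rdf_property):
    """Wrap one field value in a span that carries an RDFa property attribute.

    Hypothetical helper: the label stays plain text for human readers, while
    the property attribute tells a machine what the value means.
    """
    return '<div class="field">{}: <span property="{}">{}</span></div>'.format(
        escape(label),
        escape(rdf_property, quote=True),
        escape(str(value)),
    )


def render_product(about, fields, prefix="ex",
                   namespace="http://example.org/vocab#"):
    """Render a list of (label, value, property) fields as one RDFa-annotated block.

    The prefix and namespace are placeholder assumptions, not a real vocabulary.
    """
    body = "\n".join(
        "  " + render_rdfa_field(label, value, "{}:{}".format(prefix, prop))
        for label, value, prop in fields
    )
    return '<div xmlns:{}="{}" about="{}">\n{}\n</div>'.format(
        prefix, escape(namespace, quote=True), escape(about, quote=True), body
    )


if __name__ == "__main__":
    print(render_product(
        "/node/1",
        [("Price", 19.99, "price"),
         ("Shipping cost", 4.5, "shippingCost"),
         ("Color", "red", "color")],
    ))
```

The point of the sketch is only that the number a visitor sees and the ‘price’ label a search engine reads sit in the same element, so nothing has to be published twice.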

The video is here.

This effort has been growing over the past year, since it was championed by Rasmus Lerdorf (the creator of PHP) and proposed by Dries himself at DrupalCon 2008. Based on Stéphane’s roadmap for RDFa in Drupal 7, the video shows some modules that have been developed for Drupal 6 to demonstrate the power of having embedded RDFa representations of Drupal structures. RDFa is currently being integrated into the core of Drupal 7.

There’s a nice line in the video about this embedded data:

It’s machine readable and now we have access to all of the machine-readable fields available to us before. Very quick, very simple, just what RDFa is supposed to be: human readable data [text], formatting data [HTML] and machine-readable data [RDFa] all in the same document, all inline, all describing the same thing.
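As a rough illustration of the “all in the same document” idea, the sketch below pulls the machine-readable values back out of a page fragment using only Python’s standard html.parser module. It is a simplification rather than a real RDFa processor: the RDFaPropertyExtractor class and extract_properties function are hypothetical names, the ex: prefix is invented, and the code only handles flat elements whose text sits directly inside the tag carrying the property attribute.

```python
from html.parser import HTMLParser


class RDFaPropertyExtractor(HTMLParser):
    """Collect the text of elements that carry a `property` attribute.

    Simplified on purpose: no nesting, no prefix resolution, no `content`
    attribute handling. A real RDFa parser does considerably more.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop is not None:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        # Text outside a property-carrying element is for humans only; skip it.
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current value (flat markup assumption).
        self._current = None


def extract_properties(markup):
    """Return a dict mapping property names to their text values."""
    parser = RDFaPropertyExtractor()
    parser.feed(markup)
    parser.close()
    return parser.properties


if __name__ == "__main__":
    page = ('<p about="/node/1">Widget: <b><span property="ex:price">19.99</span></b>'
            ' plus <span property="ex:shipping">4.50</span> shipping</p>')
    print(extract_properties(page))
```

The same fragment reads as an ordinary sentence in a browser, while the extractor sees only the two tagged values.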

(See also this great video and deck of slides about the “Practical Semantic Web and Why You Should Care” by Boris Mann from DrupalCon 2009.)

