Simplified Technorati search

I think I’m probably not alone when I say that Technorati may have put too much in at once when they decided to move from focussing on blogs to many other types of tagged content some time ago.

Luckily, they also implemented a stripped-down version for those who remember and just want to use Technorati as a “Google for blogs”. It’s available at search.technorati.com. By the way, this isn’t new news: it has been there for six months, but the link is kind of hidden on the “Advanced Search” page.

WebCamp workshop on social network portability

A WebCamp “Social Network Portability” workshop has been announced to be co-located with BlogTalk on 2nd March 2008. You can view the wiki page for this event.

“Social network portability” is a term that has been used to describe the ability to reuse one’s own profile and contacts across various social networking sites and social media applications. At this workshop, presentations will be combined with breakout sessions to discuss all aspects of portability for social networking sites (including accounts, friends, activities / content, and applications).

Topics of relevance include, but are not limited to, social network centralisation versus decentralisation, OpenSocial, microformats including XHTML Friends Network (XFN) and hCard, authentication and authorisation, OpenID single sign-on, Bloom filters, categorising friends and personas, FOAF, ownership of your published content, SIOC, the OpenFriend format, the Social Network Aggregation Protocol (SNAP), aggregation and privacy, permissions and context, and the Extensible Messaging and Presence Protocol (XMPP).

You can register for this workshop in conjunction with BlogTalk 2008. If you are interested in speaking or otherwise participating in the workshop, please add your name under the Speakers or Participants headings on the wiki page at http://webcamp.org/SocialNetworkPortability.

Final call for social software conference “BlogTalk”

BlogTalk 2008, the 5th International Conference on Social Software, will be held in Cork, Ireland on 3rd/4th March 2008. The event is designed to allow dialogue between practitioners, developers and academics who are involved in the area of social software (blogs, wikis, forums, IM, social networks, etc.). A workshop on Social Network Portability will also be co-located with the event.

The organisers (me included) have just sent the final call for proposals to present at BlogTalk 2008. The end date for submissions is 7th December 2007, and these should be over two pages in length (no fixed template). Reviews will be completed by the end of December 2007, and the organisers will notify successful authors in early January 2008.

You can view the full call for proposals at http://2008.blogtalk.net/proposals and can submit your proposals at http://www.easychair.org/blogtalk2008

As well as peer-reviewed proposals, BlogTalk 2008 will have a number of prominent invited speakers (including Rashmi Sinha of SlideShare and Nova Spivack of Radar Networks, with others to be confirmed).

Mashable recently listed BlogTalk as one of the top 10 events for bloggers in 2008, and you can register for BlogTalk 2008 with Mashable’s 10% discount code “mashtalk”.

Newstalk: Government to use social networking websites to highlight dangers of drugs following Waterford house party

From Ireland’s Newstalk radio news today:

The Government’s planning on using Facebook and Bebo to try to warn people of the dangers of illegal drugs.

The news comes as two people remain in a critical condition in hospital, and more than a dozen others were treated after a 21st birthday party in Waterford.

The Minister for State with responsibility for the National Drugs Strategy says it’s alarming that so many people appeared to have taken drugs at the celebration.

Will OpenSocial enable social network portability?

What does OpenSocial offer in the way of being able to transfer your social network profile and contacts across networks? It’s the question that has been on many lips, from Tim O’Reilly to Tantek Çelik, but a hint that this may in fact be possible recently appeared in the OpenSocial documentation for hosting applications with this statement:

Usually your SPI will connect to your own social network, so that an OpenSocial app added to your website automatically uses your site’s data. However, it is possible to use data from another social network as well, should you prefer.

Of course, this will require social networks to enable such functionality, but if done, it could be a step in the right direction.

My reports from the ISWC 2007 and Web 2.0 Expo Tokyo

I’ve written some summaries / transcripts of talks I attended at the recent 6th International Semantic Web Conference in Korea and also the 1st Web 2.0 Expo Tokyo event.

International Semantic Web Conference

Perhaps even better than my transcripts (if you have time) are the video captures of the invited talks, available at videolectures.net: Brewster Kahle, Barney Pell, and Chris Welty (which I missed myself).

Web 2.0 Expo Tokyo

You can also read some other reports from Web 2.0 Expo Tokyo by Rob Cawte (1, 2), Daniel Goldman (1), John Mettraux (1), Yuki Naotori (1), Robert Sanzalone (1, 2), Serkan Toto (1), Nasir Sobri (1, 2) and the rest of the IDD / TeDD team (1, 2, 3, 4, 5). There’s also a short video capture of the end of Evan Williams’ conversation with Tim O’Reilly on YouTube.

Seesmic pre-alpha release

(Sorry for the lack of updates recently. I was attending some conferences for the past two weeks, and I will post links to my reports soon.)

As originally rumoured on Scobleizer and announced on TechCrunch recently, Loic Le Meur’s newest startup called “Seesmic” is a microvlogging service in the style of sites like Twitter (such that it is being referred to as “the video Twitter”).

Dennis Howlett of ZDNet has written a nice overview of Seesmic which makes the important point that if a picture is worth a thousand words (and a video contains many thousands of pictures), then Seesmic is quite different to Twitter in terms of expressivity and what can be conveyed through even a short video message (when compared to 140 characters).

I tried out Seesmic this morning, and quite liked the simple but intuitive interface for creating content and viewing videos (from the public or from friends). The skin is a little bit “gamesy”, but I’m sure this will be customisable very soon. With an emphasis on using your webcam for creating microvlogs and uploading of short .FLV files, I expect that the gallery of videos shown on the left-hand side will soon benefit or suffer from an imaginative usage of video filters, in-shot props or other attractive techniques to garner people’s attention and clicks.

What is also interesting is that Seesmic has decided to adopt our SIOC ontology as one of their “open platform” formats (along with FOAF and DC). In fact, I like their description of SIOC better than anything I’ve managed to come up with so far: “SIOC is an open data format for community description”. Thanks!

Web 2.0 Expo Tokyo: Eric Klinker – “Web 2.0 and content delivery”

My last report from the Web 2.0 Expo Tokyo event is about the talk by Eric Klinker, chief technical officer for BitTorrent Inc. (I met Eric and his colleague Vincent Shortino briefly on Thursday evening), who gave a talk about “the power of participation”.

The market for IP video is huge, and a Cisco report called the “Exabyte Era” shows that P2P, which currently accounts for 1014 PB of traffic each month, will continue to rise with a 35% year-over-year growth rate. User-contributed computing is happening right now, and is delivering over half of the Internet traffic today.

A new order of magnitude has arrived: the exabyte (EB). One exabyte is 2^60 bytes, which is roughly 1 billion gigabytes. If you wanted to build a website that would deliver 1 EB per month, you would need to be able to transfer at a rate of 3.5 Tbps (assuming 100% network utilisation). 1 EB corresponds to 3,507,000 months or 292,000 years of online TV (stream encoded at 1 Mbps), 64,944 months or 5,412 years of Blu-ray Disc video (maximum standard 54 Mbps), 351 months or 29 years of online radio traffic, 20 months or 1.7 years of YouTube traffic, and just one month of P2P traffic.
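The figures in the previous paragraph can be checked with a little arithmetic. The sketch below is only a sanity check, and it rests on two assumptions that are not stated in the talk: that the quoted rates are in bits per second (1 megabit per second for online TV, 54 megabits per second for Blu-ray), and that a "month" is one twelfth of a 365.25-day year. Under those assumptions the results land close to the quoted numbers.

```python
# Sanity check of the exabyte figures (assumptions: rates are in bits
# per second, and a month is one twelfth of a 365.25-day year).
EXABYTE_BYTES = 2 ** 60
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_MONTH = SECONDS_PER_YEAR / 12


def sustained_rate_bps(total_bytes, seconds):
    """Bit rate needed to move total_bytes in the given number of seconds."""
    return total_bytes * 8 / seconds


def months_to_transfer(total_bytes, rate_bps):
    """Months needed to move total_bytes at a constant bit rate."""
    return total_bytes * 8 / rate_bps / SECONDS_PER_MONTH


rate_for_one_eb_per_month = sustained_rate_bps(EXABYTE_BYTES, SECONDS_PER_MONTH)
tv_months = months_to_transfer(EXABYTE_BYTES, 1e6)       # 1 Mbps stream
bluray_months = months_to_transfer(EXABYTE_BYTES, 54e6)  # 54 Mbps stream

print(f"1 EB per month needs about {rate_for_one_eb_per_month / 1e12:.1f} Tbps")
print(f"Online TV at 1 Mbps: about {tv_months:,.0f} months")
print(f"Blu-ray at 54 Mbps: about {bluray_months:,.0f} months")
```

The small differences from the quoted values (a few months in several million) come from rounding in the original slides.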

If you have a central service and want to deliver 1 EB, you would need about 6.5 Tbps peak bandwidth, and 70,000 servers requiring about 60-70 megawatts in total. At a price of $20 per Mbps, it would cost about $130 million to run per month!
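The monthly cost is simply peak bandwidth multiplied by the unit price. As a rough check, the sketch below assumes the peak is 6.5 terabits per second and the price is $20 per megabit per second, which is the reading under which the quoted $130 million works out. The per-server power figure is derived from the numbers above and is not something Eric quoted.

```python
# Rough check of the central-service cost (assumed units: peak in Tbps,
# price in US dollars per Mbps per month).
PEAK_TBPS = 6.5
PRICE_PER_MBPS_USD = 20
SERVERS = 70_000
POWER_RANGE_MW = (60, 70)


def monthly_cost_usd(peak_tbps, price_per_mbps):
    """Monthly bandwidth bill for a given peak rate."""
    peak_mbps = peak_tbps * 1_000_000  # 1 Tbps = 1,000,000 Mbps
    return peak_mbps * price_per_mbps


def watts_per_server(total_megawatts, servers):
    """Average power draw per server, in watts."""
    return total_megawatts * 1_000_000 / servers


cost = monthly_cost_usd(PEAK_TBPS, PRICE_PER_MBPS_USD)
low_w, high_w = (watts_per_server(mw, SERVERS) for mw in POWER_RANGE_MW)

print(f"Monthly bandwidth cost: ${cost:,.0f}")
print(f"Power per server: {low_w:.0f} to {high_w:.0f} W")
```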

The “Web 2.0” way is to use peers to deliver that exabyte. However, not every business is ready to be governed by their userbase entirely. There is an opportunity to take a hybrid model approach. BitTorrent are a content-delivery network that can enable Internet-based businesses to use “the power of participation”. 55 major studios and 10,000 titles are now available via BitTorrent.com (using BitTorrent DNA). Also, the BitTorrent SDK allows BT capability to be added to any consumer electronic device.

He then talked about the Web 2.0 nature of distributed computing, and how we can power something that wouldn’t or couldn’t be powered otherwise. For example, Electric Sheep is a distributed computing application that renders a single frame on your machine for a 30-second long screensaver, which you can then use. Social networks also have a lot of machines, but the best example of distributed computing is search. Google has an estimated 500k to 1M servers, corresponding to $4.5B in cumulative capex (that’s capital expenditure to you and me) or 21% of their Q2 net earnings (according to Morgan Stanley). And yet, search is still not a great experience today, since you still have a hard time finding what you want. Search engines aren’t contextual, they don’t see the whole Internet (the “dark web”), they aren’t particularly well personalised or localised, and they aren’t dynamic enough (i.e., they cannot keep up with most Web 2.0 applications [although I’ve noticed that Google is reflecting new posts from my blog quite quickly]).

The best applications involve user participation, with users contributing to all aspects of the application (including infrastructure). Developers need to consider how users can do this (through contributed content, code or computing power). As Eric said, “harness the power of participation, and multiply your ability to deliver a rich and powerful application.”

Web 2.0 Expo Tokyo: Håkon Wium Lie – “The best Web 2.0 experience on any device”

There was a talk at the Web 2.0 Expo Tokyo last Friday afternoon by Håkon Wium Lie, chief technical officer with Opera Software. He has been working on the Web since the early nineties, and is well known for his foundational work on CSS. Opera is headquartered (and Håkon is based) in Norway.

Håkon (pronounced “how come”) started by talking about the Opera browser. Opera has browsers for the desktop, for mobiles and for other devices (e.g., the Nintendo Wii and the OLPC $100 laptop). He thinks that the OLPC machine will be very important (he also brought one along to show us, pictured), and that the browser will be the most important application on this device.

Another product that Opera are very proud of is Opera Mini, which is a small (100k) Java-based browser. Processing of pages takes place via proxy on a fixed network machine, and then a compressed page is sent to Opera Mini.

He then talked about new media types on the Web. Håkon said that video needs to be made into a “first-class citizen” on the Web. At the moment, it takes a lot of “black magic” and third-party plugins and object tags before you can get video to work in the browser for users. There are two problems that need to be solved. In relation to the first problem – how videos are represented in markup – Opera proposed that the <video> element be added to the HTML5 specification. The second problem is in relation to a common video format. The Web needs a baseline format that is based on an open standard. Håkon stated that there is a good candidate in Ogg Theora, which is free of licensing fees, and in HTML5 there may be a soft requirement or recommendation to use this format. He showed some nice mockups of Wikipedia pages with embedded Ogg videos. You can also combine SVG effects (overlays, reflections, filters, etc.) with these video elements.
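To give a feel for what the proposed markup looks like, here is a minimal sketch that builds a video element with an Ogg source and fallback text for browsers that do not understand the element. It is written as a small Python helper only to keep the examples on this blog in one language; the function name and the file name are made up for illustration, and the exact attributes were still under discussion at the time of the talk.

```python
from html import escape


def video_markup(src, fallback_text, mime_type="video/ogg"):
    """Return markup for a video element with one source and fallback text.

    Browsers that support the element play the source; others show the
    fallback text instead.
    """
    return (
        "<video controls>\n"
        f'  <source src="{escape(src, quote=True)}" '
        f'type="{escape(mime_type, quote=True)}">\n'
        f"  {escape(fallback_text)}\n"
        "</video>"
    )


# Hypothetical file name, used purely for illustration.
markup = video_markup("example.ogv", "Your browser does not support video.")
print(markup)
```

The point is that the page author writes one element and no plugin-specific object tags.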

He then talked about the HTML5 specification: the WHAT Working Group (WHATWG) was set up in 2004 to maintain HTML, and a W3C HTML working group was also established earlier this year. HTML5 will include new parsing rules, new media elements, and some semantic elements (section, article, nav, aside), and some presentational elements will be removed (center, font).

Håkon next described how CSS is also evolving. As an example, he showed us some nice screenshots from the CSS Zen Garden, which takes a boring document and asks people to apply their stylesheets to change the look. Most of them use some background images to stylize the document (rather than changing the fonts dramatically).

CSS has a number of properties to handle fonts and text on the Web. Browsers have around ten fonts that can be viewed on most platforms (i.e., Microsoft’s core free fonts). But there are a lot more fonts out there: for example, there are 2500 font families available on Font Freak. Håkon says that he wants to see more browsers being able to easily point to and use these interesting fonts. In CSS2, you can import a library of fonts, and he reiterated his hope that fonts residing on the Web will be used more in the future.

Another use for CSS3 is in professional printing. Using the Prince tool, Håkon has co-written a book on CSS using CSS3. CSS3 can allow printing requirements to be specified such as multiple columns, footnotes, leaders, etc.

He then talked about the Acid2 test. Acid2 consists of a single web page, and if a browser renders it correctly, it should show a smiley face. Every element is positioned by some CSS or HTML code with some PNGs. Unfortunately, Internet Explorer performs worst in this test. But I also tested out Firefox 2 and got something distorted that looked like this.

The last thing he talked about was 3D. He gave a nice demo of Opera with some JavaScript that interfaces with the OpenGL engine to render a PNG onto a cube and rotates it. He also showed a 3D snake game from Opera (only a hundred or two lines of code), which is available at labs.opera.com.

I really enjoyed the forward-looking nature of Håkon’s presentation, and said hello briefly afterwards to say thanks for Opera Software’s (via Chaals and Kjetil) involvement in our recent SIOC member submission to the W3C.

Web 2.0 Expo Tokyo: Joe Keller – “Understanding and applying the value of enterprise mashups to your business”

(Another delayed report from a talk last Friday at the Web 2.0 Expo.)

Joe Keller is the marketing officer with Kapow, so I was expecting a marketing talk, but there was a nice amount of technical content to keep most happy. Joe was talking about “getting business value from enterprise mashups”. Kapow started off life as a real-estate marketplace in Europe ten years ago, but moved towards its current focus of mashups after 2002. Referencing Rod Smith, whom I saw last year at BlogTalk 2006, mashups allow content to be generated from a combination of rich interactive applications, do-it-yourself applications plus the current scripting renaissance.

According to McKinsey, productivity gains through task automation have peaked, and the next productivity wave will be data-oriented as opposed to task-oriented. Joe says that Web 2.0 technologies are a key to unlocking this productivity. He also talked about two project types: systematic projects are for conservative reliability, whereas opportunistic projects (or “situational applications” to use the IBM terminology) are for competitive agility. Mashups fit into the latter area.

The term mashup can apply to composite applications, gadgets, management dashboards, ad hoc reporting, spreadsheets, data migration, social software and content aggregation. The components of a mashup are the presentation layer, logic layer, and the data layer (access to fundamental or value-added data). In this space, companies are either operating as mashup builders or mashup infrastructure players like Kapow.

The main value of mashups is in combining data. For example, HousingMaps, the mashup of Google Maps and data from Craigslist, was one of the first interesting mashups. The challenge is that mashups are normally applied to everyone’s data, but if you’re looking for a house, you may want to filter by things like school district ratings, fault lines, places of worship, or even by proximity to members of your LinkedIn / MySpace network, etc.
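As a toy illustration of combining and filtering data in this way, the sketch below joins a list of house listings with a separate table of school district ratings and keeps only the listings in well-rated districts. All of the names and data here are invented for the example; a real mashup would pull both sides from live services.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    address: str
    district: str
    price: int


def mash_up(listings, district_ratings, min_rating):
    """Join listings with district ratings and filter by a minimum rating.

    Listings whose district has no known rating are dropped.
    """
    results = []
    for listing in listings:
        rating = district_ratings.get(listing.district)
        if rating is not None and rating >= min_rating:
            results.append((listing, rating))
    return results


# Invented sample data standing in for two independent sources.
listings = [
    Listing("1 Main St", "north", 300_000),
    Listing("2 Oak Ave", "south", 250_000),
    Listing("3 Elm Rd", "unknown", 200_000),
]
district_ratings = {"north": 8, "south": 5}

for listing, rating in mash_up(listings, district_ratings, min_rating=7):
    print(listing.address, listing.price, rating)
```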

He then listed some classes of mashup data sources. In fundamental data, there’s structured data, standard feeds, data that can be subscribed to, basically stuff that’s open to everyone. The value-added data is more niche: unstructured data, individualised data, vertical data, etc. The appetite for data collection is growing, especially around the area of automation to help organisations with this task. The amount of user-generated content (UGC) available is a goldmine of information for companies, enabling them to create more meaningful time series that can be mashed up quickly into applications. According to ProgrammableWeb, there are now something like 400 to 500 mashup APIs available, but there are 140 million websites according to Netcraft, so there is a mismatch in terms of the number of services available to sites.

Kapow aims to turn data into business value, “the right data to the right people at the right time.” Their reputation management application allows companies to find out what is being said about a particular company through blogs, for sentiment analysis. They also provide services for competitive intelligence, i.e., how do you understand the pricing of your competitors in an automated fashion. Asymmetric intelligence is another service they provide for when people are looking for a single piece of information that one person has and no-one else possesses. Business automation is where mashups are being used to automate internal processes, e.g., to counteract the time wasted by “swivel-chair integration” where someone is moving from one browser on one computer to another and back again to do something manually. Finally, opportunistic applications include efforts whereby companies are aiming to make users part of their IT “team”, i.e., by allowing users to have access to data and bringing this into business processes: Web 2.0 infrastructure allows companies to use collective wisdom using Kapow technologies.

About RSS, Joe said that almost every executive in every corporation is starting to mandate what feeds they want their company to provide (and RSS feeds are growing as quickly as user-generated content in blogs, wikis, etc.). Kapow’s applications allow you to create custom RSS feeds, and he also gave a short demo of using Kapow to build an on-the-fly REST service. His service produced the quote for a company’s stock price by extracting identified content from an area of a web page, which could then be incorporated into other applications like an Excel spreadsheet. I asked Joe if it is difficult to educate end users about REST and RSS. He pointed to the ease with which most people can add feeds to iGoogle and said that it’s becoming easier to explain this stuff to people.
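The idea behind the demo (pick out one identified area of a page and hand it back as structured data) can be sketched in a few lines. This is not Kapow’s technology or API, just a minimal illustration using Python’s standard library; the element id, the ticker symbol and the sample page are all hypothetical, and a real service would fetch the page over HTTP and expose the result at a URL.

```python
import json
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementTextExtractor(HTMLParser):
    """Collects the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def quote_as_json(page_html, element_id, symbol):
    """Extract the text of one page element and return it as a JSON string."""
    parser = ElementTextExtractor(element_id)
    parser.feed(page_html)
    text = "".join(parser.chunks).strip()
    return json.dumps({"symbol": symbol, "quote": text})


# Hypothetical page fragment and element id, for illustration only.
page = (
    '<html><body><div id="nav">Home</div>'
    '<span id="last-price"> <b>12.34</b> </span>'
    "<p>Other content</p></body></html>"
)
print(quote_as_json(page, "last-price", "ACME"))
```

A spreadsheet or another application could then consume the JSON instead of scraping the page itself.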

Kapow’s server family allows portal creation, data collection (internal and external), and content migration via mashups, which Joe reckons are often more useful than static migration scripts since they can be customised and controlled. Kapow also provide a free “openkapow” API and site for developers to share how they build mashups and feeds.

In summary, Joe gave these takeaways:

  • The next business productivity wave will be via data and know-how automation, not routine task automation.
  • Knowledge workers need self-service mashup technology to take advantage of this.
  • Access to critical (value-added) data can create a competitive edge.
  • Web 2.0 technologies complement existing IT systems to maintain the competitive edge.