After the Mashed Museum Day at Leicester, I was inspired to try tagging the Scheme’s blogs via the OpenCalais service provided by Reuters (incidentally, the Chairman of our Trustees is rather high up there). This was done with two plugins created by Dan Grossman: an archive tagger and a plugin that auto-suggests tags.
Both of these plugins need curl to be working on your server, so I spent yesterday getting that activated and I’m now making more use of that for other scripts.
I first used the archive tagger to see what sort of results the tagging came up with, and the results can be found in the attached Excel document.
OpenCalais terms entered.
From 400+ posts, c.1300 tags were established and inserted into the blog database. These didn’t digress too greatly from the content included; however, there are a number of useless tags, e.g. phone numbers (18 tags). The system seems relatively good at pulling out personal names, but it does sometimes fragment them and tag posts with first name, full name and surname. The same can be said of some longer quango names; for example, Museums, Libraries and Archives Council gets broken into pieces. Recognition of department names was automatic, as shown below:
- Department for Medieval and Later Antiquities
- Department of Archaeology
- Department of Asia
- Department of Classical and Archaeological Studies
- Department of Classics
- Department of Conservation
- Department of Culture
- Department of Museum Studies
- Department of Portable Antiquities & Treasure
- Department of Portable Antiquities and Treasure
- Department of Prehistory & Europe
- Department of Prehistory and Europe
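The kinds of noise described above (phone numbers, "&" versus "and" duplicates, and names fragmented into first name and surname) could be filtered before or after insertion with a small script. What follows is only a minimal sketch in Python, not how the plugins themselves work: the function names, the phone-number pattern and the sample tags are all hypothetical, and the fragment rule is deliberately crude, so it is best applied to the tags of one post at a time.

```python
import re

# Hypothetical pattern: a tag made only of digits, spaces and phone punctuation.
PHONE_RE = re.compile(r"^[\d\s()+.-]{7,}$")

def normalise(tag):
    """Lower-case, treat '&' as 'and', and collapse whitespace."""
    tag = tag.replace("&", " and ")
    return " ".join(tag.lower().split())

def clean_tags(tags):
    """Drop phone-number tags and merge variants that normalise to the same key."""
    kept = {}
    for tag in tags:
        if PHONE_RE.match(tag.strip()):
            continue  # looks like a phone number, not a useful tag
        kept.setdefault(normalise(tag), tag)  # first spelling seen wins
    return list(kept.values())

def drop_fragments(tags):
    """Remove single-word tags that also appear as a word in a longer tag."""
    words_in_longer = set()
    for tag in tags:
        parts = normalise(tag).split()
        if len(parts) > 1:
            words_in_longer.update(parts)
    return [
        t for t in tags
        if len(normalise(t).split()) > 1 or normalise(t) not in words_in_longer
    ]

# Made-up sample tags for a single post.
sample = [
    "Department of Prehistory & Europe",
    "Department of Prehistory and Europe",
    "020 7946 0123",
    "John Smith",
    "John",
    "Smith",
]
print(drop_fragments(clean_tags(sample)))
```

A pass like this would still need a human check afterwards, since a legitimate one-word tag can be swallowed by a longer one that happens to contain it.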
Once the tags have been automatically inserted into the database, it is easy to go through and remove the unwanted ones via the web interface. For an automatic service, I think that it performed pretty well and it is something I am now considering for the database rebuild under Zend Framework. One museum already makes use of this, and that’s Sydney’s Powerhouse, where Seb’s team always seem to be the innovators. It would be nice if others followed their lead a bit more. At the Mashed Museum day, Jim O’Donnell from the NMM did something similar with the Yahoo data extraction service, and the results can be seen on his site. I assume that we’ll see this pushed out on their main site pretty quickly.
The second plugin, for auto-suggestion of tags, also works pretty well and suggested sensible tags. I didn’t have to reject any, and it also sped up the production process. This seems a valuable service, and you’ll see the tags separated by bullet points below the posts. As Steve showed at the Mash day, you can link these tags into clouds or run automatic searches on Flickr from the source word. Quite a few possibilities.
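As a rough illustration of the tag cloud and Flickr ideas, the Python sketch below scales font sizes by how often each tag is used and builds a search link from a tag. It is not what Steve demonstrated: the function names, pixel sizes and sample tags are hypothetical, and the Flickr URL pattern is an assumption based on the public search page rather than a documented API.

```python
from collections import Counter
from urllib.parse import quote_plus

def cloud_sizes(tags_per_post, min_px=12, max_px=28):
    """Map each tag to a font size scaled linearly by how many posts use it."""
    counts = Counter(tag for tags in tags_per_post for tag in tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_px + (n - lo) * (max_px - min_px) / span)
        for tag, n in counts.items()
    }

def flickr_search_url(tag):
    """Build a search link for a tag (URL pattern is an assumption)."""
    return "https://www.flickr.com/search/?text=" + quote_plus(tag)

# Made-up tags for four posts.
posts = [
    ["Roman coin", "Treasure"],
    ["Treasure"],
    ["Treasure", "Roman coin"],
    ["Hoard"],
]
for tag, size in cloud_sizes(posts).items():
    print(tag, size, flickr_search_url(tag))
```

Linear scaling is the simplest choice; with a few very common tags, a logarithmic scale would probably give a more readable cloud.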

