The Portable Antiquities Scheme joins Pelagios

Hacking Pelagios RDF in the ISAW library, June 2012

Earlier in 2012, the excellent Linked Ancient World Data Institute was held in New York at the Institute for the Study of the Ancient World (ISAW). During this symposium, Leif and Elton convinced many participants that they should contribute their data to the Pelagios project, and I was one of them.

I work for a project based at the British Museum called the Portable Antiquities Scheme, which encourages members of the public within England and Wales to voluntarily record objects that they discover whilst pursuing their hobbies (such as metal-detecting or gardening). The centrepiece of this project is a publicly accessible database, which has been on-line in various guises for over 13 years; the latest version is now in a position to produce interoperable data much more easily than previously.

Image of the finds.org.uk database
The Portable Antiquities Scheme database

Within the database that I have designed and built (using Zend Framework, jQuery, Solr and Twitter Bootstrap), we now hold records for over 812,000 objects, with a high proportion of these being Roman coin records (175,000+ at the time of writing, some with more than 1 coin per record). Many of these coins have mints attached (over 51,000 are available to all access levels on our database, with a further 30,000 or so held back due to our workflow model). Aligning these mints with Pleiades place identifiers was straightforward due to the limited number of places involved, and needed only the addition of columns to our database. Where possible, these mints have also been assigned identifiers from Nomisma, Geonames and Yahoo!’s WOEID system (although that might be on the way out with the recent BOSS news). However, some mints I haven’t been able to assign, for instance ‘mint moving with Republican issuer’ or the ‘C’ mint, which has an unknown location.

Once these identifiers were assigned in the database, it was easy to create RDF for use by the Pelagios project, and it also let us use their widgets to enhance our site further. To create the RDF for ingestion by Pelagios, a cron job sends a cURL request to our Solr search index every Sunday night; the XML that Solr returns is transformed by XSLT and saved to our server, and s3sync then sends the dump to Amazon S3 (where we have incremental snapshots). These data grow at the rate of around 100 to 200 coins a week, depending on staff time, knowledge and whether the state of the coin allows one to attribute a mint (around 45% of the time). The PAS database also has the facility for error reporting and commenting on records, so if you use the attributions provided through Pelagios and find a mistake, do tell us!

At some point in the future, I plan to try and match data extracted from natural language processing (using Yahoo geo tools and OpenCalais) against Pleiades identifiers and attempt to make more annotations available to researchers and Pelagios.

For example, this object WMID-3FE965, the Staffordshire Moorlands patera or trulla (shown below):

Has the following inscription with place names:

This is a list of four forts located at the western end of Hadrian’s Wall: Bowness (MAIS), Drumburgh (COGGABATA), Stanwix (UXELODUNUM) and Castlesteads (CAMMOGLANNA). It incorporates the name of an individual, AELIUS DRACO, and a further place-name, RIGOREVALI. The forts can be given Pleiades identifiers as follows:

  1. Bowness: 89239
  2. Drumburgh: 89151
  3. Stanwix: 967060430
  4. Castlesteads: 89133
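
As a minimal sketch of how identifiers like these become links, assuming the usual Pleiades URI pattern of https://pleiades.stoa.org/places/ followed by the identifier (the function name pleiadesUri is made up for illustration and is not part of our codebase):

```php
<?php
// Hypothetical helper: turn a Pleiades place identifier into its URI.
// Assumes the URI pattern https://pleiades.stoa.org/places/{id}.
function pleiadesUri($id)
{
    if (!ctype_digit((string) $id)) {
        throw new InvalidArgumentException('Pleiades identifiers are numeric');
    }
    return 'https://pleiades.stoa.org/places/' . $id;
}

// The four forts named on the patera, keyed by modern place name.
$forts = array(
    'Bowness' => '89239',
    'Drumburgh' => '89151',
    'Stanwix' => '967060430',
    'Castlesteads' => '89133',
);

// array_map keeps the string keys when given a single array.
$links = array_map('pleiadesUri', $forts);
```

Keeping only the bare identifier in the database column and building the URI at output time means a change of URI scheme would only need one edit.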

Integrating the Pelagios widget and awld.js

Using Pleiades and Nomisma identifiers allows the PAS database to enrich records further via the use of RDFa in view scripts and by the incorporation of the Pelagios widget and the ISAW JavaScript library on a variety of pages. For example, the screenshot below gives a view of a gold aureus of Nero recorded in the North East of England with the Pelagios widget activated:

The Pelagios widget embedded on a coin record:
DUR-B4E094 

The JavaScript library by Nick Rabinowitz and Sebastian Heath also allows for enriched web pages; this page for Nero shows the library in action:

 

These emperor pages also pull in various resources from third party websites (such as Adrian Murdoch’s excellent talking head video biographies of Roman emperors), data from dbpedia, nomisma, viaf and the site’s internal search engine. The same approach is also used, but in a more pared down way for all other issuer periods on our website, for example: Cnut the Great.


Integrating Johan’s map tiles

Following on from Johan’s posting on the magnificent set of map tiles that he’s produced for the Pelagios project (and as seen in use over at the Pleiades site and OCRE), I’ve now integrated these into our mapping system. I’ve done it slightly differently to the examples that Johan gave; due to the volume of traffic that we serve up, it wasn’t fair to saddle the Pelagios team with extra bandwidth. Therefore, Johan provided zipped downloads of the map tiles and I store these on our server (if you’re a low traffic site, feel free to use our tile store):
Imperium map layer, with parish boundary. Zoom level 10.

The map zoom has been set to the level (10 for Great Britain) at which we decided site security was ensured for the discovery points (although Johan has made tiles available to level 11). This complements the other layers we use:

  • Open Street Map
  • terrain
  • satellite
  • soil map
  • Stamen map watercolor
  • Stamen map toner
  • NLS historic OS maps

Each find spot is also reverse geocoded to produce a WOEID and a Geonames identifier, its elevation is obtained, and we subsequently link to Aaron Straup Cope’s excellent woedb for further enhancement of place data. We also serve up boundaries derived from the Ordnance Survey Opendata BoundaryLine dataset, split from shapefiles and converted to KML by ogr2ogr scripts. The incorporation of this layer allows researchers (over 300 projects currently use our data) to interpret the results that they get from searches on our database against the road network and settlement data much more easily, and it has already gathered many positive comments from our staff and research colleagues.

By contributing to the Pelagios project, we hope that people will find our resources more easily and that we in turn can promote the efforts of all the fantastic projects that have been involved in this programme. What we’ve managed to implement from joining the Pelagios project already outweighs the time spent coding the changes to our system. If you run a database or website with ancient world references, you should join too!

 

CASPAR seminar series

The Centre for Audio-Visual studies and practice in Archaeology is holding an inaugural series of seminars at the Institute of Archaeology, UCL, 31-34 Gordon Square, on Monday afternoons, starting this coming week. The programme is quite varied and the following speakers are booked:

10 Jan Broadcast archaeology Michael Wood (Story of England, BBC) & Ray Sutcliffe (Chronicle)

17 Jan Producing archaeology on TV Charles Furneaux (Kaboom Film and Television)

24 Jan Archaeology and radio Ben Roberts (The British Museum)

31 Jan Using digital technology to visualise the past Tom Goskar (Wessex Archaeology) and Stuart Eve (UCL)

7 Feb The Google ancient places project Leif Isaksen (University of Southampton)

21 Feb Archaeology, television and the public Tim Schadla-Hall & Chiara Bonacchi (UCL)

28 Feb Developing digital communities Andy Bevan and Lorna Richardson (UCL)

7 Mar The Portable Antiquities Scheme Dan Pett (The British Museum)

14 Mar Archaeology, videogames and the public Andrew Gardner (UCL)

21 Mar Where do we go from here Don Henson (Honorary Director of CASPAR)

Enquiries to: Tim Schadla-Hall t.schadla-hall@ucl.ac.uk or Chiara Bonacchi chiara.bonacchi@gmail.

All seminars in room 612 and everyone is welcome. A drinks reception follows each seminar.

Leveraging geodata for enriched records

This post discusses how I’ve been using various geodata tools (principally Yahoo!, but also Flickr shapefiles, Google’s maps and geocoder apis, Geonames, OSdata and I’m now exploring the Unlock project from Edina to see what they can offer as well), for the enrichment of our database. I started writing this post back in May, but as I’ve just spoken at the W3G unconference in Stratford-on-Avon, I thought I’d finish it and get it out. Gary Gale and his helpers produced a very good un-conference, at which I met some very interesting people (shame TW Bell couldn’t come!) and saw some good examples of what other people are up to.

My presentation from that conference is embedded here:


Most of us realise the power of maps and I’ve made them a very central cog of the new Scheme website that we soft launched at the end of March 2010. Hopefully this isn’t too long and boring and has some technical stuff that may be of some use to others. As always, the below is CC-NC-SA.

A map showing all finds recorded by the Scheme

At the Scheme, we’ve been collecting data on the provenance of archaeological discoveries made by the public and publishing it online for 13 years now (much longer than I’ve worked for the Scheme!). These data are collated on our database and provide the basis for spatial interrogation of where and when these objects have been discovered. Many researchers are using the database for a variety of geomatics, for example patterning, cluster analysis etc. A few of our recent AHRC funded PhD candidates have been implementing GIS techniques as one of the integral parts of their research, for example:

  1. Tom Brindle, KCL
  2. Philippa Walton, UCL
  3. Ian Leins, Newcastle University
  4. Katie Robbins, Southampton University

They are all, incidentally, alumni of the Scheme (and one has rejoined recently!), and their research has been inspired by working on the data that we collate. I won’t be discussing the philosophical arguments about provenance and its meaning within this article; instead I’ll demonstrate how we’re using third-party tools to enhance find spot data on our site and talk about some of the problems we face in making use of them. Parts of the post have code examples, but you can gloss over them if you’re not into that (which probably applies to the majority of our readers!)

Find spot data

Our Finds Liaison Officers record the majority of the objects that we are shown onto our database and ask the finder to provide us with the most accurate National Grid Reference (OSGB36) that they can produce. Many of our finders are now using GPS units to produce grid references (we’re aware of the degree of accuracy/precision they provide, but as most objects aren’t from secure archaeological contexts, the variance won’t affect work that much). We encourage people to provide grid references of 8 figures and above, and the proportion that do so is growing every year. The list below shows the precision of each grid reference length:

  • 0 figure [SP] = 100 kilometre square
  • 2 figure [SP11] = 10 kilometre square
  • 4 figure [SP1212] = 1 kilometre square
  • 6 figure [SP123123] = 100 metre square

Then we get the figures that are actually of some archaeological use:

  • 8 figure [SP12341234] = 10 metre square
  • 10 figure [SP1234512345] = 1 metre square
  • 12 figure [SP123456123456] = 10 centimetre square
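
The relationship in the lists above is simple: each extra pair of figures divides the side of the square by ten. A minimal sketch of that calculation follows (the function name ngrPrecisionMetres is hypothetical, not one from our codebase):

```php
<?php
// Hypothetical helper: side length, in metres, of the square described by
// a National Grid Reference such as "SP1212".
function ngrPrecisionMetres($ngr)
{
    $ngr = strtoupper(str_replace(' ', '', $ngr));
    // Two letters followed by an even number of digits (possibly none).
    if (!preg_match('/^[A-Z]{2}(\d*)$/', $ngr, $m) || strlen($m[1]) % 2 !== 0) {
        throw new InvalidArgumentException('Not a valid grid reference: ' . $ngr);
    }
    $figures = strlen($m[1]);
    // The bare letters give a 100 km square; each pair of figures
    // divides that by ten.
    return 100000 / pow(10, $figures / 2);
}
```

For example, ngrPrecisionMetres('SP1212') gives 1000, the 1 kilometre square of a four figure reference.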

This find spot data is given to us in confidence by the finders and landowners and we therefore have to protect this confidence. We have an agreement with the National Council for Metal Detecting, the representative body of the main providers of our data (the metal detecting community), that we won’t publish find spots on-line at a precision higher than a 4 figure National Grid Reference or parish level. These grid references can be obscured from public view completely by asking the Finds Liaison Officer to enter a “to be known as” alias on the find spot form at the time of recording (or subsequently).
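
Reducing a reference to the four figure form for public display can be sketched as below. This is an illustration under my own assumptions rather than the code we run, and the function name toFourFigure is hypothetical:

```php
<?php
// Hypothetical helper: reduce a grid reference to the four figure
// (1 kilometre square) form used for public display.
function toFourFigure($ngr)
{
    $ngr = strtoupper(str_replace(' ', '', $ngr));
    if (!preg_match('/^([A-Z]{2})(\d*)$/', $ngr, $m) || strlen($m[2]) % 2 !== 0) {
        throw new InvalidArgumentException('Not a valid grid reference: ' . $ngr);
    }
    $half = intdiv(strlen($m[2]), 2);
    if ($half < 2) {
        // Already coarser than four figures, so there is nothing to hide.
        return $ngr;
    }
    $easting = substr($m[2], 0, $half);
    $northing = substr($m[2], $half);
    // Truncate rather than round, so the point stays inside its true square.
    return $m[1] . substr($easting, 0, 2) . substr($northing, 0, 2);
}
```

Truncating matters here: rounding the easting or northing could move a find into the neighbouring kilometre square.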

Converting OSGB36 grid references to Latitude and Longitude pairs

Most of the web mapping programs out there make use of Latitude and Longitude pairs for displaying point data on their mapping interfaces. Therefore, we now convert all our NGRs to LatLng on our database and these are stored as floats in two columns in our find spots table. Whilst processing these grid references to the LatLng pairs, I also do some further manipulations to produce and insert into our spatial data table:

  1. Grid reference length
  2. Accuracy of grid reference as shown above
  3. Four figure grid reference
  4. 1:25k map reference
  5. 1:10k map reference
  6. Findspot elevation
  7. Where on Earth ID

The PHP functions to do this are based around some written by the original Oxford ArchDigital team, with some additions by me. There are also publicly available code examples by Barry Hunter and Jeff Stott, and other versions out there on the web! My code is used as either a service or view helper in my Zend Framework project and bundles together a variety of functions. I’m not a developer, I’ve just taught myself bits and pieces to get the PAS website back on the road; if you see errors, do let me know or suggest ways to improve the code.
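
To give a flavour of the first step, here is a minimal sketch (not the ArchDigital-derived code itself) that turns a grid reference into a numeric easting and northing in metres. It uses the standard lettering of the National Grid, where the letter I is skipped. The function name ngrToEastingNorthing is hypothetical, and the datum shift from OSGB36 to WGS84 latitude and longitude is a separate, longer calculation that isn't shown.

```php
<?php
// Sketch only: convert an OSGB36 grid reference to the easting and northing
// (in metres) of the south-west corner of the square it describes.
function ngrToEastingNorthing($ngr)
{
    $ngr = strtoupper(str_replace(' ', '', $ngr));
    if (!preg_match('/^([A-HJ-Z])([A-HJ-Z])(\d*)$/', $ngr, $m) || strlen($m[3]) % 2 !== 0) {
        throw new InvalidArgumentException('Not a valid grid reference: ' . $ngr);
    }
    // Letter positions in an alphabet that omits "I".
    $l1 = ord($m[1]) - ord('A');
    $l2 = ord($m[2]) - ord('A');
    if ($l1 > 7) {
        $l1--;
    }
    if ($l2 > 7) {
        $l2--;
    }
    // Column and row of the 100 km square, counted from the false origin.
    $e100km = (($l1 - 2) % 5) * 5 + ($l2 % 5);
    $n100km = (19 - intdiv($l1, 5) * 5) - intdiv($l2, 5);
    if ($e100km < 0 || $e100km > 6 || $n100km < 0 || $n100km > 12) {
        throw new InvalidArgumentException('Outside the National Grid: ' . $ngr);
    }
    $half = intdiv(strlen($m[3]), 2);
    // Right-pad each half to five digits, so "12" means 12,000 metres.
    $easting = (int) str_pad(substr($m[3], 0, $half), 5, '0');
    $northing = (int) str_pad(substr($m[3], $half), 5, '0');
    return array(
        'easting' => $e100km * 100000 + $easting,
        'northing' => $n100km * 100000 + $northing,
    );
}
```

The result is the corner of the square, not its centre, so a four figure reference such as SP1212 comes back as the south-west corner of that kilometre square.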

Using Yahoo! to geo-enrich our data

Several years ago, the great Tyler Bell (formerly of Oxford ArchDigital and Yahoo!) gave a paper at an archaeological computing conference at UCL’s Institute of Archaeology, where he broke his joke about XML being like high school sex (won’t elaborate on this, ask him) whilst some toe-rags were trying to steal my push bike (they came off badly as I was standing behind them!) Tyler’s paper gave me much food for thought, and it is over the last year or so that this idea has really come to fruition with our data. The advent of Yahoo!’s suite of geoPlanet tools has allowed us to do various things to our data set and present it in different ways. Below, I’ll show you some of the things their powerful suite of tools has allowed us to do.

Putting dots on maps for finds with just a parish name

Prior to 2003, we received the majority of our finds with a very vague find spot, often just to parish level. As everyone loves maps and would like to see where these finds came from, I wanted to get a map on every page that needed one. Previously, our FLOs would be asked to centre a find on the parish for these find spots; this is now a waste of time when you can use geoPlanet to get a latitude and longitude, postcode, type of settlement, bounding box and a WOEID to enhance the data that we hold. Doing this is pretty simple with the aid of YQL.

I first heard about YQL from Jim O’Donnell (formerly NMM’s web wizard) and then more at the Yahoo! Hack day in London, when I pedalled round London with Andrew Larcombe on the Purple Pedals bikes. Yahoo! describe YQL as SELECT * FROM Internet, which is indeed pretty true. Building opentables to use with their system is pretty easy – I’ll write more about some Museum tables in another post soon. All my geo extraction is performed using YQL and the examples below show how. All of these are done with the public endpoint. If you run a high traffic site, it is definitely worth changing your code to use OAuth and authenticate your YQL calls for the non-public endpoint (better rate limits etc). It is slightly tricky and you do need to work out how to refresh your Yahoo! token, but it is worth the effort.

For example, I grew up in Stapleford, Cambridgeshire and you can search for that with the following YQL call:

[HTML]select * from geo.places where text="stapleford,cambridgeshire"[/html]

Which maps to this REST URL of:
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20geo.places%20where%20text%3D%22stapleford%2Ccambridgeshire%22
producing an XML or JSON response like below (diagnostics omitted):

[XML toolbar="true"]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="1" yahoo:created="2010-05-05T11:12:49Z" yahoo:lang="en-US">
<results>
<place xmlns="http://where.yahooapis.com/v1/schema.rng" xml:lang="en-US" yahoo:URI="http://where.yahooapis.com/v1/place/35984">
<woeid>35984</woeid>
<placeTypeName code="7">Town</placeTypeName>
<name>Stapleford</name>
<country code="GB" type="Country">United Kingdom</country>
<admin1 code="GB-ENG" type="Country">England</admin1>
<admin2 code="GB-CAM" type="County">Cambridgeshire</admin2>
<admin3/>
<locality1 type="Town">Stapleford</locality1>
<locality2/>
<postal type="Postal Code">CB22 5</postal>
<centroid>
<latitude>52.145329</latitude>
<longitude>0.151490</longitude>
</centroid>
<boundingBox>
<southWest>
<latitude>52.127220</latitude>
<longitude>0.133460</longitude>
</southWest>
<northEast>
<latitude>52.164879</latitude>
<longitude>0.176640</longitude>
</northEast>
</boundingBox>
<areaRank>3</areaRank>
<popRank>1</popRank>
</place>
</results>
</query>
[/xml]
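
As a rough sketch of how the REST URL shown earlier can be put together in PHP (the function name yqlUrl is mine for illustration, and the format=json parameter simply asks YQL for JSON rather than XML):

```php
<?php
// Hypothetical helper: build a public-endpoint YQL URL for a statement.
function yqlUrl($statement, $format = 'json')
{
    $base = 'http://query.yahooapis.com/v1/public/yql';
    // rawurlencode() encodes spaces as %20, quotes as %22 and so on.
    return $base . '?q=' . rawurlencode($statement)
        . '&format=' . rawurlencode($format);
}

$url = yqlUrl('select * from geo.places where text="stapleford,cambridgeshire"');
```

Note that rawurlencode() also encodes the asterisk (as %2A), so the result differs slightly from the hand-written URL above while meaning the same thing.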

By parsing the XML or JSON response (I tend to use the JSON response),  a Latitude and Longitude pair can be retrieved for placing the object onto the map. It isn’t the true find spot, but can at least give a high level overview of the point of origin. Whilst doing this, I also take the postcode, woeid, bounding box etc to reuse again. Parsing data is pretty simple once you have got your response and decoded the JSON, for example:

[PHP]
// $response is the decoded JSON, e.g. $response = json_decode($json);
function parsePlace($response)
{
    $place = $response->query->results->place;
    $placeData = array();
    $placeData['woeid'] = (string) $place->woeid;
    $placeData['placeTypeName'] = (string) $place->placeTypeName->content;
    $placeData['name'] = (string) $place->name;

    // These elements can be empty (admin3 often is), so test before reading.
    $optional = array('country', 'admin1', 'admin2', 'admin3',
        'locality1', 'locality2', 'postal');
    foreach ($optional as $field) {
        if (!empty($place->$field)) {
            $placeData[$field] = (string) $place->$field->content;
        }
    }

    $placeData['latitude'] = (string) $place->centroid->latitude;
    $placeData['longitude'] = (string) $place->centroid->longitude;
    $placeData['centroid'] = array(
        'lat' => (string) $place->centroid->latitude,
        'lng' => (string) $place->centroid->longitude
    );
    $placeData['boundingBox'] = array(
        'southWest' => array(
            'lat' => (string) $place->boundingBox->southWest->latitude,
            'lng' => (string) $place->boundingBox->southWest->longitude
        ),
        'northEast' => array(
            'lat' => (string) $place->boundingBox->northEast->latitude,
            'lng' => (string) $place->boundingBox->northEast->longitude
        )
    );
    return $placeData;
}
[/php]

The image below shows an autogenerated findspot and a parish boundary (see below for flickr shapefile use) and adjacent places.

An autogenerated findspot

Within our database, I have a certainty field for where the co-ordinates originate from. This table has the following content:

  1. From a map
  2. From finder verbally
  3. GPS from the finder
  4. GPS from the FLO
  5. Centred on the parish via a paper map
  6. Recorded at a rally (so certainty could be dubious)
  7. Produced via webservice

Researchers are therefore apprised of where the findspot comes from and whether we can treat it as useful (if at all).

Getting elevation (via Geonames)

The woeid or the LatLng can be used to get elevation of the find spot. This can be achieved by a combination of reverse geocoding against Flickr place names (for woeid) and the Geonames API call for ‘Elevation – Aster Global Digital Elevation Model’. So for example, I want to get the elevation for the centre of Stapleford. You can query the geonames API with the following YQL:

[HTML wraplines="true"]select * from json where url="http://ws.geonames.org/astergdemJSON?lat=52.145329&lng=0.151490";[/html]

Which when executed produces this response:

[XML]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2010-09-30T12:26:00Z" yahoo:lang="en-US">
<results>
<json>
<astergdem>17</astergdem>
<lng>0.15149</lng>
<lat>52.145329</lat>
</json>
</results>
</query>
[/xml]

So I now have the elevation of 17 metres above sea level. Great! I’ve been experimenting a bit with this against some findspots that we know the elevation for. For one high profile object, the Crosby Garrett Helmet, the GPS elevation and the Geonames sourced one differed by just 1 metre.
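
Reading the value out in PHP might look something like the sketch below. It assumes the JSON flavour of the YQL response mirrors the XML above (query, results, json, astergdem), and the function name elevationFromResponse is hypothetical. Geonames uses -9999 to mean that the ASTER model has no data for a point, so that value is treated as missing.

```php
<?php
// Hypothetical helper: read the elevation (in metres) out of a YQL JSON
// response. Returns null when no usable elevation is present.
function elevationFromResponse($json)
{
    $data = json_decode($json);
    if (!isset($data->query->results->json->astergdem)) {
        return null;
    }
    $elevation = (int) $data->query->results->json->astergdem;
    // -9999 is the "no data" marker (for example, points at sea).
    return $elevation === -9999 ? null : $elevation;
}

// Assumed shape of the JSON response, modelled on the XML shown above.
$sample = '{"query":{"count":1,"results":{"json":'
    . '{"astergdem":"17","lng":"0.15149","lat":"52.145329"}}}}';
$elevation = elevationFromResponse($sample);
```

Returning null rather than -9999 stops a missing value from ending up in the database as a very deep find spot.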

By providing an elevation for each of our findspots, researchers can then do viewshed analysis; I don’t think anyone has really done this yet for the artefact distributions that we record, but I could be proved wrong!

Reverse geocoding from Latitude and Longitude with Yahoo!

At present, the GeoPlanet suite doesn’t provide this feature, but you can still manage to do this via YQL and using the following query:

[HTML]select * from flickr.places where lon={longitude} and lat={latitude}[/html]

So for example using Stapleford’s LatLng as the YQL parameters gives you:

[XML wraplines="true"]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="1" yahoo:created="2010-05-05T01:27:47Z" yahoo:lang="en-US">
<results>
<places accuracy="16" latitude="52.145329" longitude="0.151490" total="1">
<place latitude="52.145" longitude="0.151" name="Stapleford, England, United Kingdom" place_id="m2G8tyiaBJVjFQ" place_type="locality" place_type_id="7" place_url="/United+Kingdom/England/Stapleford/in-Cambridgeshire" timezone="Europe/London" woeid="35984"/>
</places>
</results>
</query>
[/xml]

You’ll notice a couple of useful things in the Flickr XML returned, for example the place_url, in this case /United+Kingdom/England/Stapleford/in-Cambridgeshire, which when appended to Flickr’s root URL for places gives you http://www.flickr.com/places/united+kingdom/england/stapleford/in-cambridgeshire, which in turn gives you access to feeds in various flavours from that page.
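
A small sketch of pulling those attributes out with SimpleXML follows. The sample is a trimmed copy of the response above, and flickrPlace is a hypothetical function name, not one from our codebase:

```php
<?php
// Trimmed copy of the flickr.places response shown above.
$sample = '<query><results><places>'
    . '<place place_url="/United+Kingdom/England/Stapleford/in-Cambridgeshire"'
    . ' woeid="35984"/>'
    . '</places></results></query>';

// Hypothetical helper: return the WOEID and the Flickr places page URL,
// or null when the response holds no place.
function flickrPlace($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false || !isset($xml->results->places->place)) {
        return null;
    }
    $place = $xml->results->places->place;
    return array(
        'woeid' => (string) $place['woeid'],
        // place_url starts with a slash, so trim it before appending.
        'url' => 'http://www.flickr.com/places/'
            . ltrim((string) $place['place_url'], '/'),
    );
}

$info = flickrPlace($sample);
```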

One of the other cool things available in Flickr’s API is placeinfo. I’d love a boundary map of how Flickr views the parish of Stapleford. As I previously obtained and gave my findspot a WOEID, I can see if Flickr has this data. So perform this YQL query:

[HTML]select * from flickr.places.info where woe_id='35984'[/html]

And execute it to obtain the following XML:

[XML wraplines="true"]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2010-09-30T12:34:50Z" yahoo:lang="en-US">
<results>
<place has_shapedata="1" latitude="52.145" longitude="0.151"
name="Stapleford, England, United Kingdom"
place_id="m2G8tyiaBJVjFQ" place_type="locality"
place_type_id="7"
place_url="/United+Kingdom/England/Stapleford/in-Cambridgeshire"
timezone="Europe/London" woeid="35984">
<locality latitude="52.145" longitude="0.151"
place_id="m2G8tyiaBJVjFQ"
place_url="/United+Kingdom/England/Stapleford/in-Cambridgeshire" woeid="35984">Stapleford, England, United Kingdom</locality>
<county latitude="52.373" longitude="0.007"
place_id="pVJUVwKYA5qQZa9wqQ"
place_url="/pVJUVwKYA5qQZa9wqQ" woeid="12602140">Cambridgeshire, England, United Kingdom</county>
<region latitude="52.883" longitude="-1.974"
place_id="pn4MsiGbBZlXeplyXg"
place_url="/United+Kingdom/England" woeid="24554868">England, United Kingdom</region>
<country latitude="54.314" longitude="-2.230"
place_id="DevLebebApj4RVbtaQ"
place_url="/United+Kingdom" woeid="23424975">United Kingdom</country>
<shapedata alpha="0.00015" count_edges="16"
count_points="44" created="1248244568" has_donuthole="0" is_donuthole="0">
<polylines>
<polyline>52.155731201172,0.17115999758244 52.158447265625,0.17576499283314 52.159084320068,0.18161700665951 52.159244537354,0.18208900094032 52.15747833252,0.18410600721836 52.153221130371,0.18645000457764 52.151500701904,0.17897999286652 52.14905166626,0.17045900225639 52.136436462402,0.15260599553585 52.135303497314,0.14247800409794 52.140232086182,0.13955999910831 52.145477294922,0.14135999977589 52.145721435547,0.14150799810886 52.145240783691,0.14707000553608 52.154125213623,0.16043299436569 52.155731201172,0.17115999758244</polyline>
</polylines>
<urls>
<shapefile>http://farm4.static.flickr.com/3483/shapefiles/35984_20090722_6d95b5e27e.tar.gz</shapefile>
</urls>
</shapedata>
</place>
</results>
</query>
[/xml]

Brilliant, the polylines can be used to draw an outline of the place on the map.
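
The polyline is just a space separated list of latitude,longitude pairs, so turning it into something a mapping library can use takes only a few lines. A minimal sketch, with parsePolyline as a hypothetical function name and a shortened polyline as the example:

```php
<?php
// Hypothetical helper: split a Flickr polyline string into lat/lng points.
function parsePolyline($polyline)
{
    $points = array();
    foreach (preg_split('/\s+/', trim($polyline)) as $pair) {
        $parts = explode(',', $pair);
        if (count($parts) !== 2) {
            continue; // skip anything that is not a "lat,lng" pair
        }
        $points[] = array(
            'lat' => (float) $parts[0],
            'lng' => (float) $parts[1],
        );
    }
    return $points;
}

// First two points of the Stapleford outline, plus the closing point.
$points = parsePolyline(
    '52.155731201172,0.17115999758244 52.158447265625,0.17576499283314 '
    . '52.155731201172,0.17115999758244'
);
```

The first and last points are the same, which is what closes the outline into a ring.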

Extracting place data from find descriptions

Y!Geo tags

Many of our objects are tied by descriptive prose to various places around the world. By using Yahoo’s Placemaker, we can now extract these place entities from the finds data and allow cross referencing of, for example, all objects that have Avon, England within their description. The image below shows you where you’ll see the tags displayed on the finds record. As I’m into Classics, you’ll notice I label these with lower case Greek letters for bullets. Probably pretentious! Getting these tags is really very straightforward with another pretty simple YQL call; for example, this text is from the record for the famous Moorlands Patera:

[HTML toolbar="true" wraplines="true"]
SELECT * FROM geo.placemaker WHERE documentContent = "Only two other vessels with inscriptions naming forts on Hadrian’s Wall are known; the ‘Rudge Cup’ which was discovered in Wiltshire in 1725 (Horsley 1732; Henig 1995) and the ‘Amiens patera’ found in Amiens in 1949 (Heurgon 1951). Between them they name seven forts, but the Staffordshire patera is the first to include Drumburgh and is the only example to name an individual. All three are likely to be souvenirs of Hadrian’s Wall, although why they include forts on the western end of the Wall only is unclear" AND documentType="text/plain"[/html]

Which then produces this output in XML:

[XML toolbar="true"]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="1" yahoo:created="2010-05-05T04:26:57Z" yahoo:lang="en-US">
<results>
<matches>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>575961</woeId>
<type>Town</type>
<name><![CDATA[Amiens, Picardie, FR]]></name>
<centroid>
<latitude>49.8947</latitude>
<longitude>2.29316</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>575961</woeIds>
<start>177</start>
<end>183</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Amiens]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>575961</woeIds>
<start>201</start>
<end>207</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Amiens]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>12602186</woeId>
<type>County</type>
<name><![CDATA[Wiltshire, England, GB]]></name>
<centroid>
<latitude>51.3241</latitude>
<longitude>-1.9257</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>12602186</woeIds>
<start>123</start>
<end>132</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Wiltshire]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>12602189</woeId>
<type>County</type>
<name><![CDATA[Staffordshire, England, GB]]></name>
<centroid>
<latitude>52.8248</latitude>
<longitude>-2.02817</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>12602189</woeIds>
<start>276</start>
<end>289</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Staffordshire]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>23509175</woeId>
<type>LandFeature</type>
<name><![CDATA[Hadrian's Wall, Bardon Mill, England, GB]]></name>
<centroid>
<latitude>54.9522</latitude>
<longitude>-2.32975</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>23509175</woeIds>
<start>418</start>
<end>432</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Hadrian’s Wall]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
</matches>
</results>
</query>
[/xml]

In the above XML response, you can now see the matches that Placemaker has found in the text sent to their service. You can now parse this data and use it for tagging or any other purpose that you want to put the data to.  YQL has the added benefit of caching at the Yahoo! end and you can do multiple queries in one call as demonstrated by Chris Heilmann in his Geoplanet explorer.
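
As a sketch of that parsing step, the snippet below collects one tag per matched place, keyed by WOEID. The place elements sit in the Placemaker namespace, so the XPath queries register a prefix for it. The function name placemakerTags and the trimmed sample are mine for illustration.

```php
<?php
// Trimmed copy of the Placemaker response shown above (two matches only).
$sample = <<<'XML'
<query><results><matches>
<match><place xmlns="http://wherein.yahooapis.com/v1/schema"><woeId>575961</woeId><type>Town</type><name><![CDATA[Amiens, Picardie, FR]]></name></place></match>
<match><place xmlns="http://wherein.yahooapis.com/v1/schema"><woeId>12602186</woeId><type>County</type><name><![CDATA[Wiltshire, England, GB]]></name></place></match>
</matches></results></query>
XML;

// Hypothetical helper: return an array of place names keyed by WOEID.
function placemakerTags($xmlString)
{
    $ns = 'http://wherein.yahooapis.com/v1/schema';
    $xml = simplexml_load_string($xmlString, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($xml === false) {
        return array();
    }
    $xml->registerXPathNamespace('w', $ns);
    $tags = array();
    foreach ($xml->xpath('//w:place') as $place) {
        // The prefix has to be registered again on each element queried.
        $place->registerXPathNamespace('w', $ns);
        $woeid = $place->xpath('w:woeId');
        $name = $place->xpath('w:name');
        if ($woeid && $name) {
            $tags[(string) $woeid[0]] = (string) $name[0];
        }
    }
    return $tags;
}

$tags = placemakerTags($sample);
```

Keying by WOEID means a place mentioned several times in the text, as Amiens is here, still produces a single tag.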

A YQL Multiquery example

For example, I want to combine a placemaker call and also get some spatial information for a find spot where I only have the placename. To do this, I write this YQL query:

[HTML toolbar="true" wraplines="true"]
select * from query.multi where queries='
select * from geo.placemaker where documentContent = "Only two other vessels with inscriptions naming forts on Hadrian’s Wall are known the Rudge Cup which was discovered in Wiltshire in 1725 (Horsley 1732, Henig 1995) and the Amiens patera found in Amiens in 1949 (Heurgon 1951). Between them they name seven
forts, but the Staffordshire patera is the first to include Drumburgh and is the only example to name an individual. All three are likely to be souvenirs of Hadrian’s Wall, although why they include forts on the western end of the Wall only is unclear" and documentType="text/plain" and appid="";
select * from geo.places where text="staffordshire moorlands, staffordshire,uk" '
[/html]

The base URL to call this is the public version – http://query.yahooapis.com/v1/public/yql and as we are using one of the community tables, the call needs to be made with &env=store://datatables.org/alltableswithkeys appended (urlencoded).
This can be run in the console and produces the following XML response – 4 place matches and the geo data for Staffordshire Moorlands.

[XML toolbar="true" wraplines="true"]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="2" yahoo:created="2010-09-30T11:44:08Z" yahoo:lang="en-US">
<results>
<results>
<matches>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>575961</woeId>
<type>Town</type>
<name><![CDATA[Amiens, Picardie, FR]]></name>
<centroid>
<latitude>49.8947</latitude>
<longitude>2.29316</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>575961</woeIds>
<start>196</start>
<end>202</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Amiens]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>12602186</woeId>
<type>County</type>
<name><![CDATA[Wiltshire, England, GB]]></name>
<centroid>
<latitude>51.3241</latitude>
<longitude>-1.9257</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>12602186</woeIds>
<start>120</start>
<end>129</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Wiltshire]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>12602189</woeId>
<type>County</type>
<name><![CDATA[Staffordshire, England, GB]]></name>
<centroid>
<latitude>52.8248</latitude>
<longitude>-2.02817</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>12602189</woeIds>
<start>271</start>
<end>284</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Staffordshire]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
<match>
<place xmlns="http://wherein.yahooapis.com/v1/schema">
<woeId>23509175</woeId>
<type>LandFeature</type>
<name><![CDATA[Hadrian's Wall, Bardon Mill, England, GB]]></name>
<centroid>
<latitude>54.9522</latitude>
<longitude>-2.32975</longitude>
</centroid>
</place>
<reference xmlns="http://wherein.yahooapis.com/v1/schema">
<woeIds>23509175</woeIds>
<start>413</start>
<end>427</end>
<isPlaintextMarker>1</isPlaintextMarker>
<text><![CDATA[Hadrian&rsquo;s Wall]]></text>
<type>plaintext</type>
<xpath><![CDATA[]]></xpath>
</reference>
</match>
</matches>
</results>
<results>
<place xmlns="http://where.yahooapis.com/v1/schema.rng"
xml:lang="en-US" yahoo:URI="http://where.yahooapis.com/v1/place/12696078">
<woeid>12696078</woeid>
<placeTypeName code="10">Local Administrative Area</placeTypeName>
<name>Staffordshire Moorlands District</name>
<country code="GB" type="Country">United Kingdom</country>
<admin1 code="GB-ENG" type="Country">England</admin1>
<admin2 code="GB-STS" type="County">Staffordshire</admin2>
<admin3/>
<locality1/>
<locality2/>
<postal/>
<centroid>
<latitude>53.071468</latitude>
<longitude>-1.993490</longitude>
</centroid>
<boundingBox>
<southWest>
<latitude>52.916691</latitude>
<longitude>-2.211330</longitude>
</southWest>
<northEast>
<latitude>53.226250</latitude>
<longitude>-1.775660</longitude>
</northEast>
</boundingBox>
<areaRank>6</areaRank>
<popRank>0</popRank>
</place>
</results>
</results>
</query>
[/xml]
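
The request described above can be sketched in JavaScript. This is a minimal sketch under stated assumptions, not the code that runs on the site: the q parameter carries the YQL statement, the example statement is only a stand-in for whatever query you run, and buildYqlUrl is a hypothetical helper name.

```javascript
// Sketch only: builds a public YQL request URL with the community tables env.
// The endpoint and env value are the ones quoted in the text; the rest is illustrative.
const YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql";
const COMMUNITY_TABLES_ENV = "store://datatables.org/alltableswithkeys";

function buildYqlUrl(query) {
  // encodeURIComponent handles the url-encoding of both values
  return YQL_PUBLIC_ENDPOINT +
    "?q=" + encodeURIComponent(query) +
    "&env=" + encodeURIComponent(COMMUNITY_TABLES_ENV);
}

// Stand-in statement, used here only to show the encoding
const url = buildYqlUrl('select * from geo.places where text="Staffordshire Moorlands"');
console.log(url);
```

Encoding the env value as well as the statement means the whole URL can be passed straight to cURL or any other HTTP client.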

Concordance with other services

One of the other things I am interested in is finding concordance between WOEID and Geonames places. This is quite easy to do using another geo table. For example, to look up Amiens, Picardie by WOEID:

[HTML]select * from geo.concordance where namespace="woeid" and text="575961"[/html]

Produces:

[XML]
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2010-09-30T12:41:28Z" yahoo:lang="en-US">
<results>
<concordance xml:lang="en-US"
xmlns="http://where.yahooapis.com/v1/schema.rng"
xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:URI="http://where.yahooapis.com/v1/concordance/woeid/575961">
<woeid>575961</woeid>
<geonames>3037854</geonames>
<locode>FRAMI</locode>
</concordance>
</results>
</query>
[/xml]

So you now have a WOEID and a geonames ID. Amiens WOEID = 575961 and Geonames = 3037854. You can use the geonames id that is produced for linked data; for example: http://ws.geonames.org/rdf?geonameId=3037854
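
The lookup above can be wrapped up as a few small helpers. This is a rough sketch: the function names are hypothetical, the query text and URLs are the ones quoted above, and the regular expression only suits a flat response like the one shown (a proper XML parser would be the safer choice for anything more complex).

```javascript
// Sketch only: helper names are hypothetical.
function concordanceQuery(woeid) {
  // YQL statement for the geo.concordance table, keyed on a WOEID
  return 'select * from geo.concordance where namespace="woeid" and text="' + woeid + '"';
}

function extractGeonamesId(xml) {
  // Pull the <geonames> value out of a concordance response; null if absent
  const match = /<geonames>(\d+)<\/geonames>/.exec(xml);
  return match ? match[1] : null;
}

function geonamesRdfUrl(geonamesId) {
  // Linked data URL in the form quoted in the text
  return "http://ws.geonames.org/rdf?geonameId=" + geonamesId;
}

// Trimmed-down copy of the response body shown above
const sample = "<concordance><woeid>575961</woeid>" +
  "<geonames>3037854</geonames><locode>FRAMI</locode></concordance>";
const geonamesId = extractGeonamesId(sample);
console.log(concordanceQuery(575961));
console.log(geonamesRdfUrl(geonamesId));
```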

Problems using YQL for geodata

Even though combining YQL with the power of Geoplanet is awesome, I did run into a few problems. None of these were really insurmountable:

  1. Hit the rate limit constantly – Google’s indexing of our site was causing our server to make too many requests to YQL; fixed by changing the caching model and switching to the OAuth endpoint. I also changed my code to ignore responses when the content type header returned was text/html;charset=UTF-8, as the rate limit page thrown up by Yahoo! is HTML and not an XML response.
  2. Some places were pulled out of text when they were irrelevant – Copper Alloy, Tamil Nadu is one example. Fixed by creating a stop list.
  3. The Geonames API sometimes takes a while to respond and made the application hang – fixed by changing the cURL settings.
  4. Took quite a long time to parse 400,000 records – can’t do much about that!
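
The first two fixes can be sketched roughly as below. This is illustrative only: the function names and the shape of the match objects are assumptions, and the stop list holds just the one example mentioned above.

```javascript
// Sketch only: names and data shapes are hypothetical.
const STOP_LIST = ["copper alloy"];

function isUsableResponse(contentType) {
  // The rate limit page comes back as HTML, so an HTML content type is ignored
  return !/^text\/html/i.test(contentType || "");
}

function filterMatches(matches) {
  // Drop matched places whose source text is on the stop list
  return matches.filter(function (match) {
    return STOP_LIST.indexOf(match.text.toLowerCase()) === -1;
  });
}

const kept = filterMatches([
  { text: "Copper Alloy", woeId: "0" }, // placeholder id, not a real WOEID
  { text: "Amiens", woeId: "575961" }
]);
console.log(isUsableResponse("text/html;charset=UTF-8"), kept.length);
```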

However, I’d really recommend using YQL to extract geodata for your application. Hopefully, Yahoo! will maintain YQL and Geo as integral parts of their business model…. In the future, I would love to run the British Museum collections data through these functions and see what cross-referencing I could find….

Six month review of new website performance

The Crosby Garrett Helmet

The Scheme’s new website has been online now for 6 months and I’ve been looking at the performance and costs incurred during this period. We’ve had several large discoveries since the site went live – the Frome Hoard and the Crosby Garrett Roman Helmet for instance. However, they aren’t typical objects, so we don’t get big spikes in referrals from large news aggregators or providers every day. I’m a little disappointed that web traffic hasn’t grown significantly since we went live with the new site, but visitors are still staying for a long time and viewing plenty of pages per visit. I’ve worked hard on search engine visibility (apart from a blip in July when I blocked all search engines via a typo in my robots.txt file – as the great Homer says, D’oh!) and we’re now seeing a surge in pages being added to Google’s index (nearly 50% of our 400,000 publicly accessible pages are now included, according to Webmaster Tools).

Web statistics

All the web statistics are produced via Google Analytics; I haven’t bothered with the old logfile analysis. The old stats that we used to return for the DCMS and quote in our annual reports were heavily reliant on ‘hits’, a metric I always hated. Some simple observations:

  • We get a trend of heavy weekday usage, with noticeable dips at weekends when recording isn’t as prevalent.

    Typical weekly pattern

  • We don’t get a huge audience, our topic is pretty niche, but hopefully it will keep increasing.
  • Overall visitors average 10mins 59 seconds on site and view nearly 14 pages a visit, with a bounce rate of 37.46%.
  • Those visits that are mainly within the confines of the database module average 21 pages per visit and around 17 mins 17 seconds, with a bounce rate of 23.62%
  • We had significant surges in traffic on the days that Frome and Crosby Garrett were announced (8th July and 14th September)
  • 14% of our users who are stuck with IE use version 6. Guess where the majority of these poor people are based… Government sector offices.
  • Visitors came from 149 countries; the top two are the UK (76% of total) and the USA (7%). I assume this is mainly because our subject material is centred on England & Wales. It would be great if we could penetrate the archaeological syllabus in other countries as we have such a mass of data to play with.
  • We have now consolidated our domains down to one so our previous webstats definitely gave a false measure of usage of our resources.
  • We have 66 partner organisations and 3 main funders/hosting organisations; MLA, British Museum and DCMS. We get very little referral traffic from any of these as shown here: BM – 2,495 referrals, MLA – 64 referrals, DCMS – 60 referrals. I think it is a shame a flagship project doesn’t get more click through, but then it is hard to position us higher on these sites as there is so much culture to promote. However, MLA’s description of our project is rather out of date.
  • Google accounts for 45.88% of the originating point for traffic to our site, Yahoo for 0.84% and Bing 0.80%
  • In the 6 month period, we have had 130,235 visits; 1,788,580 page views; 65,531 visitors. In the same period last year (which coincidentally ends with the day that the Staffordshire Hoard was announced), finds.org.uk had 95,902 visits; 514,341 page views; 57,990 visitors, and findsdatabase.org.uk had 48,786 visits; 1,185,537 page views; 17,767 visitors. All these figures are devoid of usage of XML/JSON/KML functions and feeds.
  • A more detailed breakdown can be found in this PDF

New functions

Since launch, we’ve released lots of new features, all based on Zend Framework code:

  • More extensive mining of theyworkforyou for Parliamentary data
  • Heavy use of YQL throughout the website
    • Flickr images pulled in
    • OAuth YQL calls to make use of Yahoo! geo functions
  • Created a load of YQL tables for Museum and heritage website API and opensearch modules
  • Integrated Geoplanet’s data into database backend from their data dump
  • Added old OS maps from the National Library of Scotland (these are great and easy to implement) to most of our maps; for example, search for ‘Sompting’ axeheads and click on ‘Historical’
  • Integrated the Ordnance Survey 1:50K dataset for antiquities and Roman sites
  • Integrated the English Heritage Scheduled Monuments dataset (only available to higher level users.)
  • Pulled in data from Amazon for our references (prices, book cover art etc) for example ‘Toys, trifles and trinkets’ by Egan and Forsyth
  • Mined the Guardian API for news relating to the Scheme
  • Created functions for the public to record their own objects and find previously recorded ones easily. This has been quite well received, see Garry Crace’s article on how he found it.
  • Used some semantic techniques (FOAF for example – our contacts page uses this in RDFa)
  • Context switched formats for a wide array of pages across the site
  • Got OAI access working
  • Created extensive sitemaps for search indexing

Database statistics

Some raw statistics of progress with the new database can be seen below:

24,601 records have been created, documenting the discovery and recording of 94,978 objects (one hoard of coins adds 52,503 objects alone – so remove these and you get 42,475 objects). We also released functions that allow the public to record their own objects, and this has resulted in the addition of 740 records from 32 recorders. We expect this number to increase following the release of an instructional guide produced by our Kent FLA and FLO (Jess Bryan and Jen Jackson).

Users

User accounts created: 855 with no spam accounts created so far.

  • 2 Finds Adviser status
  • 38 Finds Liaison Officer status
  • 13 Historic Environment Officer status
  • 745 ordinary members
  • 57 Research status accounts

In the previous existence of our database over at findsdatabase.org.uk, we had 1135 accounts created in 7 years.

Research

58 new research projects have been added to our research register with the following levels of activity:

962,601 searches have been performed since relaunch. We’ve had 132 reports of incorrect data being published on our database (undoubtedly, there are more errors, people are just shy!) and 250-ish comments on records. These functions are both protected by reCAPTCHA and Akismet and we’ve had 5 spam submissions in 6 months.

Contributors of data

943 new contributors have offered data for recording or become involved by recording or researching. I’m tidying up the database so that we can do better analysis of what people use our facility for. We now collect primary activity and postcodes, so that we can do some better statistical analysis.

Running costs for following domains:

www.finds.org.uk
www.findsdatabase.org.uk
www.staffordshirehoard.org.uk
www.pastexplorers.org.uk

Server farm hosting fee: £828
Bandwidth cost for excess load: £234
Remote backup space: £900 (350GB images)
Amazon S3 backup space: £0.27 ($0.42) for (11GB data transfer of MySQL backups)
Flickr licence: £15.30 ($24)
Get satisfaction account: £36.38 ($57) which I cancelled after 3 months due to the fact it was underused.

Development costs: Covered by my salary, not revealing that.

Total IT cost for running: £2013.95 (around 8p per record or, a more meaningless statistic because of the huge hoard, circa 2p per object).
We plan to make this reduce further by switching backup to S3 for images as well or renegotiating with our excellent providers at Dedipower in Reading. Since the demise of Oxford ArchDigital, we’ve already made IT cost savings of c. £15,000 per annum in support fees and also all development work has been taken on in house.
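
For anyone wanting to check the arithmetic, here is a small sketch of the calculation. The figures are the ones listed above (24,601 records and 94,978 objects come from the database statistics section); the variable names are illustrative.

```javascript
// Sketch only: reproduces the total and the per-record / per-object costs.
const costsInPounds = {
  hosting: 828,
  bandwidth: 234,
  remoteBackup: 900,
  amazonS3: 0.27,
  flickr: 15.30,
  getSatisfaction: 36.38
};
const records = 24601;
const objects = 94978;

// Sum in whole pence to sidestep floating point rounding on the pound values
const totalPence = Object.values(costsInPounds).reduce(function (sum, pounds) {
  return sum + Math.round(pounds * 100);
}, 0);

const pencePerRecord = totalPence / records;
const pencePerObject = totalPence / objects;

console.log("Total: £" + (totalPence / 100).toFixed(2));
console.log("Per record: " + pencePerRecord.toFixed(1) + "p");
console.log("Per object: " + pencePerObject.toFixed(1) + "p");
```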

Hopefully people are finding our new site much more useful; we’ve got more stuff to come….

Digging for Britain’s viewing figures

Alice Roberts in PAS towers

The recent BBC2 series on archaeology, ‘Digging for Britain’, was relatively well received in terms of viewer numbers. The Broadcasters’ Audience Research Board data show the following (all figures are in millions of viewers):

BBC2 w/e 22 Aug 2010

  1. MATCH OF THE DAY (SUN 2202) 3.42
  2. DRAGONS’ DEN (MON 2102) 3.11
  3. COAST (WED 2000) 2.84
  4. THE NORMANS (WED 2101) 2.82
  5. DIGGING FOR BRITAIN (THU 2101) 2.75 – Romans, with lots of PAS stuff
  6. UNIVERSITY CHALLENGE (MON 2001) 2.49
  7. THE GREAT BRITISH BAKE OFF (TUE 2001) 2.24
  8. THE NATURAL WORLD (THU 2001) 2.19
  9. VEXED (SUN 2102) 2.09
  10. HAVE I GOT A LITTLE BIT MORE NEWS FOR YO (SAT 2101) 1.99

BBC2 w/e 29 Aug 2010

  1. DRAGONS’ DEN (MON 2102) 3.17
  2. MATCH OF THE DAY (SUN 2201) 3.12
  3. THE GREAT BRITISH BAKE OFF (TUE 2000) 3.00
  4. COAST (WED 2001) 2.94
  5. UNIVERSITY CHALLENGE (MON 2000) 2.59
  6. DIGGING FOR BRITAIN (THU 2101) 2.34 – Prehistory featuring Ben Roberts lovefest
  7. HAVE I GOT A LITTLE BIT MORE NEWS FOR YO (SAT 2103) 2.29
  8. DAD’S ARMY (SAT 2032) 2.23
  9. EGGHEADS (THU 1800) 2.16
  10. EGGHEADS (WED 1759) 2.13

BBC2 w/e 5 Sep 2010

  1. COAST (WED 2002) 3.42
  2. THE GREAT BRITISH BAKE OFF (TUE 2002) 3.00
  3. DRAGONS’ DEN (MON 2102) 2.98
  4. ALEX HIGGINS: THE PEOPLE’S CHAMPION (WED 2102) 2.97
  5. UNIVERSITY CHALLENGE (MON 2002) 2.78
  6. DIGGING FOR BRITAIN (THU 2101) 2.45 – Anglo-Saxons
  7. ANTIQUES MASTER (MON 2032) 2.12
  8. E NUMBERS: AN EDIBLE ADVENTURE (THU 2001) 1.98
  9. EGGHEADS (MON 1800) 1.96
  10. EGGHEADS (WED 1800) 1.94

The only programme I can’t find figures for is the last one, on the Tudors. That didn’t seem to be as strong and was shifted to a Friday night slot (I fell asleep during it). Consistent figures of over 2 million viewers may give hope for a second series being commissioned; however, can it sustain the current format of a period per show? Will it rely on Alice Roberts as a presenter?

My experience of self-recording on the database

Well the long awaited new PAS database has landed and users out there in the land of archaeology and historical research are busying themselves adding new artefacts and data mining this fantastic and unique historical resource.  Dan the database builder man has received some well deserved plaudits for his new creation; a work of love if ever there was one, created entirely on a shoestring with fewer servers than Wimbledon in the closed season!

Time has elapsed since the beta launch and users have started to settle in to the day to day interaction using established as well as the new features it has to offer.  One of the principal benefits of the new database is the ability Joe Public now has to self-record any qualifying discoveries made.

I belong to a trio of detectorists called the Sussex Pastfinders. Like many, our ethos is to give identity, location and historical context to what we find, such that landowners, farmers, archaeologists, historians, scholars and the public alike can all benefit from the results. As such we undertake a variety of local history projects, with all our qualifying finds being recorded with the PAS and each project illustrated and formally written up. Whilst the majority of what we find we can identify ourselves, often the finer points elude us, and very occasionally we are closer to clueless. In the peak detecting season we would see our heavily worked FLO starting to drown in the flood of recovered objects from across Sussex. Given our reliance on the PAS to help complete our project reports, and with us working to promised landowner deadlines, we would often create pressure on the system and develop a backlog of finds for recording. The new self-recording feature therefore now enables us to give some assistance in getting our finds landed professionally on the database.

As a self-recorder there is a wide spectrum of find data entry for the individual to get involved in. At its most basic, only the object type and broad period have to be completed before the record can be saved. The system automatically generates a unique “PUBLIC” find number. Keeping this number together with the find when handing it over to the FLO will smooth the recording process considerably and improve its efficiency. The only kit required to do this is a computer with web access. Subsequently the FLO will complete the required database fields and “promote” the object to be visible to all. At the high-level end of the self-recording spectrum is the full identification, annotation and illustration of the find. If this is done to the necessary PAS standard in the first instance, then with a ‘rubber-stamp’ of approval from the FLO the find is promoted to full view.

PAS record number: PUBLIC-E92C88
Object type: Scraper (tool)
Broadperiod: Neolithic
County of discovery: East Sussex
Stable url: http://www.finds.org.uk/database/artefacts/record/id/391660

Fig 1: An example record I recorded on the Scheme database.

However, this Full-Monty recording does require access to comprehensive reference material, a balance accurate to a hundredth of a gram, a vernier gauge, and a reasonable macro photography set-up.  There are of course all points in between the two extremes.  The database entry is very intuitive, most of the fields are not free-form but simple to use drop down boxes, and if an error is made there is a simple edit option that allows the record to be corrected.  The arrangement I have with Laura our Sussex FLO is that if there are any gaps or clarification points in a record I have created then a few words of explanation are placed in the notes section of the record.  These are spotted on review, the record adjusted and the discussion notes deleted before it is promoted on the database.

Full Monty recording is a bit of an eye opener as to the required intricacies of find identification and recording procedures. Clearly in a database of this size with many individuals inputting data, accuracy and consistency are paramount. Researchers and the like are searching against various entry fields and if the recorders are logging objects differently the search function will not work accurately. Following a day’s training from Laura I came to realise the conveyor belt of find identification moves at a speed required to ensure standards are upheld; moreover the standards are high and, as you would expect, professional. If you don’t know the exact Parish you don’t guess, you find out. If you can’t remember the exact object reference, then being vague will not do. There is also a comprehensive guideline document that is issued to ensure overriding principles are followed, and a ‘controlled vocabulary’ is published on the database itself as a reference to help ensure consistency of language. The job done by our FLOs is sometimes unsung; seeing what’s involved makes you realise that what they do, for many if not all, is a vocation rather than a job, done for love not money – well done guys – much respect.

The biggest challenge I have found in self-recording on the database is the proper description of the object.  The guidelines are there to be followed but people express themselves in different ways.  There is obviously some latitude but the aim is to make the description self-contained.  Having produced a masterpiece of language without disappearing up your own dangling participle, the acid test is that given only the finished written description, could the find be fully understood and interpreted by the reader – it’s tougher to achieve than you think, especially when your spectacle buckles sound like bifocals!

With the two ends of the self-recording spectrum, and indeed with all points in between, being able to personally contribute to a national database of this standing is unparalleled. For me there is a satisfaction and pride to be taken in helping to see the whole process through from beginning to end, finishing in the knowledge that prior to your actions an historic prospective find was degrading somewhere in a field, and that perhaps in a few years’ time there would be nothing more than a stain left in the ground to betray its former presence. Along you came, researched the location avoiding any archaeological sensitivity, land designations or schemes, found out who owned the land, gained their precious permission and that of the tenant farmer, spent hours searching and finally made the find. It seems only fitting then to want to participate in completing the job by fully identifying and recording it. So instead of just an anonymous stain in the ground, a historical object is saved from that oblivion, correctly identified, and has the best possible context restored to it for all to see, access, enjoy, and draw conclusions from.

At the time of writing there are over 700 PUBLIC recorded finds on the database, a number which is growing all the time (you can only view the ones that are promoted to public view – for example this search result). If you would like to become a self-recorder at whatever level then do have a chat with your FLO and come to your own mutually agreeable arrangement as to how to record your finds. If training is required they will, I know, be happy to oblige.

Digging for Britain – BBC2

This week sees the beginning of the new BBC2 series entitled ‘Digging for Britain’; it is presented by Alice Roberts and produced by 360 Production and sees heavy involvement from the Scheme and features several high profile discoveries. The screening dates are:

  • Episode 1 Romans – Thursday 19/08/10 21:00 BBC2
  • Episode 2 Pre-history – Thursday 26/08/10 21:00 BBC2
  • Episode 3 Anglo-Saxon – Thursday 02/09/10 21:00 BBC2
  • Episode 4 Tudors – Thursday 09/09/10 21:00 BBC2

The preview videos from 360 Production below give you some more details about what the series will feature: the first introduces the series and makes reference to the 360 Production website; the second features one of the Scheme’s alumni, Caroline McDonald:

You can read more about this production on 360’s blog, and there are a few newspaper articles already floating around (listed below):

Adding old OS maps to findspot maps

Today on Twitter, David Haskiya alerted me to a set of old Ordnance Survey maps that have been scanned by the National Library of Scotland and turned into the  ‘NLS Maps API: Historic map of Great Britain for use in mashups’. These old maps are really useful (they cover England and Wales as well as Scotland!) for the work that our Finds Liaison Officers do, or for researchers using our database. Low level phenomenological research can be conducted.  Their instructions are pretty straightforward to follow and I have now added this layer to our findspot mapping (at the moment just for higher level users). The image below gives an example of the embedded Googlemap that we can produce from these OS tiles:

Old Map from NLS

Our maps now have the following layers:

  • Satellite
  • Terrain
  • Openstreetmap
  • Google Earth
  • Basic map
  • Hybrid
  • Historical

To implement this layer, all you need to do is the following (I use jQuery as my JavaScript framework). First, add the JavaScript file that runs their tileserver either inside your head tags or before the closing body tag of your HTML document.

[javascript]
<script type="text/javascript" src="http://nls.tileserver.com/api.js"></script>
[/javascript]

Then you need to initiate the layer and add the historical map button and copyright layer:

[javascript]

// Copyright notice shown while the historical layer is active
var copyright = new GCopyright(1, new GLatLngBounds(new GLatLng(-90, -180), new GLatLng(90, 180)), 1,
"Historical maps from <a href='http://geo.nls.uk/maps/api/'>NLS Maps API<\/a>");
var copyrightCollection = new GCopyrightCollection();
copyrightCollection.addCopyright(copyright);
// Tile layer served by the NLS tileserver script loaded above
var tilelayer = new GTileLayer(copyrightCollection, 1, NLSTileUrlOS('MAXZOOM'));
tilelayer.getTileUrl = NLSTileUrlOS;

var nlsmap = new GMapType([tilelayer], G_NORMAL_MAP.getProjection(), "Historical");
[/javascript]

You will then need to add the map type to your mapping script by adding the following javascript:

[javascript]

map.addMapType(nlsmap);

[/javascript]

So, for example, my code for running our map looks like the below (I add this before the closing body tag, using Zend Framework’s inlineScript syntax within my PHP script):

[javascript]
<script type="text/javascript" src="http://nls.tileserver.com/api.js"></script>
<script type="text/javascript" src="http://maps.google.com/maps?file=API&amp;v=2.x&amp;key={key}"></script>
<script type="text/javascript" src="http://gmaps-utility-library.googlecode.com/svn/trunk/mapiconmaker/1.0/src/mapiconmaker.js"></script>
<script type="text/javascript">
//<![CDATA[
$(document).ready(function() {

if (GBrowserIsCompatible()) {

//Set up the NLS layer

var copyright = new GCopyright(1, new GLatLngBounds(new GLatLng(-90, -180),new GLatLng(90, 180)), 1,
"Historical maps from <a href='http://geo.nls.uk/maps/api/'>NLS Maps API<\/a>");
var copyrightCollection = new GCopyrightCollection();
copyrightCollection.addCopyright(copyright);
var tilelayer = new GTileLayer(copyrightCollection, 1, NLSTileUrlOS('MAXZOOM'));
tilelayer.getTileUrl = NLSTileUrlOS;
var nlsmap = new GMapType([tilelayer], G_NORMAL_MAP.getProjection(), "Historical");

//Set up the openstreet map layer

var copyOSM = new GCopyrightCollection('<a href="http://www.openstreetmap.org/">OpenStreetMap</a>');
copyOSM.addCopyright(new GCopyright(1,
new GLatLngBounds(new GLatLng(-90, -180), new GLatLng(90, 180)),
0, // minimum zoom level
' ' // no additional copyright message, but empty string hides entire copyright
));
var osmLayer = new GTileLayer(copyOSM, 0, 18, {
tileUrlTemplate: 'http://b.tile.cloudmade.com/BC9A493B41014CAABB98F0471D759707/998/256/{Z}/{X}/{Y}.png',
isPng: true,
opacity: 1.0
});

var osmMap = new GMapType(
[osmLayer], // list of layers
G_NORMAL_MAP.getProjection(), // borrow the Mercator projection from the standard map
'OSM' // name should be short enough to fit in button
);
);

//Initiate the map for the div with id of "map" – random Lat/lon pair used here – not a findspot!
var map = new GMap2(document.getElementById("map"));
map.setUIToDefault();
map.addControl(new GMapTypeControl());
map.setCenter(new GLatLng(51.263722,0.68009),11);
//Add your map types – here I have added OSM, NLS, Earth and Terrain
map.addMapType(osmMap);
map.addMapType(nlsmap);
map.addMapType(G_SATELLITE_3D_MAP);
map.addMapType(G_PHYSICAL_MAP);
//Set your default map type
map.setMapType(G_PHYSICAL_MAP);
map.disableScrollWheelZoom();
map.enableRotation();

//Set up my icons
var tinyIcon = new GIcon();
tinyIcon.image = "http://labs.google.com/ridefinder/images/mm_20_red.png";
tinyIcon.shadow = "http://labs.google.com/ridefinder/images/mm_20_shadow.png";
tinyIcon.iconSize = new GSize(12, 20);
tinyIcon.shadowSize = new GSize(22, 20);
tinyIcon.iconAnchor = new GPoint(6, 20);
tinyIcon.infoWindowAnchor = new GPoint(5, 1);
markerOptions = { icon:tinyIcon };

var findIcon = new GIcon();
findIcon.image = "http://labs.google.com/ridefinder/images/mm_20_blue.png";
findIcon.shadow = "http://labs.google.com/ridefinder/images/mm_20_shadow.png";
findIcon.iconSize = new GSize(12, 20);
findIcon.shadowSize = new GSize(22, 20);
findIcon.iconAnchor = new GPoint(6, 20);
findIcon.infoWindowAnchor = new GPoint(5, 1);

findOptions = { icon:findIcon };

var point = new GLatLng(51.263722,0.68009);

var marker = new GMarker(point, markerOptions);
GEvent.addListener(marker, "click", function () {
marker.openInfoWindowHtml("Findspot location");
});
map.addOverlay(marker);

}

});
//]]>
</script>
[/javascript]

So really simple to integrate and get running on your site.

Another milestone reached

On the 26th July 2010, the Scheme recorded the 400,000th record on the database; another Roman coin, this time a nummus of the House of Constantine. We had an internal challenge, with the Deputy Head down to buy a bottle of sparkling wine for the person who recorded this object. The landmark object is shown below and was recorded by Tom Brindle, our acting FLO for Staffordshire and the West Midlands.

PAS record number: WMID-D6D183
Object type: Coin
Broadperiod: Roman
County of discovery: Shropshire
Stable url: http://www.finds.org.uk/database/artefacts/record/id/400298

Several FLOs expressed dismay that the object was a Roman coin and a metal detector find; I think they were hoping for a lithic or something else found by a fieldwalker for a change… However, coins and metal detectorists are the best represented on our database….

Records    Finds recorded    Year of recording
3476       4588              1998
6128       8201              1999
11323      18106             2000
11481      16368             2001
8164       11996             2002
14657      21684             2003
26383      39000             2004
33919      52202             2005
37502      58311             2006
49308      79052             2007
37455      56449             2008
39981      66481             2009
112893     190091            2010

You might wonder why these figures don’t always match the Annual Reports; well, the database is constantly being worked on, with errors corrected, duplicate records removed and so on. There are some blips in the figures – 2002, for example, was hit by foot and mouth; in 2003 the Scheme went national and we phased in our new database; and in March 2010 we imported 2 large datasets from IARCW and CCI (and you might have heard about the 52,503 coins found in Somerset, recorded in April – only 1 record for those though). However, the 2010 figures are encouraging when you look at the statistics for recording since we went live with our new database (shown below with a comparison to the same period in 2009).

Statistics for 2009

Records    Objects    Month
3638       4395       1
2694       5410       2
2842       3414       3
3191       6284       4
3768       5229       5
3307       4429       6
3152       3819       7

Statistics for 2010

Records    Objects    Month
4290       12274      1
3509       5526       2
88596      90380      3
4191       57775      4
3957       5255       5
4490       14518      6
3860       4363       7

Using our data to place a google map on your own site (without the api)

This post is just a short overview of how you can get our data onto your website without being uber-geeky and knowing how to play with our Applications Programming Interface (API – more on this over the next month or so.)  The Scheme’s website can now serve up various different flavours of content by means of context switching. You can now get:

  1. RSS
  2. ATOM
  3. XML (finds lists and searches are returned in MIDAS format, other pages just plain XML responses)
  4. JSON
  5. KML
  6. CSV

Finding out which versions of the content you can retrieve for a page is pretty simple. Scroll towards the foot of any page on our website and look for the text:

This page is available in: {contexts available} representations.

This makes use of the Zend Framework context switch parameter, format. Any URL that has an alternative representation just needs format/{context} appended. So, for example, if you want to view all finds for Essex in ATOM format, you would call this URL:

http://www.finds.org.uk/database/search/results/county/ESSEX/format/atom
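
If you are building these URLs in code, the pattern can be sketched as follows. This is only a sketch: withFormat is a hypothetical helper, only atom and kml appear in lower case in the URLs in this post, so the other lower-case context names are assumptions, and not every page offers every representation.

```javascript
// Sketch only: appends the context switch segment to a page URL.
// Context names other than "atom" and "kml" are assumed spellings.
const FORMATS = ["rss", "atom", "xml", "json", "kml", "csv"];

function withFormat(pageUrl, format) {
  if (FORMATS.indexOf(format) === -1) {
    throw new Error("Unsupported format: " + format);
  }
  // Strip any trailing slash so the result never contains a double slash
  return pageUrl.replace(/\/+$/, "") + "/format/" + format;
}

const essexAtom = withFormat(
  "http://www.finds.org.uk/database/search/results/county/ESSEX", "atom");
console.log(essexAtom);
```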

You can now use this output within your own site using simple software tools such as widgets, simplepie etc. However, what is probably of more interest to many people is getting a map of objects found locally to them. So, for example, say you run a parish council and you want all objects found in the district. Let’s try my home district of South Cambridgeshire. Go to our advanced search facility, scroll to the bottom, choose county as Cambridgeshire and district as South Cambridgeshire, then submit the form and wait a second for the search to complete.

Now that the results are there, look at the page foot for the representations available and you’ll see the letters KML. If you click on this, you can now get data in the format that can be used in many online mapping programmes and Google Earth. So if you want to see this on the map, copy the URL generated; in this case:

http://www.finds.org.uk/database/search/results/county/CAMBRIDGESHIRE/district/SOUTH+CAMBRIDGESHIRE/format/kml

Now head over to http://maps.google.com.

In the search bar, paste the URL that you copied and press search.

Google maps search bar with url pasted in

The map should now change to show pins for degraded findspot locations. These pins are only provided when the ‘to be known as’ field has not been filled in, and the actual points are taken from the 1km grid reference (4 figure). The map should now render like the image below:

Google map generated from the KML

Now that you have generated this map, you can either grab the link for the map and send it directly to someone, or grab the HTML code to embed the map into a webpage. Look in the top corner of the map for the control labelled embed and click this; a layer then appears which looks like the image below:

Link box from google

As this post deals with embedding the map on your own webpage, it is assumed that you can enter raw HTML directly. Copy the text which is contained in the box labelled “Paste HTML to embed in website”. This looks like:
[sourcecode]
<iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0"
src="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fwww.finds.org.uk%2Fdatabase%2Fsearch%2Fresults%2Fcounty%2FCAMBRIDGESHIRE%2Fdistrict%2FSOUTH%2BCAMBRIDGESHIRE%2Fformat%2Fkml&amp;sll=37.0625,-95.677068&amp;sspn=47.885545,114.169922&amp;IE=UTF8&amp;ll=52.257917,-0.000189&amp;spn=0.72983,0.782087&amp;iwloc=lyrftr:kml:cF4oaez0SXhtHIuPUpXMoJUR9uPk2SiORITteHHHGjS0fvow5su0kSjIVdHy4TwDOfCcxM4bseHHEGTe2fPgy5si2VKsJEzIMAg,gf42aba810981b24d,52.065156,0.171661,0,-32&amp;output=embed"></iframe>
<br /><small>
<a href="http://maps.google.com/maps?f=q&amp;source=embed&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fwww.finds.org.uk%2Fdatabase%2Fsearch%2Fresults%2Fcounty%2FCAMBRIDGESHIRE%2Fdistrict%2FSOUTH%2BCAMBRIDGESHIRE%2Fformat%2Fkml&amp;sll=37.0625,-95.677068&amp;sspn=47.885545,114.169922&amp;IE=UTF8&amp;ll=52.257917,-0.000189&amp;spn=0.72983,0.782087&amp;iwloc=lyrftr:kml:cF4oaez0SXhtHIuPUpXMoJUR9uPk2SiORITteHHHGjS0fvow5su0kSjIVdHy4TwDOfCcxM4bseHHEGTe2fPgy5si2VKsJEzIMAg,gf42aba810981b24d,52.065156,0.171661,0,-32" style="color:#0000FF;text-align:left">View Larger Map</a>
</small>
[/sourcecode]

Once you have pasted this code into your webpage and saved it (and, if you aren’t using a content management system, uploaded it to your website), the map will be embedded as shown below:

View Larger Map
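If you need embed code for lots of districts, copying it from Google each time gets tedious. The sketch below generates a cut-down iframe from any KML URL. It is an assumption, not something confirmed by Google, that the q and output=embed parameters from the snippet above are enough on their own; the viewport parameters (sll, spn, ll, iwloc) are deliberately left out, so the starting view may differ from the copied version. The function name embed_html is made up for this example.

```python
from html import escape
from urllib.parse import quote


def embed_html(kml_url, width=425, height=350):
    """Return an iframe snippet asking Google Maps to render a KML URL.

    Hypothetical helper; assumes q and output=embed are sufficient.
    """
    # Keep ':' literal and percent-encode the rest, matching the encoding
    # of the q parameter in the snippet copied from Google above.
    q = quote(kml_url, safe=":")
    src = "http://maps.google.com/maps?q=" + q + "&output=embed"
    # escape() turns '&' into '&amp;' so the attribute value is valid HTML.
    return (
        f'<iframe width="{width}" height="{height}" frameborder="0" '
        f'scrolling="no" src="{escape(src)}"></iframe>'
    )


print(embed_html(
    "http://www.finds.org.uk/database/search/results"
    "/county/CAMBRIDGESHIRE/district/SOUTH+CAMBRIDGESHIRE/format/kml"
))
```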

In the infowindow bubbles that come up when you click on a findspot location, you will see this text:

This findspot has been produced from the 4 figure reference. It is not the precise findspot.

As mentioned above, due to findspot security and landowner privacy, and an agreement we have with the major body that gives us artefact spatial information, we cannot publish co-ordinates publicly at a precision greater than parish or 1km square (4 figure grid reference), and we also hold back from view finds that have had the “to be known as” field filled in. Therefore, the map you get from this is not 100% accurate! This is not something we can change.
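To make the idea of a 4 figure reference concrete, here is a rough sketch of how a more precise Ordnance Survey grid reference can be degraded to its 1km square. This is not the code our database uses, just an illustration of the principle; the function name four_figure is made up, and it only handles the two-letter National Grid form used in England and Wales.

```python
import re


def four_figure(grid_ref):
    """Degrade an OS grid reference such as 'TL 4512 5834' to 'TL4558'."""
    compact = re.sub(r"\s+", "", grid_ref.upper())
    match = re.fullmatch(r"([A-Z]{2})(\d+)", compact)
    # Need an even number of digits, split equally into easting and northing.
    if not match or len(match.group(2)) % 2 or len(match.group(2)) < 4:
        raise ValueError(f"not a usable grid reference: {grid_ref!r}")
    letters, digits = match.groups()
    half = len(digits) // 2
    easting, northing = digits[:half], digits[half:]
    # Truncate rather than round: the first two digits of each half name
    # the 1 km square that contains the point.
    return letters + easting[:2] + northing[:2]


print(four_figure("TL 4512 5834"))  # TL4558
```

Every findspot inside the same 1km square ends up with the same reference, which is why pins on the map mark a square rather than the spot where the object was found.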

A couple of weeks ago, we sent a mailshot out to all MPs for England and Wales, detailing how they could get finds for their constituency onto their own webpages. This is done in exactly the same way as the above, and constituency finds feeds (powered by YQL calls to the theyworkforyou API) can be obtained from the news section of the website under:

http://finds.org.uk/news/theyworkforyou/constituencies

Two examples with finds in their constituencies are the coalition leaders (the Roman coin hoard from Frome, announced on the 8th July, had a coalition type coin inside). David Cameron’s constituency of Witney shows this map:

And Nick Clegg’s Sheffield Hallam constituency shows this map:

Once GeoRSS is enabled and working properly, you can also do the above using any of the feeds for finds where the context switch called is ATOM. This will be done by the middle of next week, alongside ATOM paging.