## Ordnance Survey 1:50 000 Gazetteer import and reuse

On April Fool’s day, the Ordnance Survey opened up its data for people to reuse with less restrictions applied. At the heart of everything we do, place perhap the most important. The Scheme uses National Grid References and place names that you would find on an OS map. The things that these maps depict, often inform where people discover objects; they represent habitation in a past and present form, sometimes concurrently, sometimes from anitiquity. The last word in that sentence is the key to why I wanted to use the 1:50 000 data in our web application (our database). Two categories of place are defined within this dataset:

• Roman Antiquity – of which there are 237 instances
• Antiquity – of which there are 5252 instances

If you download the 1:50k dataset from the Ordnance Survey or from the mysociety cache (remember they are charity, so don’t abuse their servers), there is a document that outlines what the fields mean in the dataset. The important one here was the f_code or feature code column. The data is available via SPARQL (see Leigh Dodd’s article on this), but I wanted to keep a local copy of this data on my server so that I could use it and transform it for some other tasks. After downloading and unzipping onto our server (placing it into the /tmp folder will save you getting error 13 codes with the mysqlimport later), I then created the following MySQL table:

[bash toolbar="true"]
CREATE TABLE osdata (
id int(11) NOT NULL,
km_ref char(6) collate utf8_unicode_ci default NULL,
name char(60) collate utf8_unicode_ci default NULL,
tile_ref char(4) collate utf8_unicode_ci default NULL,
lat_degrees int(2) default NULL,
lat_minutes float default NULL,
lon_degrees int(2) default NULL,
lon_minutes float default NULL,
northing int(7) default NULL,
easting int(7) default NULL,
gmt char(1) collate utf8_unicode_ci default NULL,
county_code char(2) collate utf8_unicode_ci default NULL,
county char(20) collate utf8_unicode_ci default NULL,
full_county char(60) collate utf8_unicode_ci default NULL,
f_code char(3) collate utf8_unicode_ci default NULL,
e_date char(11) collate utf8_unicode_ci default NULL,
update_code char(1) collate utf8_unicode_ci default NULL,
sheet1 int(3) default NULL,
sheet2 int(3) default NULL,
sheet3 int(3) default NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT=’OSDATA 1:50000′;[/bash]

I then needed to import this data, which I accomplished using mysqlimport command in my terminal as below (I use putty at work and OSX terminal at home). Note that I renamed the gazetteer data to the same table name as the MySQL table and retained the txt extension. Fields are delimited by a colon and there is no header row.

The import should run through and insert 259080 rows of data. Even though I am only interested in the antiquity type fields, I have imported the lot in case I want the rest later. Now I have the data installed, I can manipulate it and use it in the way that I want. If you know your grid references, then it is apparent that the data presents at just 1KM square resolution; this is the maximum precision level to which we publicly display our find spots, so it will tie in quite nicely to the public display of information.

However, I wanted decimal degrees for the latitude and longitude within my table. I therefore inserted two new rows into the MySQL table – latitude and longitude (DOUBLE) and then used a PHP function to convert the 1km square grid reference into lat/lon values. I’ve also done some further manipulations to get the imprecise centred 1KM grid reference WOEID to get enhanced geographical data.

Now the grid reference has been converted into decimal values, I can now plot these quickly onto a Google Map or use mathematical formulae to get distance from a point; for example the Haversine.

$d = R \, {haversin}^{-1}(h) = 2 R \arcsin\left(\sqrt{h}\,\right)$

There’s s some very good discussion and code available on these resources, so there’s no point reinventing the wheel:

• Haversine formula can be obtained from various sources, and this page lists 9 scripting language variants.
• Vincenty formula can be found on the Movable Type script pages (via @codepo8).
• More detailed explanation can be found at the Seventh Sense blog

For the Scheme’s database, I wanted to work out if objects were close to an antiquity or Roman antiquity or whether a MP’s constituency or district has them within their bounding box (as found from querying theyworkforyou’s API – more on that later.) As I use Zend Framework, I created a model and then used a view helper to render data onto a finds record. If you’re interested in my code for this, you’re welcome to have it….this one uses the Haversine and is just the model that generates a MySQL query against my database table.

[PHP toolbar="true"]
<?php
class Osdata extends Zend_Db_Table_Abstract
{
protected $_name = ‘osdata’; protected$_primary = ‘id’;

public function get50KNearby($lat,$long,$distance,) {$radius = 6378.137; //KM value, swap to 3960.00 for Imperial miles
$pi = ’3.141592653589793′;$nearbys = $this->getAdapter();$select = $nearbys->select() ->from($this->_name,array(‘name’,'id’, ‘latitude’, ‘longitude’,'distance’ => ‘acos((SIN(‘.$pi.’*’.$lat.’/180 ) * SIN(‘.$pi.’* latitude /180)) + (cos(‘.$pi.’*’.$lat.’/180) * COS(‘.$pi.’* latit$->where($radius . ‘ * ACOS((SIN(‘.$pi.’*’.$lat.’/180) * SIN(‘.$pi.’* latitude/180)) + (COS(‘.$pi.’*’.$lat.’/180) * cos(‘.$pi.’* latitude /180 ) * COS(‘.$pi.’* longitude /180 -’.$pi.’* ( ‘.$long.$
->where(’1=1′)
->where(new Zend_Db_Expr(‘f_code = "R" OR f_code = "A"’))
->order($radius . ‘ * ACOS((SIN(‘.$pi.’*’.$lat.’/180 ) * SIN(‘.$pi.’* latitude/180)) + (COS(‘.$pi.’*’.$lat.’/180) * cos(‘.$pi.’ * latitude /180 ) * COS(‘.$pi.’* longitude /180 – ‘.$pi.’* (‘.$long.’$return$nearbys->fetchAll(\$select);

}
}

}
[/php]

Below is a record of an object from the controversial Water Newton rally, which was near the site of the Roman town of Durobrivae, near Chesterton in Cambridgeshire [WOEID: 39263, 1KM NGR - TL1295, Lat: 52.561508 Lon: -0.364190].

You’ll see that there are two Scheduled Monument Alerts – there are actually 5 entries in the National Monuments Record for this SAM – and there is 1:50k OS alert for a Roman antiquity. These SAM alerts aren’t shown to users below ‘research’ level. I can then click through to find all records associated within a certain distance of this OS point, and map them if I have at least ‘Research’ user rights on our database. In future, I will try and link these place names through to other linked data resources. By tying them to a WOEID, I can find archaeological photos on Flickr for example.

I don’t think that this breaches the OS licence and there are probably other ways to accomplish this in PHP, I just dabble in code, so don’t rip me to shreds….

## Access levels and what you can view

Following our Portable Antiquities Advisory Group meeting, I was asked what levels of detail people are privy to on the Scheme’s database. The below outlines what these account levels can do/see and what geo information is displayed.

### Public user – not logged in

The public user level is the most basic of all our levels of access. This gives you access to:

• Finds awaiting validation (denoted by the yellow flag)
• Low level mapping
• no dots on maps
• findspot to 1km grid square level and slight obfuscation of findspot by randomised subtraction/addition of 10ths of a degree to the degraded findspot
• limited zoom level.

### Registered user – most basic level of login

• Finds awaiting validation (denoted by the yellow flag)
• They can create their own records of their objects and get full mapping capabilities for only these objects which is enhanced over the below low level grade map.
• Low level mapping
• no dots on maps,
• findspot to 1km grid square level and slight obfuscation of findspot by randomised subtraction/addition of 10ths of a degree to the degraded findspot,
• limited zoom level.
• Cannot see maps or retrieve finds by parish for any record with the findspot form’s “to be known as” field completed
• Can add/edit their own records
• Can save searches

### Researchers

• Finds awaiting validation (denoted by the yellow flag)
• Cannot view finds that are still in progress (quarantine/review)
• As above, they can create their own records of their objects and get full mapping capabilities forthese objects.
• High level mapping
• findspot plotted with a dot on the map
• full precision for findspot
• Flickr shapefile outline for parishes
• Full zoom capabilities
• Can add/edit their own records

### Historic Environment Officers

• Finds awaiting validation (denoted by the yellow flag)
• Cannot view finds that are still in progress (quarantine/review)
• As above, they can create their own records of their objects and get full mapping capabilities forthese objects.
• High level mapping
• findspot plotted with a dot on the map
• full precision for findspot
• Flickr shapefile outline for parishes
• Full zoom capabilities
• Special download of csv for import into exeGesis HBSMR (if you don’t know what that is, don’t worry!)

### Treasure & Finds Liaison Officers

• Finds in quarantine – records that need more data (reminds them to do so!)
• Finds on review – current working versions
• Finds awaiting validation (denoted by the yellow flag)
• As above, they can create their own records of their objects and get full mapping capabilities forthese objects.
• High level mapping
• findspot plotted with a dot on the map
• full precision for findspot
• Flickr shapefile outline for parishes
• Full zoom capabilities
• Can edit records made by member, HERO and research users
• Can edit any records they made when working in other counties
• Can edit records made by anyone at their institution

• Finds in quarantine – records that need more data (reminds them to do so!)
• Finds on review – current working versions
• Finds awaiting validation (denoted by the yellow flag)
• As above, they can create their own records of their objects and get full mapping capabilities forthese objects.
• High level mapping
• findspot plotted with a dot on the map
• full precision for findspot
• Flickr shapefile outline for parishes
• Full zoom capabilities
• Can edit any records created by any user and can publish finds

Now that would be telling.

## Adding records to our database as a registered member

The Scheme’s database has changed significantly since it went live in its original format in 1999. It is now possible for all users to add and edit their own finds (descriptive, spatial, numismatic, reference and visual details) and add to this country’s archaeological record of public discovery. To add your own ‘finds’ to our database is relatively straight forward and this post outlines how to do this. As with a few other features of this site, you need to get a few things in place before it works properly for you!

#### So how do you record?

1. Register for a user account on our site (or if you already have an account and haven’t logged in since 21st March 2010, reset your password). We need to have you registered for auditing changes and notifications etc. Your personal details won’t be sold or divulged to evil marketing companies or anyone else who hasn’t sighed up to our T&C.
2. Contact your local Finds Liaison Officer and talk to them about self-recording your objects (we have a strict vocabulary for data entry and there’s some things you might like explained before proceeding). We have to use strict terminology to ensure that things can be found easily and that we can interoperate with other people’s databases.
3. You can only record your own finds as we can’t divulge other people’s details under the Data Protection Act (sorry!) Once we link your personal details to your account, you can see your own records easily and your name gets appended to records created by you automatically.
4. If you have a Treasure object, we would rather that this is reported directly to the FLO for recording so that all the steps needed to dispense the law are followed and no confusion arises (sorry!)
5. Once you have spoken to your FLO, you can happily record away! So keep reading.

#### Adding a find’s basic information

1. Once you have logged in, look for the button labeled “Add a new object (or artefact in some places)” on either your home screen or on the artefact listers, click on this.
2. Now you can fill in the data for your find.  Many fields are strictly controlled by driven vocabulary – for example, object type auto-completes and others are select driven drop-downs. Most are pretty obvious! Just follow the labels to the left of each form control.
3. Out of all the fields, the only compulsory ones that you must enter are object type and broadperiod, therefore you can start records and return to them. However, we want really complete records with as much information as you can give (you can edit later of course).
4. Once you have filled in your form, press submit and you will be taken to your record and you can now add extra bits. We haven’t adopted multi page forms as you might not have all details at hand and we’re trying to make it all very simple…
5. We also use FCKEditor and HtmlPurifier to ensure valid HTML in the data that you enter. We’ll strip out a wide variety of tags generated by word if you paste from there and also remove curly quotes etc. If you are interested (which probably you aren’t) we store your text in UTF-8.

1. You now have a choice of which bits of data you want to add to the record (if you have entered a coin, you can add numismatic data) and for this example we’re assuming you are entering a Roman coin. So to add numismatic data, look for the link entitled “add numismatic data”. Click on this.
2. This step is driven by logic determined by the denomination type you have. If you choose a denomination, we set in motion a series of cascaded or linked dropdowns.
3. Once you have chosen a denomination, then choose a ruler from the list that is generated (you can’t enter a ruler that doesn’t exist for a denomination type!)
4. After a ruler has been chosen, the cascade sets in motion again and configures the mint, moneyer (only available for Republican coins), reverse type (only available for 4th Century coins) and Reece period. Choose the correct option for your coin if you can fill it in. If not leave blank.
5. Enter any information for reverse/obverse inscription/description
6. Choose die axis measurement and status options

Provenance is vital for the study of stray archaeological finds. The majority of objects we record will have little or no archaeological context and are found in the plough soil, but their spatial co-ordinates may well tell you more about the area’s archaeology. By providing the Scheme with higher degrees of precision for your findspots, the better the research academics and lay researchers can do from these data. The form for recording the spatial data is again pretty straightforward and you have the option to hide sections from public view (comments, address, postcode, all co-ordinates). The below outlines how to enter the spatial information for a findspot (all finds can only have one!)

1. All objects are attached to a named place. We use the Ordnance Survey’s place name data, so we have an  array of data to choose from (Euro-region, County, District, Parish). These place name drop-down lists are also cascaded, so start by choosing your county and then follow the dropdowns choices as presented.
2. To hide the data entered in step 4 from the public, you can enter a pseudonym in the “known as” box (be sensible about it )
3. If you have an address and postcode for the findspot, please fill these in (these never get displayed to the public or research user).
4. Now we need to get the co-ordinates for the findspot. If you don’t have a provenance for the find, we’d rather it wasn’t recorded as it doesn’t add to our useful archaeological record. If possible, record to a higher precision than 4 figure grid reference (which is better than 1km square precision) – this is the maximum level we’ll publish data online to the public user. We also use the National Grid reference system to place our objects onto a map and this is transformed into the following after saving:
1. Easting
2. Northing
3. 4 figure grid reference
4. Latitude and longitude pair
5. Elevation on landscape
6. 1:25K map
7. 1:10K map
8. A Yahoo! Where on Earth ID or WOEID for cross referencing against their database (you can see this in action via the adjacent places displayed on the findspot section) and other services that may use their identifier system.
5. After filling in this section you can tell us about the landuse types and any comments or descriptions needed about the findspot.
7. This returns you to the record, where you will now see a map of your findspot and the data that you have entered. You’ll also see any data we’ve managed to retrieve on that area from Yahoo! – a Flickr shapefile for the parish (if available), adjacent places (from the geoplanet database) and postcode etc. More is planned for this section, news on that later!

#### Adding an image or images

A visual record of the object is really important for research of the object (for many researchers it is often more important than the findspot!) Adding an image is quite straightforward and we add one at a time to make it more simple. We do suggest naming your files sensibly, avoiding non-alphanumeric characters and removing spaces (replace with an underscore, hyphen or camel case the  filename).

To add an image do the following:

2. After clicking on this, you’ll see a form with several fields. Click on the ‘choose’  button to find your image that you want to attach (must be under 6MB and we would rather that you uploaded a high resolution JPEG or TIFF image.) If your filename already exists, we’ll tell you and likewise if it is an invalid filetype.
3. When you get to the image label box, refer to the image labeling document produced by our Finds Advisers for the correct methodology that we want to adhere to.
4. Choose your county, copyright, period and image type (your default copyright can be set from the edit account link under your home area. Set it and then logout and back in so that the session picks up the default.)
5. Then submit the new image
6. If everything works okay, you’ll then get redirected back to the record and you can add a new image if needed
7. This process generates:
1. Thumbnail
2. Small derivative
3. Display derivative
4. Medium derivative (used for the lightbox overlays)
5. A Zoomify derivative
6. Original image
8. All users can download the original image – we share everything on this site!

#### Taking good quality images

This is really important for the record of the object, you can get some good advice on-line or by purchasing Ian Cartwright’s short guide entitled ‘Photographing detector finds’.  If you want good examples of images, have a look at our FLO for the Isle of Wight, Frank Basford’s records.

Basic desires from the Scheme for images are:

• 300 dpi resolution (so images can be reproduced for academic publication)
• good lighting
• a white or black background
• well focused with good depth of field
• scale bar – you can get some good ones from here: http://www.vendian.org/mncharity/dir3/paper_rulers/

Images and text on this website are disseminated under a Creative Commons Non-Commercial Share-Alike licence and are used in a variety of media for enriching our knowledge of the past. You can opt out of this by choosing ‘all rights reserved’ under the image copyright dropdown, or by choosing this as default from your profile settings.

We’ll have some more information on photography and scanning of objects and coins in the coming weeks.

#### How do my records go public?

As we’re trialling public data entry, we’re currently keeping all public records hidden from view in the review stage (you’ll see a quarantine flag- black biohazard symbol ~ I didn’t like black flags ~ next to your record number when you look at your my finds list). If this feature gains popularity, we’ll extend the system so that you do the following:

1. Enter all data
2. Decide it is ready for checking by our staff and you choose to push your record to review
3. The object record is peer reviewed by the FLO for the county of origin of the object
4. They then decide that it can be seen and it then goes to validation
5. Eventually a finds adviser might check it and it will go green for published.

You can only edit your records in the quarantine (a legacy phrase from our old system) and review stages, if you have more to add at a later date, you can ask any of the Scheme’s staff to return it.

#### I’m still struggling with the above!

If you still need help self-recording objects, you can go firstly to your FLO for help and more information on our recording philosophy or contact us at the Central office on info@finds.org.uk