Monday 1 June 2015

OS Open Names

As I noted in the previous post, Ordnance Survey have said that OS Locator open data is being withdrawn and replaced by OS Open Names. The file is a strange mixture of three types of data jumbled up, so I needed to separate them into something useful. The three types are postcode centroids, place names and road names.

All of the locations are in the projection that Ordnance Survey use, OSGB36, also known as EPSG:27700. The great thing about the values used in this projection is that they are measured in metres from a fixed point. The previous OS open data, such as OS Locator or CodePoint Open (postcode centroids), use this projection too of course, with whole-metre values that fix a location to a square metre. OS Open Names, however, has a decimal component, so the location is now specified to the nearest millimetre - more precise than OSM works to. The east-west specification in OS parlance is known as eastings and the north-south specification is known as northings. Eastings and northings can be converted to longitude and latitude for use in OSM using, for example, GDAL libraries.
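In practice GDAL or a similar library would do the conversion, but to show what is involved, here is a minimal standard-library sketch. It assumes the published National Grid constants and the simple seven-parameter Helmert datum shift, which is only good to a few metres, so it throws away the millimetre precision. The function names are my own, not part of any library.

```python
import math

# Airy 1830 ellipsoid and National Grid constants (assumed from OS published values).
AIRY_A, AIRY_B = 6377563.396, 6356256.909
F0 = 0.9996012717                                    # scale on central meridian
LAT0, LON0 = math.radians(49.0), math.radians(-2.0)  # true origin
E0, N0 = 400000.0, -100000.0                         # grid coords of true origin
WGS84_A, WGS84_B = 6378137.0, 6356752.314245


def grid_to_osgb36(easting, northing):
    """Eastings/northings in metres -> (lat, lon) in radians on the OSGB36 datum."""
    a, b = AIRY_A, AIRY_B
    e2 = 1 - (b * b) / (a * a)
    k = (a - b) / (a + b)
    lat, m = LAT0, 0.0
    for _ in range(50):  # iterate until the meridional arc matches the northing
        lat += (northing - N0 - m) / (a * F0)
        d, s = lat - LAT0, lat + LAT0
        m = b * F0 * (
            (1 + k + 1.25 * k**2 + 1.25 * k**3) * d
            - (3 * k + 3 * k**2 + 2.625 * k**3) * math.sin(d) * math.cos(s)
            + (1.875 * k**2 + 1.875 * k**3) * math.sin(2 * d) * math.cos(2 * s)
            - (35 / 24) * k**3 * math.sin(3 * d) * math.cos(3 * s)
        )
        if abs(northing - N0 - m) < 1e-5:
            break
    w = 1 - e2 * math.sin(lat) ** 2
    nu = a * F0 / math.sqrt(w)
    rho = a * F0 * (1 - e2) / w**1.5
    eta2 = nu / rho - 1
    t, sec, de = math.tan(lat), 1 / math.cos(lat), easting - E0
    t2 = t * t
    lat_out = (lat
               - t / (2 * rho * nu) * de**2
               + t / (24 * rho * nu**3) * (5 + 3 * t2 + eta2 - 9 * t2 * eta2) * de**4
               - t / (720 * rho * nu**5) * (61 + 90 * t2 + 45 * t2**2) * de**6)
    lon_out = (LON0
               + sec / nu * de
               - sec / (6 * nu**3) * (nu / rho + 2 * t2) * de**3
               + sec / (120 * nu**5) * (5 + 28 * t2 + 24 * t2**2) * de**5
               - sec / (5040 * nu**7)
               * (61 + 662 * t2 + 1320 * t2**2 + 720 * t2**3) * de**7)
    return lat_out, lon_out


def osgb36_to_wgs84(lat, lon):
    """Approximate datum shift (radians in, radians out), height taken as zero."""
    a, b = AIRY_A, AIRY_B
    e2 = 1 - (b * b) / (a * a)
    nu = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = nu * math.cos(lat) * math.cos(lon)
    y = nu * math.cos(lat) * math.sin(lon)
    z = nu * (1 - e2) * math.sin(lat)
    # Helmert parameters for OSGB36 -> WGS84: metres, scale, arc-seconds.
    tx, ty, tz, sc = 446.448, -125.157, 542.060, -20.4894e-6
    rx, ry, rz = (math.radians(v / 3600) for v in (0.1502, 0.2470, 0.8421))
    x2 = tx + (1 + sc) * x - rz * y + ry * z
    y2 = ty + rz * x + (1 + sc) * y - rx * z
    z2 = tz - ry * x + rx * y + (1 + sc) * z
    # Cartesian back to latitude/longitude on the WGS84 ellipsoid.
    a, b = WGS84_A, WGS84_B
    e2 = 1 - (b * b) / (a * a)
    p = math.hypot(x2, y2)
    lat2 = math.atan2(z2, p * (1 - e2))
    for _ in range(10):
        nu = a / math.sqrt(1 - e2 * math.sin(lat2) ** 2)
        lat2 = math.atan2(z2 + e2 * nu * math.sin(lat2), p)
    return lat2, math.atan2(y2, x2)


def grid_to_wgs84(easting, northing):
    """Eastings/northings -> (latitude, longitude) in degrees, roughly WGS84."""
    lat, lon = osgb36_to_wgs84(*grid_to_osgb36(easting, northing))
    return math.degrees(lat), math.degrees(lon)


lat_deg, lon_deg = grid_to_wgs84(651409.903, 313177.270)  # example point
print(lat_deg, lon_deg)
```

For anything that needs the full precision, a library that applies the OS grid-shift transformation is the better choice than the Helmert approximation used here.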

The postcode centroids are very simple records. Each needs only the postcode and the easting and northing for the location of the centroid. Comparing a few OS Open Names centroids with CodePoint Open records, I found that the centroids are in slightly different locations.
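Because eastings and northings are both in metres, the gap between two centroids for the same postcode is just Pythagoras. A small sketch, with made-up coordinates rather than real records:

```python
import math


def offset_metres(e1, n1, e2, n2):
    """Straight-line distance in metres between two easting/northing pairs."""
    return math.hypot(e2 - e1, n2 - n1)


# Hypothetical values: a whole-metre centroid and a millimetre-precision one.
old_centroid = (500000.0, 430000.0)
new_centroid = (500003.0, 430004.0)
print(offset_metres(*old_centroid, *new_centroid))
```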

OS Locator lists each named road with its name, a centroid and a bounding box for the road. There is also a hierarchy of place, borough, county or unitary authority. All of this is in OS Open Names and the increased millimetre precision is there too.

I've not used the gazetteer of place names, but the name and location data are available as you would expect.
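Separating the three kinds of record could look something like the sketch below. The column names (TYPE, LOCAL_TYPE and so on), the type values and the sample rows are all assumptions for illustration, so check them against the real files and the OS documentation before relying on them.

```python
import csv
import io

# Assumed LOCAL_TYPE values for road records; verify against the real data.
ROAD_TYPES = {
    "Named Road",
    "Numbered Road",
    "Section Of Named Road",
    "Section Of Numbered Road",
}


def classify(row):
    """Return 'postcode', 'road', 'place' or 'other' for one record."""
    local_type = row.get("LOCAL_TYPE", "")
    if local_type == "Postcode":
        return "postcode"
    if local_type in ROAD_TYPES:
        return "road"
    if row.get("TYPE") == "populatedPlace":
        return "place"
    return "other"


def split_records(text):
    """Group the rows of a CSV (with a header row) by record kind."""
    groups = {"postcode": [], "road": [], "place": [], "other": []}
    for row in csv.DictReader(io.StringIO(text)):
        groups[classify(row)].append(row)
    return groups


# Made-up rows, not real OS data.
SAMPLE = """NAME1,TYPE,LOCAL_TYPE,GEOMETRY_X,GEOMETRY_Y
AB1 2CD,other,Postcode,500000.000,430000.000
Example Road,transportNetwork,Named Road,500100.250,430050.750
Exampleton,populatedPlace,Village,500500.000,430500.000
"""

groups = split_records(SAMPLE)
print({kind: len(rows) for kind, rows in groups.items()})
```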

All of the data types also have URIs in the files for many of the fields. Many that I have tried to open are dead links, but some show the hierarchy of data. I'm not sure why this is useful.

I think I'm going to write a routine to unzip and process all of the OS Open Names data. I'll load the data into a database, reproject it for OSM and make the processed data available if anyone wants it.
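A rough outline of the unzip-and-load step, using only the Python standard library and SQLite. It assumes each CSV inside the zip has a header row with the columns name, kind, easting and northing, which is a simplification and not the real OS Open Names layout. The table name and the load_zip function are likewise my own invention.

```python
import csv
import io
import sqlite3
import zipfile


def load_zip(zip_file, conn):
    """Load every CSV in a zip archive into a 'names' table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS names "
        "(name TEXT, kind TEXT, easting REAL, northing REAL)"
    )
    count = 0
    with zipfile.ZipFile(zip_file) as zf:
        for member in zf.namelist():
            if not member.lower().endswith(".csv"):
                continue  # skip documentation and other non-data files
            with zf.open(member) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = [
                    (r["name"], r["kind"], float(r["easting"]), float(r["northing"]))
                    for r in csv.DictReader(text)
                ]
            conn.executemany("INSERT INTO names VALUES (?, ?, ?, ?)", rows)
            count += len(rows)
    conn.commit()
    return count


# Build a tiny in-memory zip with made-up data to exercise the loader.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "sample.csv",
        "name,kind,easting,northing\nExample Road,road,500000.123,430000.456\n",
    )
    zf.writestr("readme.txt", "not data")
buf.seek(0)

conn = sqlite3.connect(":memory:")
loaded = load_zip(buf, conn)
print(loaded, conn.execute("SELECT * FROM names").fetchall())
```

Reprojection for OSM would then be a separate pass over the easting and northing columns.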

First, however, OS need to supply the missing sections of their open data.

2 comments:

Robert said...

I'm working on doing something interesting with OS Names along the lines of musical chairs. One of the interestingly "missing" things in Names is that they have decoupled road numberings and road namings. In Locator the two fields were bound to the same entry, so you could see that "Foobar Road" was the "A1230" (at this point at least). In OS Names a road name and a road numbering are two separate entities. As a result, the best you can infer is that "Foobar Road" and the "A1230" share part of the same bounding box. They may not be connected in any real way.

Chris Hill said...

@Robert,
Roads can be a "named road", a "numbered road" or a "section of named road". It looks to me as though roads with a name are not always repeated as a numbered road. If I'm right, that level of detail is completely lost, as well as the issue you point out.