Monday 28 September 2015

Extracting building heights from LIDAR

The UK Environment Agency have released some LIDAR data in Digital Elevation Model (DEM) format. It includes Digital Terrain Model (DTM) data and Digital Surface Model data. The DSM data includes buildings and trees while the DTM is processed to remove these so the underlying terrain is visible. Tim Waters asked if you subtract the DTM from the DSM would you be left with just building and tree heights? I'd started to look at this and it turns out building heights are extractable in this way.

I have written a script to do the subtraction and create both the difference file and an SQL file to load the data into a postgresql table. I created a hill shading image from one of the difference files to see what features have been stripped from the DTM data. Here's a jpg version of it:


You can see that the buildings and some other features are all well defined. Phil Endecott suggested using the DSM data to create building outlines which could be traced in OSM. His images look good. I would suggest starting his process with this difference data as all the terrain detail has been removed so it may be even better.

Once the SQL file of height data has been loaded into a Postgresql database, which has the PostGIS extension installed, we can then do some queries on it. I selected some OSM building polygons in the range of the loaded data and found which of the height points fell within each polygon. The highest of these is the highest point of the building above the surrounding terrain. I've written a script as a place-holder to extract all the heights for a rectangular area. Someone could extend this to make a file to upload to OSM to add the heights for every building in the defined area. I feel this is clearly an import and so the usual OSM import process needs to be gone through before the import takes place.


I think there is real merit in using this data to extract building heights, which are needed for the 3D images of city buildings.

The two scripts are available here: http://www.raggedred.net/shared/heights.zip The first (s-t.py) needs matching DSM and DTM files and outputs a difference file and the SQL to load into a database table. This will work for any of the resolutions published by the EA. The second, much less polished, script (match.py) defines a rectangle, extracts the OSM buildings for that rectangle and then finds the height data for each building. I wrote it as a script so it can be extended to create a data file for processing or uploading or so overlay tiles could be made from it. I loaded some OSM data with osm2pgsql (which would normally be used for rendering) and added the table for the heights data to the database. The SQL for the table is:

CREATE TABLE eadata
(
  hid serial NOT NULL,
  height double precision,
  locn geometry(Point,27700),
  CONSTRAINT eaheight_prim PRIMARY KEY (hid)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX eadata_index
  ON eadata
  USING gist
  (locn);
The output SQL data can be loaded into this with the command:
psql -f
where and are whatever you used. The table name eadata and field names are hard-coded in s-t.py. 

It is important to say that I would not use a rendering database to create the upload data from as some fields will be missing. osm2pgsql is a lossy process. You can use the OSM ID to extract a current version from the API or from Overpass to add the height data to. I used the rendering data for convenience as I already had it available and to satisfy myself that the process works.  

I hope this is useful to someone. Please feel free to ask for more information if I've not made anything clear.

6 comments:

Unknown said...

What level of accuracy do you think you are getting? How far is it possible for you to process the data then hand over to someone who has the time and energy to explore the import side of things?
RobJN

Chris Hill said...

The data is supposed to be to the nearest millimetre. I think the process is pretty robust and finds the highest point of each building pretty reliably, but I would be conservative and use the height to the nearest cm which would be perfectly usable. The height=* tag expects the highest point of the building.

There is a lot of data to process; 1 km² needs 1 million database rows. Hull, a moderate city, needs about 80km². This is temporary data, but nonetheless I can't imagine me creating data to cover much of Britain. The process needs existing building outlines to compare the heights to and the EA data does not cover all of Britain - it is intended for flood assessment.

Are you intending using any of it? Would you like an area (say 1km²)to test?

Martin Isenburg said...

I think the accuracy is around femtometers ... (-:

http://rapidlasso.com/2015/09/02/england-releases-national-lidar-dem-with-insane-vertical-resolution/

Chris Hill said...

@Martin, the first release used a spurious precision to less than the diameter of an atom. They fixed that now to 3dp of a metre.

RobJN said...

Chris, I met with the Mappa Mercia group last night and we agreed that we'd be interested in the Worcester data (I've checked and it looks like there is data available there). As per your kind offer above can you please send me a copy of this when you have the time to do the required processing. My emails on the mailing lists (example: https://lists.openstreetmap.org/pipermail/talk-gb-westmidlands/2015-October/001910.html ). Thanks. RobJN

jhon said...

thanks for grea post