February 12, 2011

Cooking Hydrography in SpatiaLite


Alessandro Furieri's latest cookbook includes a fine dining recipe with the main ingredients being Dijkstra's algorithm and the Tuscan road network. This inspired me to attempt a similar recipe on a hydrographic network.

In this tutorial we will use SpatiaLite, an open source GIS extension to SQLite, to create a network with data from the National Hydrography Dataset.

Software used:
Windows XP
Quantum GIS 1.6.0 (Standalone installer from http://www.qgis.org/wiki/Download)

Open SpatiaLite GUI and create a new database by clicking on the second button from the left. Call it db.sqlite. This creates the database file and initiates the spatial metadata tables, creating geom_cols_ref_sys, geometry_columns and spatial_ref_sys.

Download the "National Hydrography Dataset with Flowtable" for subbasin 10190004 as a shapefile here. (You can download data for any part of the country from National Map Viewer. You will receive an e-mail with a link a ZIP file in a couple of hours or sometimes days.) Unzip the file to its own folder. Click on the "Load shapefile" button in SpatiaLite GUI to import NHD198255\hydrography\NHDFlowline.shp with SRID 4269.

In Excel open the DBF file NHD198255\NHDFlow.dbf and save it as a CSV. If you don't have Excel see this post at Spatial Galaxy on how to use ogr2ogr.

Load CSV from SpatiaLite GUI by clicking on the "Load CSV" button. Select comma as the separator.

Update the LengthKM with the glength function and a transform to the local UTM zone. Divide by 1000 to get kilometers. This column will be used as the cost. Just type the following statement in the upper right window of SpatiaLite GUI:
update nhdflowline set lengthKM = glength(transform(geometry,32613))/1000
At this point your SpatiaLite GUI window should look like this:
Click the green arrow button to the right of the screen to execute the statement.
Create a ToComID in the NHDFlowline table and update it with info from the NHDFlow table. We need to have all the arcs, the arc geometry and nodes in one table for the network builder to work. The NHD network does not explicitly define the arcs and nodes, so we'll just say that the "from node" is the arc itself, and the "to node" is ToComID.
update NHDFlowline set ToComID=(select ToComID from NHDFlow where NHDFlow.FromComID=NHDFlowline.ComID)
Replace null values from the ToComID column with a value of 1. (That's an arbitrary number. You can pick another recognizable value, just not something negative or building the network will fail. You should also not pick the ComID of another existing feature.)
update NHDFlowline set ToComID =1 where ToComID is null
Perform the validation from the command line. For this to work you have to run spatialite_network.exe from the bin directory found in the spatialite-tools zip file.
>spatialite_network.exe -d ../db.sqlite -T NHDFlowline -f ComID -t ToComID -c LengthKM -g Geometry
Take care of any problems that may arise. Then back in the GUI click the "Build Network" button. Table= NHDFlowline, NodeFrom Column= ComID, NodeTo Colum = ToComID, Geometry Column = Geometry, Arc connections = Uni-directional, Cost type = Using cost column, Cost column = LengthKM

The NHDFlowline_net virtual table should have been created. Time to take it for a test ride. The solution is given by this query:
select * from NHDFlowline_net where nodefrom=158717400 and nodeto = 116932867 
The first row gives the entire solution and its Geometry. The following rows sequentially identify the arcs used by the travel solution.
Let's create a separate table to store the solutions so that we can visualize them. In addition to an id field, the solutions table contains the same fields as the NHDFlowline_net virtual table.
CREATE TABLE solutions (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
ArcRowid,
NodeFrom,
NodeTo,
Cost);

Register this table in the geometry tables. (When you import shapefiles this step is automatically performed by the GUI, but when you create your own spatial tables, they have to be registered. AddGeometryColumn creates an entry in geometry_columns table and two triggers. PostGIS does things in a similar way. Read this part of Alessandro's cookbook for more about this.)
SELECT AddGeometryColumn('solutions', 'Geometry',4269, 'LINESTRING', 2);
A little globe will attach itself to the icon of the solutions table, indicating that solutions is a spatial table
Add a couple of solutions into the table like so
insert into solutions (ArcRowID, NodeFrom, NodeTo, Cost, Geometry) select * from NHDFlowline_net where nodefrom=158717400 and nodeto = 116932867 limit 1
Open QuantumGIS and add the NHDFlowline table and the solutions table as layers. Give each solution a different color.
And there you have it. Serve with a Chianti.