Introduction
I sometimes need to show regions on a map. However, often there are no freely available (KML, shape, GeoJSON) geometries or topologies available that I can use, so I have to create them (almost) from scratch, e.g. based on existing city geometries. As these geometries tend to be too detailed, they have to be simplified as well.
As I tend to forget the way I need to do this, I thought it would be useful if I put it in a small article, so others (and myself) can use it as well. The example I'm using is based on Dutch open source data, so although the technique is generally applicable, the final result is probably not.
Background
In The Netherlands, the Dutch CENSUS (CBS) provides us with the geometries of all cities. New regions are often formed by creating a union of the existing city boundaries, and here I will demonstrate how to do that. You need to have a basic understanding of PostgreSQL and its spatial extension, PostGIS.
Method
Let me first show what the end result will look like for the 43 WMO regions (image created by GeoJSONLint).
Importing the CBS Data
- Download the data and extract the zip file.
- Using PgAdmin, create a new database (cbs) for the data.
- Enable the PostGis extension (in the new database, right-click on extensions and select 'New extension'. In the popup that opens, select postgis).
- Open a command prompt and load the data using the following command, where
SRID = 28992
and DATABASE = cbs
.
shp2pgsql -I -s <SRID> <PATH/TO/SHAPEFILE> | psql -d <DATABASE>
The final result is that we have the CBS shape files loaded into our database. In this case, we are only going to need the created gem_2013_v1
table. But before we can go on, we need to do one more thing: as the CBS data is in RD coordinates ('Rijksdriehoek
', or SRID 28992
), we need to set the spatial reference ID (SRID
) for our geometries manually. I have no idea why shp2pgqsl
does this automatically (am I missing something), but in order to support future SRID
conversions, this is a requirement. So select the cbs
table in PgAdmin
, open a SQL window, and enter.
SELECT UpdateGeometrySRID('gem_2013_v1', 'geom', 28992);
BTW, you can check the SRID
using the following command:
SELECT Find_SRID('cbs', 'gem_2013_v1', 'geom');
Creating a Mapping Between City Boundaries and the New Regions
As an example, I am going to create boundaries for the WMO health regions in The Netherlands. Although I can download it as a PDF or image, they unfortunately don't provide an option to download it in a GIS format, so I download the CSV format.
I'm not going to include an Excel tutorial, but I convert the downloaded data so I get something that looks like the following picture. Here, I create a match between a city code (gm_code
) and a wmo_code
.
Next, I create a new table 'regios
', right click on it and select import. Options: specify file, set to csv, specify it has a header, and the delimiter = ';'
Creating a New View with the Combined Geometry
Now it is time to create our new WMO geometries. Basically, this is just a 'simple' SQL query where you union the city boundaries, but as I am not very good in it, it took me some time to get it right. First, use a LEFT JOIN
where you use the previously created regions
table to match cities with wmo regions
.
CREATE OR REPLACE VIEW public.wmo AS
SELECT r.wmo_naam,
st_union(st_transform(g.geom, 4326)) AS geom,
count(g.gm_code) AS aant_gem,
sum(g.aant_inw) AS aant_inw,
sum(g.aant_man) AS aant_man,
sum(g.aant_vrouw) AS aant_vrouw,
round(100::numeric * sum(g.p_00_14_jr * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_00_14_jr,
round(100::numeric * sum(g.p_15_24_jr * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_15_24_jr,
round(100::numeric * sum(g.p_25_44_jr * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_25_44_jr,
round(100::numeric * sum(g.p_45_64_jr * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_45_64_jr,
round(100::numeric * sum(g.p_65_eo_jr * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_65_eo_jr,
round(100::numeric * sum(g.p_ongehuwd * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_ongehuwd,
round(100::numeric * sum(g.p_gehuwd * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_gehuwd,
round(100::numeric * sum(g.p_gescheid * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_gescheid,
round(100::numeric * sum(g.p_verweduw * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_verweduw,
round(100::numeric * sum(g.p_west_al * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_west_al,
round(100::numeric * sum(g.p_n_w_al * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_n_w_al,
round(100::numeric * sum(g.p_marokko * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_marokko,
round(100::numeric * sum(g.p_ant_aru * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_ant_aru,
round(100::numeric * sum(g.p_surinam * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_surinam,
round(100::numeric * sum(g.p_turkije * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_turkije,
round(100::numeric * sum(g.p_over_nw * g.aant_inw / 100::numeric) / sum(g.aant_inw)) AS p_over_nw,
sum(g.auto_tot) AS auto_tot,
sum(g.auto_hh) AS auto_hh,
sum(g.auto_land) AS auto_land,
sum(g.bedr_auto) AS bedr_auto,
sum(g.motor_2w) AS motor_2w,
sum(g.opp_tot) AS opp_tot,
sum(g.opp_land) AS opp_land,
sum(g.opp_water) AS opp_water
FROM regios r
LEFT JOIN gem_2013_v1 g ON r.gm_code::text = g.gm_code::text
WHERE g.water::text = 'NEE'::text
GROUP BY r.wmo_naam;
Specifically, note the following line, in which the geometries are unioned, and transformed to SRID 4326
(Web Mercator projection used by most web mapping engines):
st_union(st_transform(g.geom, 4326)) AS geom
Most other lines are just computing an average of existing properties.
Converting the Results to GeoJSON
GeoJSON is increasingly becoming more popular to display geometry data on the map, but it is only partially supported by PostGIS (i.e., you need to do some extra work to get it right). Fortunately, someone else already wrote a tutorial on this, and I only needed to change three things (marked in bold):
- I limit the number of digits to 6
- I've added the properties that I wished to export
- I've specified the view that I was using
Run the following query in your SQL window to convert the output to GeoJSON.
SELECT row_to_json(fc)
FROM (SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
FROM (SELECT 'Feature' As type
, ST_AsGeoJSON(lg.geom, 6)::json As geometry
, row_to_json((SELECT l FROM (SELECT aant_gem, aant_inw, aant_man, aant_vrouw, _
p_00_14_jr, p_15_24_jr, p_25_44_jr, p_45_64_jr, p_65_eo_jr, p_ongehuwd, p_gehuwd, p_gescheid, _
p_verweduw, p_west_al, p_n_w_al, p_marokko, p_ant_aru, p_surinam, p_turkije, p_over_nw, _
auto_tot, auto_hh, auto_land, bedr_auto, motor_2w, opp_tot, opp_land, opp_water) As l
)) As properties
FROM wmo As lg ) As f ) As fc;
As the resulting GeoJSON is very big, it cannot be displayed in the regular SQL window output, so I've selected to output the results to file using the menu 'Query|Export to file', where I specified the file name, no quoting and no column names. All being well, you should end up with a very big (15MB) GeoJSON file.
Simplifying the GeoJSON Data
Clearly, a 15MB GeoJSON file is not going to work on a SmartPhone, and also regular browsers will not be very thrilled about it. Fortunately, there is a tool that can help us simplify the GeoJSON (or Shape or TopoJSON) file, mapshaper. Although you can use it online for free, I was not very keen on uploading 15MB, so I've installed the local version using the Node Package manager.
npm install –g mapshaper
And run it with:
?mapshaper-gui
It opens up a window allowing you to simplify your GeoJSON. Below, you see the before and after effect, and I could simplify the GeoJSON up to 0.34%.
The finally generated GeoJSON can be verified using another free online tool, GeoJSONLint, which was used to create the image at the top.
The final GeoJSON file size was reduced from 15MB to a mere 95KB, so that worked very well.