Monday 24 August 2015

Department for Transport

The Department for Transport releases road traffic estimates for Great Britain on an annual and quarterly basis, providing Annual Average Daily Flows (AADFs): the average number of vehicles passing a point on the road network each day over a year. Data is collected by the Department for Transport either through manual counts by trained enumerators or by automated traffic counters, with both methods classifying the traffic by type. The data for the sample period is then expanded into flow figures for publication. The visualisation displays the road use by vehicle type, as a number of small multiples, with the main image displaying the totals for the whole of Great Britain.

According to the Department for Transport, road traffic for all vehicle types rose in 2013 by 0.4% when compared to the previous ten years. All vehicle types but one showed a decrease over the same period, with a significant drop of 11.2% in the traffic of Heavy Goods Vehicles (HGVs); the only exception was a major increase of 19.4% in Light Goods Vehicles (LGVs). The network of motorways and ‘A’ roads makes up only 2.4% of the road network, but carried 32.9% of all motor vehicle traffic and 65.6% of all HGV traffic (Department for Transport, 2014). A similar trend has been observed in London, where road traffic shows a decline in all vehicle types, with the exception of buses, coaches and cycling.

The Department for Transport also helpfully provides a shapefile with the check point attribute for joining to the raw data. Five classes were used in the classification, with graduated colours from dark blue to white and graduated line weights, set against a dark background colour to highlight the most travelled roads.

At the time of writing, graduated line thickness is not well supported in the QGIS d3MapRenderer plugin. However, visualising this with the plugin, without any zooming, can easily be done at this point with some tinkering in the color.css stylesheet to restore the line thickness.

The data for each year's AADF had been joined to the road shapefile, so adding a popup with a basic spline chart to visualise the changes in traffic volumes at each check point would be a nice addition. To do this on the Viz tab of the plugin, simply select each year's AADF attribute, click the add button, supply the name of "AADF" in the resulting dialogue, and then supply a set of labels (years 2000 through to 2013) for the x-axis.

Annual Average Daily Flows from the Department for Transport. Pan and zoom around the map, click on a polygon to view the details of the AADF.

However, with the addition of zooming, the borders of polygons (or width of polylines) are rescaled at each zoom level. To do this for each individual element of a layer would be too costly. Instead the plugin rescales the borders of polygons and width of polylines at the level of the entire QGIS layer. Graduated line thicknesses are therefore problematic when zooming. As a workaround to this issue, there are two choices:

  1. Remove the JavaScript that performs the re-scaling (preferred method) and manually patch up the color.css file
        // Zoom/pan 
        function onZoom() {
          hideTip();
    
          vectors.attr("transform", "translate("
            + d3.event.translate
            + ")scale(" + d3.event.scale + ")");
    
          /*vector0.style("stroke-width", 0.26 / d3.event.scale);*/
    
        }
    
  2. Or, duplicate the original layer five times (for each symbol in the graduated classification scheme) within QGIS and filter each duplicated layer to display just the range of features for a given symbol in the chosen graduated classification. Then all five layers can be exported with the layer with the greatest geographic range chosen as the main layer. The plugin maintains the filter applied to each layer during the export. Of course, that introduces problems with binding popup information as none of the layers form the complete dataset…
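A possible middle ground between the two workarounds, sketched here purely as an illustration and not as something the plugin does, is to keep one base width per graduated class and divide each by the zoom scale. That way only five values are recalculated per zoom event rather than one per feature. The class widths and the function name below are made up for the example.

```javascript
// Hypothetical base line widths (in pixels) for the five graduated classes.
var classWidths = [0.26, 0.5, 1.0, 1.5, 2.0];

// Return the stroke width for each class at the given zoom scale,
// so that lines keep their on-screen thickness as the map is scaled.
function scaledWidths(widths, scale) {
  return widths.map(function (w) {
    return w / scale;
  });
}

// In a d3 v3 zoom handler, `scale` would be d3.event.scale.
console.log(scaledWidths(classWidths, 2));
```

Each resulting width would then be applied to the elements of its class, which assumes the exported paths carry a class name per graduated symbol.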

Barnacle Geese Migration

Who doesn't like wildlife data? Also, the point and polyline data provides a means of testing the support of these data types in the QGIS d3MapRenderer plugin.

The MoveBank website stores animal tracking data which can be downloaded (subject to terms and conditions) as shapefiles for use within QGIS. The data used here, Migration timing in Barnacle Geese, comes from studies by Kölzsch et al. The data has been published by the Movebank Data Repository with DOI 10.5441/001/1.ps244r11. See www.datarepository.movebank.org/handle/10255/move.385. For details about the project, data use etc. please contact the PI: michael.exo@ifv-vogelwarte.de.

The data comes in six separate shapefiles, for points of timestamped GPS coordinates, and polylines linking the individuals being tracked from the Barents Sea, Greenland and Svalbard studies. After merging the point data into a single shapefile, and repeating this process for the polylines, the plugin can be used to export this data to a d3 web map along with a suitable Europe outline. The QGIS project background colour is also retrieved by the plugin and used as the background for the entire visualisation.

Note: Merging the data is not strictly necessary, as the plugin would be happy with six shapefiles. However, this allowed the line data to be selected as the main layer during the export. The main layer is used to set the initial extent of the visualisation, though the user may pan (if the pan and zoom option is selected) to view parts of the original shapefiles which were hidden.

The resulting visualisation used the Lambert Conformal Conic projection. The plugin calculates the centre point and extent from the maximum bounds of all the shapefiles included in the export. For conic projections in d3 the “parallels” parameter sets the top and bottom (latitude extents) of the projection, and the “rotate” and “center” parameters set the central longitude and latitude respectively. Again the plugin calculates this from the maximum bounds of all the shapefiles.
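As a rough sketch of how those three parameters might be derived from a bounding box (the plugin's own calculation may well differ, and the function name and sample bounds are invented for illustration):

```javascript
// bounds: [[minLon, minLat], [maxLon, maxLat]] in degrees.
// Assumes the box does not cross the antimeridian.
function conicParameters(bounds) {
  var minLon = bounds[0][0], minLat = bounds[0][1];
  var maxLon = bounds[1][0], maxLat = bounds[1][1];
  var centreLon = (minLon + maxLon) / 2;
  var centreLat = (minLat + maxLat) / 2;
  return {
    parallels: [minLat, maxLat], // bottom and top latitude extents
    rotate: [-centreLon, 0],     // rotate the globe to the central longitude
    center: [0, centreLat]       // then centre on the central latitude
  };
}

// Hypothetical bounds, not the real extent of the geese data.
var p = conicParameters([[-10, 50], [30, 80]]);
console.log(p);
```

With d3 v3 the result could then be passed along the lines of d3.geo.conicConformal().parallels(p.parallels).rotate(p.rotate).center(p.center).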

Barnacle Geese Migration tracks. Pan and zoom around the map.

Glasgow Health Inequalities

To look further at health inequalities and continue testing the QGIS d3MapRenderer plugin, the Scottish Index of Multiple Deprivation health domain was downloaded from the Scottish Government Website along with vector layers for Scottish Neighbourhood Statistics data zones, administrative and parliamentary boundaries from the UK Data Service Census Support Website.

The health domain rank is comprised of a number of indicators, namely: a standardised mortality ratio; hospital episodes related to alcohol use; hospital episodes related to drug use; comparative illness factor; emergency admissions to hospital; proportion of population being prescribed drugs for anxiety, depression or psychosis; and, proportion of live singleton births of low birth weight. The SIMD then weights each indicator to produce a combined rank for the health of the population at the level of Scottish Neighbourhood Statistics data zones. The SIMD health domain data was imported as a delimited text layer into QGIS, joined to the Scottish Neighbourhood Statistics data zones before being clipped to the boundary of Glasgow.

Building outlines for Glasgow were downloaded from the Ordnance Survey OS Open data VectorMap™ and were used to clip the data further so that only the built up areas of Glasgow are visible. Although the building outlines will include warehouses, offices and other large buildings, this is an easy way to overcome the problems of unstandardized choropleth maps relating to human geography where physically large but sparsely populated areas dominate the map. The SIMD health domain rank was then used to create a graduated colours map with ten categories and a quantile classification method, a method often used in epidemiology. This method allows comparisons to be made across the city rather than the health rank within the entire country.

The plugin can then be used to export the building outlines, with a legend and tooltip. No simplification is used during the export as this would introduce distortion of the buildings. A quick alteration of the tooltip html and the map is ready to upload onto a web server.

Deciles of health rank for Glasgow, with data from the Scottish Index of Multiple Deprivation, 2012. Contains Ordnance Survey data © Crown copyright and database right 2013. Pan and zoom around the map, click on a building to view the health domain data.

Sunday 23 August 2015

d3MapRenderer v0.8 released

v0.8 of the d3MapRenderer plugin to QGIS released.

Please visit the plugin homepage for details on how to install the pre-requisites of Node.js and the topojson package.

Unicode support added. Validation added to check the output directory exists.

Sub-directory for output files now created with a time stamped name.

Linux now supports non-ASCII chars in the output directory, title, legend, popup information and charts.

On Windows the subprocess command on Python 2.7 creates a Windows command prompt which will not accept unicode characters (it has its own limited encoding). Therefore, on Windows only, the output directory is limited to ASCII characters. Title, legend, popup information and charts support unicode.

Just to test out these unicode changes, here is a map of the administrative world regions from the Natural Earth data, in the Mollweide projection. No attempt has been made to alter the default colours from QGIS, or the resulting tooltip template. Pan and zoom around the map, click on a polygon to view the names and alternatives from the Natural Earth dataset.


Now could some brave soul with a Mac let me know if the plugin works? I haven't got a Mac to test it on...

Thursday 20 August 2015

Manhattan Building Heights

As part of work to stress test the implementation of d3.js in the plugin, the MapPluto tax lot database from the Department of City Planning, City of New York, with almost 43000 records, seems like a good choice. As the data details building footprints it would be unwise to use the simplification methods built into d3 (or any simplification method), so this should result in a large topojson file. The database merges land use and geographic data with features from the Digital Tax Map. The metadata is also excellent, so you aren't wondering what each attribute is supposed to represent.

Although some attributes are incomplete, the number of floors is recorded for all features. To visualise the height of the buildings it should be possible to use the number of floors from the dataset and apply some logic to figure out building height. Coupled with the building use classification it is possible to use the Council on Tall Buildings and Urban Habitat guidelines for calculating building height. A bit of head scratching and flicking through the MapPluto metadata resulted in a SQL fragment for use in the QGIS Field Calculator:

CASE WHEN substr(coalesce("BldgClass", 'X'), 0, 1) = 'O' THEN 
 (3.9 * toint("NumFloors")) + 11.7 + (3.9 * (toint( "NumFloors") / 20))
WHEN substr(coalesce("BldgClass", 'X'), 0, 1) IN ('A', 'B', 'C', 'D') THEN
 (3.1 * toint("NumFloors")) + 7.75 + (1.55 * (toint( "NumFloors") / 30))
ELSE 
 (3.5 * toint("NumFloors")) + 9.625 + (2.625 * (toint( "NumFloors") / 25))
END

Although this algorithm seemed the most accurate when comparing a randomly selected set of buildings with the online records, it is not perfect. There is no way of knowing the height of the building's spire, which accounts for quite a large proportion of the tallest buildings. In the end I manually "fixed up" the six buildings I knew were actually over 300m (because I wanted to highlight them), but this leaves me feeling somewhat dirty.
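For anyone wanting to sanity check the numbers outside QGIS, the same logic can be sketched in JavaScript. This is a loose translation rather than the expression itself: toint is approximated with Math.trunc, the division is treated as floating point, and the function name is hypothetical.

```javascript
// Estimate building height in metres from the building class and floor count,
// mirroring the three branches of the Field Calculator expression above.
function estimateHeight(bldgClass, numFloors) {
  var floors = Math.trunc(Number(numFloors)); // assumption: stands in for toint()
  var first = (bldgClass || 'X').charAt(0);   // missing class falls through to the default branch

  if (first === 'O') {
    // Office buildings
    return (3.9 * floors) + 11.7 + (3.9 * (floors / 20));
  }
  if ('ABCD'.indexOf(first) !== -1) {
    // Residential building classes
    return (3.1 * floors) + 7.75 + (1.55 * (floors / 30));
  }
  // Everything else (mixed or unknown use)
  return (3.5 * floors) + 9.625 + (2.625 * (floors / 25));
}

console.log(estimateHeight('O4', 20)); // a 20 floor office building
```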

For the classification by building height, a graduated scheme with natural breaks was used, with the penultimate class boundary overridden at 300 metres, along with a gothic colour scheme and a highlight of red for the tallest buildings. For the background, the New York City Coastal Boundary and a United States Census TIGER/Line Shapefile for New Jersey work well.

Manhattan building heights in metres. Pan and zoom around the map, click on a polygon to view the details of the building.

After exporting via the plugin choosing options for a legend and information popup, a quick change is all that is needed to the tooltip template before the results are published online.

Obesity in the USA (2015)

In order to test the integration with the c3.js charting library with something more interesting than the basic line or bar charts (not that there’s anything wrong with these charts), and to return to visualising health data in some way, which is partly the premise of the dissertation, the excellent open data set produced by County Health Rankings & Roadmaps for the United States was downloaded.

An oft-reported problem in the United States is increasing obesity, with reports in the Guardian backed up by various official sources on the internet from the Centers for Disease Control and Prevention and the National Institute of Diabetes and Digestive and Kidney Diseases amongst others, all predicting increasing health problems for those with obesity.

Deciles of obesity, by United States county, for the year 2015. Pan and zoom around the map, click on a polygon to view the percentage of adult obesity within the county.

The County Health rankings data includes the Federal Information Processing Standards county code which uniquely identifies counties within the United States. This makes it a simple process to join the data within QGIS to one of the County Cartographic Boundary Shapefiles from the United States Census Bureau.

Did I say simple? The only problem with numeric codes starting with a zero is that during the import of the data into QGIS (and another leading GIS provider) the code is converted to a numeric field and the leading zero is lost. No number starts with a zero other than zero itself. The county of Autauga, Alabama now has a code of 1001 instead of 01001, and any attempt to join the data to the GEOID field in the shapefile from the Census Bureau will result in gaps in the resulting data set. There may be other ways to fix or avoid this problem, but simply adding a new text field with the QGIS Field Calculator which pads the data is the approach I took. The imported CSV file and the county boundaries can then be "simply" joined.

For reference the SQL fragment used to achieve the padding is as follows:

lpad(  tostring( "FIPS" ), 5, '0')
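Should the same fix ever be needed on the client side, for example when joining the CSV to the topojson in the browser, a JavaScript equivalent might look like the sketch below. The function name is hypothetical and it relies on String.prototype.padStart, so a reasonably modern browser or Node.js is assumed.

```javascript
// Restore the leading zeros lost when a FIPS county code is read as a number.
function padFips(code) {
  return String(code).padStart(5, '0');
}

console.log(padFips(1001)); // Autauga, Alabama
```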

Classifying the layer with a graduated quantile (equal count) method, by the obesity attribute, into 10 classes splits the data into equal deciles. The addition of a blue and red colour scheme from colorbrewer makes for a nice looking map.
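To make the equal count idea concrete, here is a minimal sketch of how decile class boundaries could be picked from a list of values. It is a simple rank-based approach and not necessarily how QGIS calculates its quantile breaks; the function name is made up, and a non-empty list of numbers is assumed.

```javascript
// Return the upper boundary of each class so that every class holds
// (as near as possible) the same number of values.
function quantileBreaks(values, classes) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var breaks = [];
  for (var k = 1; k <= classes; k++) {
    // Index of the last value that belongs to class k.
    var index = Math.ceil(k * sorted.length / classes) - 1;
    breaks.push(sorted[index]);
  }
  return breaks;
}

// Twenty hypothetical obesity percentages split into ten classes of two.
var sample = [];
for (var i = 20; i >= 1; i--) { sample.push(i); }
console.log(quantileBreaks(sample, 10));
```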

The final map was created using the plugin, d3's Albers USA projection, adding the county and state attributes to the popup information, and choosing a Gauge chart type. This c3.js chart type expects a single attribute in the data range, and by default that attribute is expected to be a percentage, otherwise strange results will be visualised, though you can change this behaviour. The percentage of adult obesity for each county is already present in the dataset and is therefore used in the export. No field calculations are necessary.

After the plugin has finished the export the tooltip template can be tidied to remove the field names and replace the html table with a simple div element. Finally, the colours of the gauge can be changed according to the data value, rather than the default colour, by adding a pattern and threshold to the JavaScript which replicates html colour codes from colorbrewer and the top value for each class in the graduated style used within QGIS:

color: { 
  pattern: ['#053061', '#2166ac', '#4393c3', '#92c5de', '#d1e5f0', '#fddbc7', '#f4a582', '#d6604d', '#b2182b', '#67001f'], 
  threshold: { 
    values: [24.99, 27.99, 28.99, 29.99, 30.99, 31.99, 32.99, 33.99, 35.99, 48] 
  } 
},
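The intent of the pattern and threshold can be sketched as a plain lookup: find the first class whose top value is not exceeded and use the colour at the same position. This is only an illustration of the idea with a hypothetical function name; c3.js has its own internal comparison which may treat the boundary values differently.

```javascript
var pattern = ['#053061', '#2166ac', '#4393c3', '#92c5de', '#d1e5f0',
               '#fddbc7', '#f4a582', '#d6604d', '#b2182b', '#67001f'];
var thresholds = [24.99, 27.99, 28.99, 29.99, 30.99, 31.99, 32.99, 33.99, 35.99, 48];

// Return the colour of the first class whose top value is >= the given value.
function colourFor(value) {
  for (var i = 0; i < thresholds.length; i++) {
    if (value <= thresholds[i]) {
      return pattern[i];
    }
  }
  // Anything above the last threshold takes the final colour.
  return pattern[pattern.length - 1];
}

console.log(colourFor(30.5));
```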

Although the data has not been standardised, the most obese counties can be easily picked out.

Wednesday 19 August 2015

Supermarket Voronoi

During the course of the MSc I came across this excellent open source dataset from Geolytix containing the locations of supermarkets within England, Scotland, Wales and Northern Ireland. I've used a previous version of this dataset to create Voronoi diagrams, and it seemed natural to use it to test a QGIS plugin I've been writing as part of the dissertation.

The plugin converts shapefiles to topojson and creates all the necessary boilerplate code required for a d3 map on an HTML page. Although d3 comes armed with its own Voronoi functions, I want the plugin to take the carefully crafted styling within QGIS and output it to d3, taking advantage of the SVG functionality in modern web browsers.

For those that don't know, a Voronoi polygon is the area closest to one point, in Euclidean (straight line) space, out of a given set of points.
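Put another way, a location falls inside the Voronoi polygon of whichever point is nearest to it. A small sketch of that test follows, with made-up coordinates and a hypothetical function name; it assumes planar (projected) coordinates rather than longitude and latitude.

```javascript
// Return the index of the site nearest to (x, y) by straight line distance.
// The location (x, y) lies within that site's Voronoi polygon.
function nearestSite(sites, x, y) {
  var best = -1;
  var bestDistance = Infinity;
  for (var i = 0; i < sites.length; i++) {
    var dx = sites[i][0] - x;
    var dy = sites[i][1] - y;
    var distance = dx * dx + dy * dy; // squared distance is enough for comparison
    if (distance < bestDistance) {
      bestDistance = distance;
      best = i;
    }
  }
  return best;
}

// Three hypothetical supermarkets.
var supermarkets = [[0, 0], [10, 0], [0, 10]];
console.log(nearestSite(supermarkets, 8, 1));
```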

So, once the data is downloaded and imported into QGIS as a delimited text layer, the data can be processed using the "Voronoi polygons" function in QGIS's Geometry toolset.

The new layer returned by the function has polygons for each supermarket location, the outline of the country and major cities clearly visible with the concentration of polygons in urban areas, and polygons at the coastline extending into infinity at the edge of the map.

Using a data set of United Kingdom regions, those of England, Scotland and Wales were dissolved to form a single outline of the United Kingdom mainland with which to clip the Voronoi layer, resulting in polygons restricted to the coastline.

Categorising the Voronoi diagram according to retailer, and using a combination of colorbrewer colour schemes (due to the number of retailers in the data set), a reasonable map can be created in a short space of time. Some colour changes to the original Voronoi diagram make for a nice background to the categorised map.

Creating a d3 map of this data using the plugin, whilst specifying a legend and popup information, can be accomplished within minutes. All that remains is to tidy up the tooltip template html so that the attribute names are replaced with more understandable words.

Over twenty thousand Voronoi polygons are rendered in the browser (there are two layers including the background). An experiment to add the QGIS point layer from the original data import caused a substantial delay on rendering, zooming and panning around the map on an Android device. Unsurprising given the volume of data.

Must remember to get around to comparing this to d3’s built in d3.geom.voronoi client-side rendering, once the dissertation is out the way.

Pan and zoom around the map, click on a polygon to view the details of the supermarket.

Wednesday 12 August 2015

Open Flights

WARNING: Large data set. Will require something with a decent processor to render.

Largely because I'm procrastinating, rather than writing up my dissertation, I've spent the afternoon prototyping the client side changes required to support orthographic projections in the d3MapRenderer plugin.

What better to trial it with than the OpenFlights data? It will also push the boundaries of what is possible. Download the data and add a Geom column with a linestring following Alasdair's guidelines, add the country shapefile from Natural Earth and then use the plugin to export the data with a simple projection such as Winkel Tripel.

OpenFlights data in the Winkel Tripel projection with Natural Earth land data. Pan and zoom around the map.

Ok, so let's try and get this into a globe based on Mike Bostock's Interactive Orthographic example. With the OpenFlights example I want to see clearly what happens to the flight paths (just out of interest) so I'll also get rid of the background (for now). The first thing to change is to add a new JavaScript library:

<script src="js/d3.geo.zoom.js"></script>

Then we need to change the projection, which now becomes:

//Projection
var projection = d3.geo.orthographic()
.scale(250)
.translate([width / 2, height / 2])
.clipAngle(90);

The scale will need to be chosen carefully, in order to fill the containing div element to achieve the desired effect. To that end, get rid of the re-projecting that occurs further down the JavaScript. The d3MapRenderer plugin uses this to set the scale and translation of the main layer to fit nicely in the container. Not necessary in this prototype, it will just get in the way, so remove the following lines of code:

// Refine projection
var b, s, t;
projection.scale(1).translate([0, 0]);
var b = path.bounds(object2);
var s = .95 / Math.max((b[1][0] - b[0][0]) / width, (b[1][1] - b[0][1]) / height);
var t = [(width - s * (b[1][0] + b[0][0])) / 2, (height - s * (b[1][1] + b[0][1])) / 2];
projection.scale(s).translate(t);
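For reference, the arithmetic in that block can be pulled out into a standalone function to see what it produces. The sketch below takes the bounds that path.bounds would return for a unit-scale projection; the function name and the sample numbers are made up.

```javascript
// b: [[x0, y0], [x1, y1]] bounding box of the main layer at scale 1.
// Returns the scale and translation that fit the box into the container
// with a 5% margin, as in the removed "Refine projection" code.
function fitToBounds(b, width, height) {
  var s = 0.95 / Math.max((b[1][0] - b[0][0]) / width,
                          (b[1][1] - b[0][1]) / height);
  var t = [(width - s * (b[1][0] + b[0][0])) / 2,
           (height - s * (b[1][1] + b[0][1])) / 2];
  return { scale: s, translate: t };
}

// A hypothetical 2 x 1 box in a 960 x 500 container.
console.log(fitToBounds([[0, 0], [2, 1]], 960, 500));
```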

Now, new variables and functions are needed as we're going to use d3.geo.zoom instead of d3.behavior.zoom. The new variables are as follows:

var λ = d3.scale.linear()
  .domain([0, width])
  .range([-180, 180]);

var φ = d3.scale.linear()
  .domain([0, height])
  .range([90, -90]);
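These two scales simply map a mouse position in pixels onto a rotation in degrees. A dependency-free sketch of the same mapping, using a hypothetical linearScale helper and an assumed 960 by 500 container, shows the values they produce:

```javascript
// Minimal stand-in for d3.scale.linear() with a two-point domain and range.
function linearScale(domain, range) {
  return function (x) {
    var t = (x - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var width = 960, height = 500; // assumed container size

var lambda = linearScale([0, width], [-180, 180]); // horizontal position to longitude
var phi = linearScale([0, height], [90, -90]);     // vertical position to latitude

// The middle of the container corresponds to no rotation at all.
console.log(lambda(width / 2), phi(height / 2));
```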

The zoom behaviour is altered to:

svg.call(d3.geo.zoom()
  .projection(projection)
  .on("zoom", onZoom));

Differing slightly from the behaviour in Mike's example, I want the user to grab the globe and rotate it, rather than the mouse simply pass over it. So we need a behaviour adding to the code to achieve this:

var dragging = 0;
svg.on("mousedown", function() { dragging = 1 })
  .on("mouseup", function() { dragging = 0 })
  .on("mousemove", function() {
    if(dragging == 1){
      var p = d3.mouse(this);
      projection.rotate([λ(p[0]), φ(p[1])]);
      svg.selectAll("path").attr("d", path);
      d3.event.preventDefault(); // disable text dragging
    }
  });

The final change required is to simplify the zoom function, and get it to redraw the layer rather than trying to re-scale all the flight paths.

// Zoom/pan 
function onZoom() {
  svg.selectAll("path").attr("d", path);   
}

This then displays the flights draped around a sphere, which can be rotated and zoomed into. However, the result is somewhat laggy, difficult to control and not particularly satisfactory. Certainly this is due to the volume of data in the OpenFlights dataset, as Mike's original example is quite smooth.

OpenFlights data in the Orthographic projection, with the land mass removed. Pan and zoom around the globe.

A second attempt is needed. Jason Davies' Rotate the World example has a slightly different mechanism, defining a new drag behaviour and method of setting the rotation angle, which essentially replaces the variables λ and φ along with my attempts at rotating the projection. It is replaced by:

var drag = d3.behavior.drag()
  .on("drag", function() {
    for (var i = 0; i < projections_.length; ++i) {
      var projection = projections_[i],
          angle = rotate(projection.rotate());
      projection.rotate(angle.rotate);
    }
    d3.select("#rotations")
      .selectAll("svg").each(function(d) {
        d3.select(this).selectAll("path").attr("d", d.path);
      });
  });
 
function rotate(rotate) { var angle = update(rotate); return {angle: angle, rotate: rotate}; }

vectors.selectAll(".overlay").call(drag);

This is much easier to control, but still suffers from lag due to the amount of OpenFlights data. This is looking like a good template for adding Orthographic projections to the d3MapRenderer plugin in a later version.

OpenFlights data in the Orthographic projection, with the land mass put back and a better control mechanism. Pan and zoom around the globe.