OSM data hacking for fun and profit - Part 2

Last time, we looked at how to get data from OSM, import it into PostgreSQL + PostGIS for querying. As well an overview of how to find some basic data and what some of the main tables contain.

Today, we will continue on with our journey and take a look at how we connect the results of our OSM analysis to a simple visualization powered by Leaflet.

The (basic) result should look like the following:

Data Extraction

Last time, we talked about some methods of extracting all railway + stations around the SF bay area, and imported data into the table osmfun. Today, we will continue and actually write the query to extract the data.

In order to extract all the data we would need, we can run the following query:

select osm_id, name, operator, ST_AsGeoJSON(ST_Transform(way, 4326)) as geojson
from planet_osm_point
where railway = 'station'
select osm_id, name, operator, ST_AsGeoJSON(ST_Transform(way, 4326)) as geojson
from planet_osm_polygon
where railway = 'station'
select osm_id, name, operator, ST_AsGeoJSON(ST_Transform(way, 4326)) as geojson
from planet_osm_line
where railway = 'rail'
select osm_id, name, operator, ST_AsGeoJSON(ST_Transform(way, 4326)) as geojson
from planet_osm_line
where railway = 'subway'

There are several features of the above query that may need a bit of explanation. If you are not familiar with PostGIS, two functions will look strange. ST_Transform transform the SRID of the geometry to one which we will standardize our geospatial data on (4326). ST_AsGeoJSON transform the internal geometry representation to a popular JSON based exchange format called GeoJSON. Both functions are extremely useful, otherwise we will have to do quite a bit of work in a programming language to get the data to where we want it.

The first query (before the first UNION) grabs the station data that are less detailed/smaller (designated as a point on a map, which is basically a long/lat pair); The second query (between the first and second UNION) gets the bigger stations (a polygon on a map), and the third query grabs the entire rail line dataset.

There are several ways of exporting the data in a reasonable format (such as CSV). If you are using psql command line tool, take a look at the COPY command. If you are using a graphical SQL tool, there should be ways of exporting the data in CSV you can look up.

Working with Leaflet

Now that we have our data extracted, we can finally start working with it in Leaflet.

Data Formatting

Since Leaflet expects GeoJSON, and we have CSV with GeoJSON as a string field, we need to parse the CSV and convert it into the right format.

For loading CSV via AJAX, we will use D3. We will not be using any other features of D3 in this article, and we can use a more specialist parser such as PapaParse. Using D3 is helpful because for more elaborate visualization, learning and using D3 is extremely powerful and helpful.

The following code should be able to grab the data file + convert the data into the correct format:

d3.csv('sf_train_osm.csv', function(datain){
  var data = {
    type: 'FeatureCollection',
    features: datain.map(function(d){
      var gj = JSON.parse(d.geojson)
        var geojson = {
          geometry: gj,
          properties: d,
          id: d.osm_id,
          type: 'Feature'
      return geojson

Basic map display

For displaying map tiles, we have several options.

First and the recommended option is to use a hosted service such as Mapbox. The service give you a good balance between good default color palette, easy integration, and useful free tier.

For demos and playing around, you can also use OSM tiles. However, the tiles are subject to a excess usage policy, and the default colors are less than ideal for overlaying information on top.

The last and cheapest and the most time consuming option is to run your own tile server. Which you can do by investigating Mapnik.

For our purposes, we will use Mapbox since it has a very nice light theme. Integrating the tiles with Leaflet should be very simple:

var tok = 'YOUR-TOKEN-HERE';
var mapboxTiles = L.tileLayer('https://api.mapbox.com/v4/mapbox.light/{z}/{x}/{y}.png?access_token=' + tok, {
  attribution: '© <a href="https://www.mapbox.com/map-feedback/">Mapbox</a> © <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a>'

var map = L.map('map')
  .setView([42.3610, -71.0587], 15);

basic map

Overlaying the data

For overlaying the data, leaflet provides an handy layer type called geoJson. Which understands how to convert geojson objects such as Polygon, LineString and Points to corresponding svg elements. The code to convert Geojson to SVG is as follows:

var geojsonMarkerOptions = {
  radius: 8,
  fillColor: "#ff7800",
  color: "#000",
  weight: 1,
  opacity: 1,
  fillOpacity: 0.8
L.geoJson(data, {
  style: {
    weight: 2,
    opacity: 1
  pointToLayer: function (feature, latlng) {
    return L.circleMarker(latlng, geojsonMarkerOptions);

In particular, pointToLayer is special as it treats single points as a special case. Since everything else gets turned into a path element, but for a point, you get to control how it converts into a SVG element.

leaflet overlay

As seen above, all the railway line is represented, and the station points are all represented as orange circles.

You can see the full result here.

That’s it for this week! The result is still fairly basic, and there are many more things that could be done, such as cleaning up the railway data if you’d like to display a more schematic view of the rail lines. There are also some under construction stations, as well as tram line stations mistakenly displayed.