In mainland Britain, you are never more than 34 miles from a pub.

This and other geo-factoids available from my new web service. (I've named it "Feet From A Rat" in tribute to this hoary old urban legend.)
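A factoid like this boils down to a "furthest point from the nearest feature" calculation. Here is a minimal sketch of that idea, not the web service's actual code: it assumes you already have pub locations as (lat, lon) pairs and a set of sample points covering the area of interest. The function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def miles_to_nearest(point, features):
    """Distance from one (lat, lon) point to the closest feature."""
    return min(haversine_miles(point[0], point[1], lat, lon) for lat, lon in features)


def worst_case_miles(sample_points, features):
    """The 'never more than N miles' figure: the largest nearest-feature
    distance over all sample points."""
    return max(miles_to_nearest(p, features) for p in sample_points)
```

The brute-force search is fine for a sketch; for tens of thousands of pubs a spatial index would be the sensible next step.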

Sunday 8th June 2014 | openstreetmap | Permalink

OpenStreetMap UK: what should we do this year?

As a contributor to OpenStreetMap, one thing I've been wondering recently is what sort of map data we should collect for the UK, now that the coverage has got so good. When you're out and about with a printed-out map and a pen, it's very rare to find anything significant that isn't mapped already - sometimes a new street or a missing church. You could pour your time into mapping increasingly obscure things, whatever you're interested in. But what would be the most useful things to map in the UK over the coming year? Things that are not just interesting to map but could be practically useful to people? Some thoughts:

  • Addresses. I kind of don't like mentioning this, because I find it boring to map addresses, and I'd much rather that the UK address data magically appeared from some big open-data source. But addresses are obviously really useful for so many things: routing, looking up shops, etc. Coincidentally, Simon Poole (chair of OSM Foundation) also says address collection is the thing we need, for OSM in general not just UK.
  • Postcodes. In the UK postcodes are really important for satnav routing etc. For some reason I suspect that collecting postcodes could be less mind-numbing than collecting addresses, but just as useful. See Jerry's blog about UK postcodes in OSM for an analysis of where we are with postcodes... about 3% of them. As he says, we need to do better than this - so how best to collect them?
  • Footpaths. Really important for planning walking routes, whether in the city or the countryside. We also need to mark when footpaths have steps or are otherwise no good for wheelchairs/prams. (It's also handy to know when footpaths are full-blown rights of way, or just "permissive" access.) In his speech at State Of The Map 2013, Peter Eastern mentioned that they estimated UK footpath data was still pretty incomplete. I often use OSM for planning walking routes - it has loads of footpaths that no other services have, but I do still often go walking somewhere and find new footpaths that aren't in there yet. I don't know how we could specifically push for more footpath mapping - all I will say is please help us and map walking routes :)

Some notes on other things, which I'm less sure are vital:

  • Buildings. I know when we've been doing London mapping meet-ups, Harry Wood has mentioned that OSM's buildings coverage for London is rather patchy. You can see it on the map - there are pockets full of buildings mapped, and large pockets with none. But... is this a bad thing? What would we want buildings mapped for? I know they're useful in fancy 3D map renderings, but for more practical purposes...? I'm guessing it's not that crucial, though it might relate a bit to the address mapping.
  • Shops. It's great to have shops, restaurants, pubs and other local businesses in OSM. Once you start mapping these, though, you notice there's quite a rapid turnover - your high street probably gains/loses a shop every 3 months or so, at a wild guess. So this data is useful, but it's less permanent than all the other stuff I've mentioned so far. I'd suggest there's no point having a big push to map every shop in every high street, we just need to let the momentum build to a point where that happens under its own steam.
  • Postboxes. Again Jerry has a detailed breakdown, and says we need to map them more. Plus Robert Whittaker has some data mining tools about postbox completeness. On the other hand, is it really that urgent to map postboxes? It doesn't feel anywhere near as critical as mapping addresses, walking routes, etc. The only use case I can think of is "where's the nearest postbox?" which is rarely a critical matter.
  • GPX traces. After MapBox published their beautiful rainbow GPS map tiles which provide a lovely way to see the GPS traces contributed by the community, I noticed at least two villages where there were basically zero traces uploaded. Are GPS traces important to UK mapping? The coverage of the aerial imagery is good, and generally quite well GPS-aligned, so... do we need more GPS traces around the UK? I genuinely don't know, and would be interested to find out either way.
  • Grit bins. Something I noticed a couple of winters ago - it would be really handy to have every grit bin mapped: one day, when it's freezing cold outside, all the grit bins are hidden under a foot of snow, and you need to clear a driveway, it could be really handy. That's just one little thing that I don't think anyone has particularly focussed on, so a little call out - please map amenity=grit_bin when you see them!

I'd be grateful for any feedback on the thoughts above, including other things that could be priorities. Just one UK mapper's perspective.

Wednesday 1st January 2014 | openstreetmap | Permalink

Rejigging the OpenStreetMap browse page

On OpenStreetMap, I find the /browse/ pages really useful for getting a quick summary of an "object" in the map. Each page shows when the object was last edited, shows all its tags, etc.

However, I have two issues with it:

  • The use of space isn't ideal. There's plenty of unused space which I don't think is entirely deliberate (of course whitespace is good sometimes) - and the interesting information often gets pushed down below the fold as a result.
  • The browse pages have enough information that they should be generally useful, not just as a diagnostic tool for mappers, but maybe for people who want to share the details of the pub they're going to, or whatever. The main impediment to this is that the initial impact of the page is fairly unfriendly and technical.

I believe the layout can be rearranged in a way which doesn't remove any of the information that mappers need, but which makes the browse pages more accessible and friendly and hopefully generally useful. This would encourage more casual users to see the tags we have, and... fix them :)

So the main objectives are:

  • Make the main heading a bit more approachable, making the "name" (where available) a bit more primary than it currently is.
  • Make the "Tags" section a little bit more visually primary (tags are more approachable to newcomers than changeset details).
  • Make the "last edited" info more compact - it doesn't need to be a four-row table, but can be a single sentence: "Last edited [date] by [user] (version [v] in changeset [c])". It makes sense to put the "View history" link at the end of this too. Also, it's more approachable to show the last-edited date as something like "2 months ago", with the full date available as a tooltip.
  • Try not to do anything that prevents experienced mappers from getting a visual overview of the more technical info, such as history, XML link, edit links etc.
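To make the "last edited" sentence concrete, here is a rough sketch of the formatting logic. It is an illustration in Python, not the code in the "browsepage" branch, and the function names and the 30-day month / 365-day year approximations are my own assumptions.

```python
from datetime import datetime, timezone

# Coarse unit sizes in seconds; months and years are deliberately approximate.
UNITS = [
    ("year", 365 * 24 * 3600),
    ("month", 30 * 24 * 3600),
    ("day", 24 * 3600),
    ("hour", 3600),
    ("minute", 60),
]


def time_ago(then, now=None):
    """Turn a timestamp into a friendly phrase such as '2 months ago'."""
    now = now or datetime.now(timezone.utc)
    seconds = int((now - then).total_seconds())
    for name, size in UNITS:
        count = seconds // size
        if count >= 1:
            plural = "" if count == 1 else "s"
            return f"{count} {name}{plural} ago"
    return "just now"


def last_edited_sentence(then, user, version, changeset, now=None):
    """Compact replacement for the four-row metadata table."""
    return (
        f"Last edited {time_ago(then, now)} by {user} "
        f"(version {version} in changeset {changeset})"
    )
```

The exact timestamp is not lost: in the page itself it would sit in a tooltip on the "2 months ago" text.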

Work so far is in my github branch called "browsepage". Here are some screenshots, in each case with "before" on the left and my version on the right:

A relation:

A way:

A node:

I really think the "Last edited N decades ago by Thor" is much more approachable than the current table of metadata. The other stuff I've done is less dramatic, but I like the way it gives a bit more priority to the tags and makes room for plenty of them in a screenful.

Update: someone asked if I could post how the pages look on small screens (i.e. phones) - here are screenshots, taken by resizing my Firefox window small enough that the small stylesheet kicks in:

Saturday 21st September 2013 | openstreetmap | Permalink

Diversity and OpenStreetMap

The big annual meetup of OpenStreetMap folks was last week and it was full of interesting talks. The diversity of people seemed pretty good relative to a lot of the meetups I end up at (open-source software, experimental music, computer science, you know, that kind of thing), but still, the OSM community needs to work towards being more representative of people in general.

In her keynote on diversity, Alyssa Wright gave a telling example, of how a proposal for a "childcare" tag had been voted down, primarily because the people who voted felt unconvinced that it wasn't already covered by the "kindergarten" tag. Alyssa contrasted this with the slightly bizarre plurality of tags for things that traditionally have male associations (e.g. pub, bar, nightclub, stripclub, brothel, each of which has its own amenity tag).

Now, this is a fairly anecdotal contrast, and Alyssa said so herself. (In other slides she showed some statistics which make the point more numerically.) But it illustrates some of the ways in which diversity issues come into play in open wiki-like projects. Maybe the existence of both "pub" and "bar" tags is a weird historical glitch which no-one particularly agrees with (I certainly don't see the point!). That doesn't detract from the fact that there's always going to be some sort of bias built in to OSM's norms, and people who absorb themselves into OSM will absorb and reproduce the norms, and this can be a self-reinforcing problem unless we pay attention to fixing it.

In this post I'm not going to summarise everything that everyone said about diversity. I'm just going to list some of the take-home messages that I got from this strand of talks:

  • "Diversity" relates to many things of course - gender, age, nationality, etc etc etc. Alyssa acknowledged this but said that fixing gender diversity in a community is the fastest and clearest route to fixing diversity in general in a community. This has a definite ring of truth to me. It'd help to focus efforts.

  • Yuwei Lin recommended that project-based mapping was a good idea - from her research it would be a mode of engagement that would work well for women. She suggested examples: the humanitarian OSM team projects, as well as mapping parties to do specific purposeful things such as zoo mapping, mapping of National Trust sites, etc - all sounds good to me.

  • "Measure excellence by teaching" (said Alyssa). This sounds like good advice, especially in the context of a kind-of-techy community like this one, where discussions about GIS systems or web servers can lead to a tendency to measure excellence by fairly techy measures. Teaching is flipping critical to a project like OpenStreetMap, whose success or failure must lie in how well its dedicated "in-group" helps people from outside to engage.

  • "Bikeshedding is normal" said Frederik Ramm, summarising one tendency in OpenStreetMap's mailing lists. I know bikeshedding is pretty much an inevitable fact of organised discussion, but I do fear that it can put off potential (or existing) community members, and I wonder how to arrange things so that unnecessary bikeshedding is truncated...

  • "Stop talking, start mediating" said Alyssa, in her closing recommendations. Sounds like general good advice. (Relates to bikeshedding? Maybe, dunno.)

  • Yuwei recommended diversity-friendly social events. For example the OSM London meetings are always brief mapping parties followed by pub drinking in the mid-to-late evening. Nothing wrong with it in itself, but it could easily be offputting for people who don't drink (e.g. for religious reasons), or have childcare commitments, etc - probably wise to vary the events a bit? A Saturday afternoon in a tea-room would be nice (I know a good one or two).

  • I did notice in one talk, there was a little bit of a tendency to equate female mappers with newbie mappers. Let's not make that mistake! I don't think anyone was stuck on that point, just thought I'd mention it since I noticed it.

  • Frederik talked about the different OSM mailing lists, and he mentioned all the different country-specific mailing lists, each of which uses its national language. He gave an interesting example in which three different communities each came upon a particular topic, but independently and at different times. This made me wonder if this setup, with a "cluster" of semi-independent communities rather than one big community lumped together on a single universal mailing list, was in fact a good way to promote diversity and reduce the impact of self-reinforcing social loops. I wonder, should we de-emphasise the idea of a "main" mailing list or IRC or whatever? A half-formed thought to finish the list with.

I didn't actually end up chatting to most of the people I've mentioned just above, so I haven't really talked any of this stuff through with them. Lucky that there are good people on the case already, so it seems. OpenStreetMap has a diversity-talk mailing list if you'd like to get involved.

Tuesday 10th September 2013 | openstreetmap | Permalink

OpenStreetMap: animated dataviz of edits per year

Another iteration of my visualisation of OpenStreetMap edits - here's an animation showing, for each year 2005-2012, the density of edits according to their geographic location:

Animation showing the density of OSM edits in 2005-2012, divided by population density.

The upper plot is the raw edit density. The lower one (which I think is more illuminating) is the edit density per unit population, as described in a previous post (with source code).

So what can you see? Well, both of them show the humble London-centred beginnings in 2005, followed by solid growth until the whole world is filled out. I think the lower plot more clearly shows when the "filling out" happens. 2007 is the year OpenStreetMap "goes global" but 2009 is the year it levels out. Before 2009, the edits-per-population are very variable, but from 2009 onwards the picture is much whiter and there's not much annual change in the colouring. This means the distribution of edits much more closely fits the population distribution, though (as noted last time) central Africa and around China are relatively underrepresented.

Thursday 17th January 2013 | openstreetmap | Permalink

OpenStreetMap: where should the next recruitment drive be?

I watched the fancy OpenStreetMap Year of Edits 2012 video, which shows a data-driven animation of all the map edits happening around the world from thousands of contributors. It certainly makes the project look busy!

BUT it's not the kind of data-viz that particularly wants you to understand the data. If you watch the video, can you tell which was the busiest part of the world? Which bit was least busy -- where should OSM's next recruitment drive be?

So here's what I wondered: can we visualise the density of map edits for a place, relative to the population of the place? You see, if we assume that the population density of one part of the world should be roughly proportional to the number of things-that-should-be-mapped in that part of the world, then a low value of this ratio (edit rate divided by population density) indicates a place that needs more mapping.

So how to do it? I downloaded the OSM changesets from http://planet.osm.org/ and piled up all the bounding boxes from the 2012 changesets, converting that into a grid giving the edit density. Then I was lucky enough to find this gridded world population density data download from Columbia University.

Then I wrote a Python script to divide one by the other and plot the result. Here it is:

[Plot: OSM edits in 2012 per head of population]

Blue areas have a relatively high number of edits per head of population, red have relatively low. White is average.

(BTW, here's the plot of the edit density, before taking the ratio.)

This is only a rough sketch, since it relies on some assumptions (e.g. every "changeset" was an equally important edit; also the map-features-per-population assumption I already mentioned). But the general story is: we need more mappers in South-East Asia (especially China) and Africa, please!
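For anyone who wants to try the same thing, the calculation can be sketched roughly like this. It is a minimal reconstruction from the description above, not the original script: the 1-degree grid, the function names and the `min_pop` cut-off are all assumptions, and it presumes the population data has already been resampled onto the same grid (row 0 at the south pole, column 0 at longitude -180).

```python
import numpy as np


def _cell_range(lo, hi, offset, cell_deg, n_cells):
    """Indices of the first and last grid cells touched by the span [lo, hi]."""
    first = int((lo + offset) // cell_deg)
    last = int((hi + offset) // cell_deg)
    return max(first, 0), min(last, n_cells - 1)


def edit_density_grid(bboxes, cell_deg=1.0):
    """Pile up changeset bounding boxes: every cell a box touches gets +1.

    bboxes: iterable of (min_lon, min_lat, max_lon, max_lat) in degrees.
    """
    n_rows = int(round(180 / cell_deg))
    n_cols = int(round(360 / cell_deg))
    grid = np.zeros((n_rows, n_cols))
    for min_lon, min_lat, max_lon, max_lat in bboxes:
        c0, c1 = _cell_range(min_lon, max_lon, 180, cell_deg, n_cols)
        r0, r1 = _cell_range(min_lat, max_lat, 90, cell_deg, n_rows)
        grid[r0:r1 + 1, c0:c1 + 1] += 1
    return grid


def edits_per_population(edit_grid, pop_grid, min_pop=1.0):
    """Divide edit density by population density, leaving NaN where
    almost nobody lives (oceans, deserts) so those cells are not plotted."""
    ratio = np.full(edit_grid.shape, np.nan)
    populated = pop_grid >= min_pop
    ratio[populated] = edit_grid[populated] / pop_grid[populated]
    return ratio


def log_relative_to_mean(ratio):
    """Log of each cell's ratio relative to the mean: positive would be
    drawn blue, negative red, and zero (exactly average) white."""
    mean = np.nanmean(ratio)
    with np.errstate(divide="ignore"):
        return np.log(ratio / mean)
```

Note that a very large bounding box adds one count to every cell it covers, which is part of why the "every changeset is equally important" assumption is a crude one.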

The plot clearly shows a general pattern connected with relative wealth / access to tech, so maybe initiatives like Operation Cowboy are the way to do it - get places mapped on behalf of others.

Tuesday 8th January 2013 | openstreetmap | Permalink

Data visualisation of pubs in UK & Eire

OK, if you want to know where in the country has good pubs, how do you do it? Well, here's what I do: download a data extract of all the pubs in the UK/Eire from OpenStreetMap, and use density estimation to look at the distribution of pub attributes such as whether it serves real ale, or food, or has wifi. That's the normal way, right?

I've put online my code for analysing pub data distributions in OpenStreetMap. Now let's look at the plots. First of all here's a simple plot of all the pubs in the land:

pubdensity

So far so good. Some obvious urban dense spots in there. But what about real ale, for example? Which parts of the country have a stronger representation of real ale pubs than average? We can show this by finding the density of pubs marked as real_ale around the country, then dividing it by the overall density of pubs. If this ratio is higher than average we'll mark it blue, if lower than average it's red, and if it's exactly average it's grey. Here are the results:

pubdensityratio_realale

(By the way, you can see that my country boundaries are really sloppy. I haven't bothered to tell the script exactly where the coastline is, so the edge just occurs where the estimated pub density falls below an arbitrary threshold.)
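The density-ratio idea can be sketched in a few lines. This is a simplified illustration rather than the code I put online: it assumes pub positions have already been projected to planar (x, y) coordinates, uses a plain Gaussian kernel with a hand-picked bandwidth, and takes "average" to mean the overall fraction of pubs carrying the tag. The function names and the `min_density` threshold are hypothetical.

```python
import numpy as np


def gaussian_density(points, grid_x, grid_y, bandwidth):
    """Kernel density estimate: sum a Gaussian bump centred on each point,
    evaluated at every (x, y) node of the grid."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros(gx.shape, dtype=float)
    for x, y in points:
        sq_dist = (gx - x) ** 2 + (gy - y) ** 2
        density += np.exp(-sq_dist / (2 * bandwidth ** 2))
    return density / (2 * np.pi * bandwidth ** 2)


def relative_density(tagged, all_pubs, grid_x, grid_y, bandwidth, min_density=1e-3):
    """Density of tagged pubs divided by density of all pubs, scaled so that
    1.0 means 'exactly average'. Above 1 would be drawn blue, below 1 red.

    Cells where the overall pub density falls below min_density are left as
    NaN, which is what produces the sloppy 'coastline'.
    """
    d_all = gaussian_density(all_pubs, grid_x, grid_y, bandwidth)
    d_tag = gaussian_density(tagged, grid_x, grid_y, bandwidth)
    ratio = np.full(d_all.shape, np.nan)
    enough_pubs = d_all > min_density
    ratio[enough_pubs] = d_tag[enough_pubs] / d_all[enough_pubs]
    overall_fraction = len(tagged) / len(all_pubs)
    return ratio / overall_fraction
```

The bandwidth controls how smooth the map looks: too small and every pub is its own blob, too large and the regional differences wash out.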

There are some quite clear tendencies in there. If you want real ale, go to the Manchester/Derbyshire/Yorkshire area, or Norfolk. (And maybe Cardiff's not too bad.)

I should say right away that the source data might be skewed. It comes from OpenStreetMap where the data is contributed by volunteers, and the volunteers decide what information they want to add. It's entirely possible that in some parts of the country, there are dedicated bands of people who like to contribute real-ale info, whereas in other parts of the country they're more interested in logging the number of toilets. For proof of this, see the crazy distribution I get for relative availability of toilets, clearly influenced by some sort of London-region push to log that data:

pubdensityratio_toilets

So in all of this analysis, the patterns are probably a combination of the true picture and contributor bias. The picture for real ale does reflect some folk wisdom so I think it's relatively well-mapped (though only 3% of pubs overall have the real_ale tag, which is not many).

Looking back at that real-ale plot - what about London? It has a massive density of pubs, as the top plot shows, but according to this view it's moderate-to-poor for real ale. Well, it's probably the high density of pubs that's working against it here. If you look at the raw density of real-ale pubs, then London is actually the third-biggest peak. So this means that in London, we can make two statements which feel contradictory but aren't: (1) in London your nearest real-ale pub is generally closer than if you were elsewhere; but (2) if you walk into the nearest pub you're not so likely to find real ale. (I don't honestly know if these are true, so I'll remind you of the caveat about trusting the data.)

OK, what about other features, such as food or wifi? The plots tell a different story than for the real ale:

pubdensityratio_food

pubdensityratio_wifi

For both of these, the tendency is for the blue (positive) regions to be outside the urban centres, spread around the countryside. For food, this makes a kind of sense: country pubs often have food, while in the cities you can go elsewhere for food. But for wifi? I would have thought it was more common in cities. Again it's possible that urban OSM mappers haven't logged this data; it would be good to re-run these statistics when the data is more complete.

(Also note that the wifi densities tend to be quite low in exactly the regions where the real-ale density was notably high. A curious anticorrelation - is this something about regional differences in the general nature of pubs...?)

One kind of data that is nice and complete is pub names. So, for example, here's the relative density of pubs whose names begin with "The " or "Ye " (or the Welsh "Y " or "Yr "):

pubdensityratio_the

I didn't expect a very strong trend here, and indeed that's what you get: the washed-out colours show the differences are not massive. But it's a bit of an Anglo tendency to prefix the pub name with a definite article. So, then, what about the hyperquaint subset of names beginning with "Ye olde" (or "Ye old")? Here's the plot:

pubdensityratio_yeold

You can see from the stronger colours that the tendency is more pronounced - it seems "Ye olde" pubs tend to be found not so much in Scotland/Eire, but in rural England/Wales.

Monday 31st December 2012 | openstreetmap | Permalink

Pub statistics for UK and Eire

I just extracted all the pubs in UK & Eire from OpenStreetMap. (Tech tip: XAPI URL builder makes it easy.)

There are 32,822 pubs listed. (As someone pointed out, that's 38.4% of all the pubs in OSM. So the UK is doing well - but come on, rest of the world, get mapping yr pubs ;)

A handful of quick statistics from the data I extracted:

  • The real_ale tag indicates 1080 real ale pubs (the tag is blank for 31678 of them, "no" for 64 of them). That's 3%, probably much less than the true number.
  • The toilets tag indicates 1211 have toilets available - that's under 4%, whereas I bet most of them do really!
  • The food tag shows food available at 1119 of them (31686 blank, 17 "no"). Again about 3%, gotta be more than this.
  • The wifi tag shows wifi available at 274 of them (32450 blank, 98 "no"). I've no idea how common wifi is in pubs these days.
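As a quick sanity check on those percentages, here is a small sketch that recomputes them from the yes / blank / no counts quoted above. The toilets tag is left out because only its "yes" count is given; the function name is just for illustration.

```python
def tag_summary(yes, blank, no):
    """Return (total pubs, percentage tagged 'yes' to one decimal place)."""
    total = yes + blank + no
    return total, round(100 * yes / total, 1)


# (yes, blank, no) counts as quoted in the list above
counts = {
    "real_ale": (1080, 31678, 64),
    "food": (1119, 31686, 17),
    "wifi": (274, 32450, 98),
}

for tag, (yes, blank, no) in counts.items():
    total, pct = tag_summary(yes, blank, no)
    print(f"{tag}: {yes} of {total} pubs ({pct}%)")
```

Each row adds up to the same total of 32,822 pubs, which is a reassuring check that nothing got lost in the extract.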

Thursday 20th December 2012 | openstreetmap | Permalink

Creative Commons License
Dan's blog articles may be re-used under the Creative Commons Attribution-Noncommercial-Share Alike 2.5 License. Click the link to see what that means...