All maps are inaccurate but some have very useful applications: Thoughts on Complex Social Surveys

Vernon Gayle, University of Edinburgh

This blog post provides some thoughts on analysing data from complex social surveys, but I will begin with an extended analogy about maps.

All maps are inaccurate. Orienteering is a sport that requires navigational skills to move (usually running) from point to point in diverse and often unfamiliar terrain. It would be ridiculous to attempt to compete in an orienteering event using a road map drawn on a scale of 1:250,000, because 1 cm on that map represents 2.5 kilometres. Similarly, it would be inappropriate to drive from Edinburgh to London using orienteering maps, which are commonly drawn on a scale of 1:15,000. On an orienteering map 1 cm represents 150 metres of land.
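For readers who like to check the arithmetic, the scale conversions above can be sketched in a few lines of Python. This is only an illustration: the function name `ground_distance_m` is my own invention, and the list of scales simply repeats the map scales mentioned in this post.

```python
def ground_distance_m(map_cm: float, scale_denominator: int) -> float:
    """Return the ground distance in metres for a length measured on the map in cm.

    A scale of 1:scale_denominator means 1 cm on the map corresponds to
    scale_denominator cm on the ground; dividing by 100 converts cm to metres.
    """
    return map_cm * scale_denominator / 100


# Scales mentioned in the text (the labels are informal descriptions).
scales = [
    ("road map", 250_000),
    ("OS Landranger", 50_000),
    ("OS 1:25,000", 25_000),
    ("orienteering map", 15_000),
]

for label, denominator in scales:
    metres = ground_distance_m(1, denominator)
    print(f"1 cm on a 1:{denominator:,} {label} covers {metres:,.0f} m")
```

The same function works for any measured length, so 4 cm on a 1:25,000 map corresponds to `ground_distance_m(4, 25_000)` metres on the ground.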

Hillwalking is a popular pastime in Scotland. Despite having similar aims, many hillwalkers use the standard Ordnance Survey (OS) 1:50,000 map (the Landranger Series) but others prefer the 1:25,000 OS map. These maps are not completely accurate but they have useful applications for the hillwalker. For some hillwalking excursions the extra detail offered by the 1:25,000 map is useful. For other journeys the extra detail is superfluous and having coverage of a larger geographical area is more useful. When possible I prefer to use the Harvey’s 1:25,000 Superwalker maps. This is because they are printed on waterproof paper and they tend to cover whole geographic areas, so walks are usually contained on a single map. I also find the colour scheme helpful in distinguishing features (especially forests and farmland), and the enlargements (for example the 1:12,500 chart of the Aonach Eagach Ridge on the reverse of the Glen Coe map) aid navigation in difficult terrain.

The London Underground (or Tube) map is probably one of the best known schematic maps. It was designed by Harry Beck in 1931. Beck realised that because the network ran underground, the physical locations of the stations were largely irrelevant to a passenger who simply wanted to know how to get from one station to another. Therefore only the topology of the train route mattered. It would be unusual to use the Tube map as a general navigational aid but it has useful applications for travel on the London Underground.

The Tube map has undergone various evolutions. However, the 1931 edition would still be an adequate guide for a journey on the Piccadilly Line from Turnpike Lane to Earl's Court. By contrast, a journey from Turnpike Lane station to Southwark station using the 1931 map will prove confusing, since the map does not include the Jubilee Line and Southwark station was not opened until the 1990s. A traveller using the 1931 map will also be unaware that Strand station on the Northern Line was closed in the early 1970s.

Contemporary versions of the Tube map include the fare zones, which is a useful addition for journey planning. More recent editions include the Docklands Light Railway and Overground trains, which extend the applications of the Tube map for journeys in the capital.

Here are two further thoughts on the accuracy of the Tube map and its applications. First, when I was a schoolboy growing up in London I was amused that what appeared to me to be the shortest journey on the Tube map from Euston Square station to Warren Street station involved three stops and one change. I knew that in reality the stations were less than 400 metres apart (my father was a London taxi driver). Walking rather than taking the Tube would save both time and money.

Second, more recently I have become aware of the journey from Finchley Road tube station to Hampstead tube station, which involves travelling on the Jubilee Line and making changes onto the Victoria Line and then the Northern Line. The estimated journey time on the Transport for London website is about 30 minutes. Consulting a London street map reveals that the stations are less than a mile apart. A moderately fit traveller could easily walk that distance in less than half an hour. The street map (like the Tube map) is unlikely to warn the traveller that the journey is uphill, however. Finchley Road underground station is 217 feet above sea level and Hampstead station is 346 feet above sea level (see here).

This preamble hopefully reinforces my opening point that all maps are inaccurate, but sometimes they have very useful applications. Some readers will know the statement made by the statistician George Box that all models are wrong but some are useful. This statement is especially helpful in reminding us that models are representations of the social world and not accurate depictions of the social world. Similarly a map is not the territory. When thinking about samples of social science data I find the analogy with maps useful as a heuristic device.

All samples of social science data are inaccurate, especially those that are either small or have been selected unsystematically. Some samples are both small and unsystematically selected. Small samples and unsystematic samples may prove useful in some circumstances but their design places limitations on how accurately the data represent the population being studied. Large-scale samples that are selected systematically will tend to be more accurate and better represent target populations. The usefulness of any sample of social science data, much like a map, will depend on its use (e.g. the research question that is being addressed).

Some large-scale social surveys use simple statistical techniques to select participants. The data within these surveys can be analysed relatively straightforwardly. Many more contemporary large-scale social surveys have complex designs and use more sophisticated statistical techniques to select participants. The motivation is usually to better represent the target population, to minimise the costs of data collection, and to allow meaningful analyses of subpopulations (or smaller groups). These are positive features but they come at the cost of making the data from complex surveys more difficult to analyse.

It is possible to approach the analysis of data from complex social surveys naively and treat them as if they were produced by a simple design and selection strategy. For some analyses this will be an adequate approach. This is analogous to using a suboptimal map but still being able to arrive close enough to your desired destination.

For other studies a naïve approach to analysis will be inappropriate. Comparing naïve results with results from more sophisticated analysis can help us to assess the appropriateness of naïve approaches. The difficulty is that reliable statements cannot easily be made a priori on the appropriateness of naïve approaches. To draw further on the map analogy, when using an inadequate map it is difficult to assess how close you get to the correct destination unless you have previously visited that location.
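To make the idea of comparing naïve and more sophisticated results a little more concrete, here is a minimal sketch in Python. It is not drawn from any real survey: the data are simulated, the function names (`naive_se`, `cluster_se`, `simulate_clusters`) are hypothetical, and the design is deliberately simple (equal-sized clusters selected with equal probability, no weights and no stratification). Under those assumptions it compares the standard error of a mean calculated as if the data came from a simple random sample with one based on the variation between cluster means.

```python
import random
import statistics


def naive_se(values):
    """Standard error of the mean, treating the data as a simple random sample."""
    return statistics.stdev(values) / len(values) ** 0.5


def cluster_se(clusters):
    """Standard error of the mean based on variation between cluster means.

    Assumes equal-sized clusters selected with equal probability, so the
    overall mean is simply the mean of the cluster means.
    """
    means = [statistics.fmean(cluster) for cluster in clusters]
    return statistics.stdev(means) / len(means) ** 0.5


def simulate_clusters(n_clusters=30, cluster_size=20, seed=1):
    """Simulate clustered data: people in the same cluster share a common effect."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0, 1)  # shared by everyone in this cluster
        clusters.append(
            [cluster_effect + rng.gauss(0, 1) for _ in range(cluster_size)]
        )
    return clusters


clusters = simulate_clusters()
pooled = [value for cluster in clusters for value in cluster]

se_naive = naive_se(pooled)
se_design = cluster_se(clusters)
design_effect = (se_design / se_naive) ** 2  # ratio of the two variances

print(f"naive SE:      {se_naive:.3f}")
print(f"clustered SE:  {se_design:.3f}")
print(f"design effect: {design_effect:.1f}")
```

In this simulation the clustering is strong, so the naïve standard error understates the uncertainty considerably. With real survey data the gap may be large or negligible, which is why it is worth making the comparison rather than assuming the answer in advance. A real analysis would normally rely on dedicated survey software that also handles weights and strata.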

The benefit of social surveys with complex designs is that they have complex designs. The drawback of social surveys with complex designs is that they have complex designs. All maps are inaccurate but some have very useful applications. All samples of social science data are inaccurate but some have very useful applications. The consideration of the usefulness of a set of social science data requires serious methodological thought and this will most probably be best supported by exploratory investigations and sensitivity analyses.

To learn more about analysing data from both non-complex and complex social surveys come to grad school at the University of Edinburgh (http://www.sps.ed.ac.uk/gradschool).
