Monthly Archives: April 2016

Analyses can only be as good as the measures which underlie them

Roxanne Connelly, University of Warwick

Quantitative sociological research hinges on the collection of data in the form of measured variables, and on its summary through statistical analysis of the ‘relationships between variables’ (e.g. Marsh, 1982). In recent decades, methodological innovations and analysis options in quantitative research have developed rapidly, alongside increasing computer power and software capabilities for the sophisticated analysis of the large volumes of micro-data now at our disposal. These methodological advances in social survey data analysis are well documented, and social researchers are increasingly able to deploy relatively complex and specialised statistical modelling techniques. Yet the results of analyses can only be as good as the measures which underlie them. Whilst most researchers have a good justification for the way in which the variables most central to their analysis are operationalised, there are certain ‘key variables’ for which measurement and operationalisation are sometimes only briefly considered (and often inappropriately simplified). These are measures that are routinely recorded in quantitative research and feature in a great many analyses, whether as explanatory or outcome variables. Indeed, from the 1950s to the present day, social survey methodologists have repeatedly sounded the same warning: that the construction and careful analysis of such ‘key variables’ have habitually been overlooked in the literature and in practice (Blumer, 1956; Bulmer et al., 2012; Burgess, 1986; Stacey, 1969).

I recently published a series of papers on this issue with Professor Vernon Gayle and Professor Paul Lambert. The papers provide an overview of the measurement options available for the analysis of three ‘key variables’, namely measures based upon occupation, education and ethnicity. There are, of course, many more variables (e.g. gender, age, health, wellbeing, religiosity) which could be considered in detail. The three variables chosen as the focus of these papers are used very widely in quantitative research, either as explanatory or dependent variables, and they are also variables for which a range of measurement options is available. Furthermore, there is a degree of debate over how these three variables should be operationalised, and the complexities of their use are often overlooked in practice. These papers build on the reviews of Stacey (1969) and Burgess (1986) and the more recent contribution of Bulmer et al. (2010), and discuss contemporary approaches and issues in the construction and modelling of these measures.

One of the issues we sought to emphasise is that the manner in which a variable is constructed depends on the decisions of the analyst, and in turn influences the form and outcomes of statistical models. The best research publications ought to show evidence that alternative measures were evaluated and that the route taken was carefully documented, and this documentation can easily be made available to the reader through electronic sources (as argued by Dale, 2006). This is especially important in areas of the social sciences where there are many, often disputed, measurement alternatives, leading to complex possibilities for the construction of variables. However, this is rarely done in practice: the measures used in quantitative sociological research are neglected in discussions and can be poorly described.
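To make the point about analyst decisions concrete, the following is a minimal sketch in Python of how two defensible ways of collapsing the same occupational codes can change a simple group comparison. Everything in it is an assumption made for illustration: the six occupation codes, the outcome scores, the two collapsing schemes (`SCHEME_A`, `SCHEME_B`) and the helper `group_gap` are invented here and are not taken from the papers or from any real survey.

```python
from statistics import mean

# Toy records invented for illustration: (occupation code, outcome score).
records = [(1, 62), (2, 58), (3, 51), (4, 47), (5, 40), (6, 38)]

# Two hypothetical ways of collapsing the same six occupation codes
# into a binary measure. Only the treatment of code 3 differs.
SCHEME_A = {1: "advantaged", 2: "advantaged", 3: "advantaged",
            4: "other", 5: "other", 6: "other"}
SCHEME_B = {1: "advantaged", 2: "advantaged",
            3: "other", 4: "other", 5: "other", 6: "other"}


def group_gap(records, scheme):
    """Difference in mean outcome between the two groups under one scheme."""
    groups = {}
    for code, score in records:
        groups.setdefault(scheme[code], []).append(score)
    return mean(groups["advantaged"]) - mean(groups["other"])


# Report the same comparison under each operationalisation side by side.
gap_a = group_gap(records, SCHEME_A)
gap_b = group_gap(records, SCHEME_B)
print(f"Scheme A gap: {gap_a:.2f}")
print(f"Scheme B gap: {gap_b:.2f}")
```

The data and the comparison are identical in both runs; only the boundary of one category moves, and the estimated gap moves with it. Reporting results under more than one plausible scheme is one simple way of showing that alternatives were evaluated.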

It is widely noted that data preparation and variable construction is the most time-consuming stage of the research process. Methodologists generally recommend that researchers take their time to construct measures from a dataset in a clear, assiduous manner, with every operation carefully documented in well-annotated software command files (e.g. Long, 2009). If this is achieved, the variable construction process leaves a clear trace that can be readily replicated in the future, and the statistical analysis stage of the research can then usually progress relatively swiftly. A common complaint about social science research projects, however, is that the activities of variable construction are often neither well documented nor replicable by others (e.g. Treiman, 2009). This typically arises for two reasons. The first is the sub-optimal exploitation of software (for instance, researchers not using command files at all, or using them in a poorly organised sequence). This poor practice arguably reflects long-term shortcomings in the training and information organisation skills of survey researchers (e.g. Long, 2009). The second is researchers’ lack of awareness of (or, at a minimum, lack of inclination to seek out) existing approaches to variable construction, which they could engage with and ideally re-use. Researchers frequently invent new variable constructions ‘on the fly’ during the research process, in a manner which makes documentation and replication very difficult (see Lambert et al., 2007).
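As an illustration of what a well-annotated command file might look like, here is a minimal sketch written in Python rather than in any particular statistical package. The variable names (`educ_raw`, `educ3`), the source codes and the three-category scheme are hypothetical; a real derivation would follow the survey's own codebook and, ideally, an existing published measure.

```python
"""Documented derivation of a three-category education measure (illustrative only)."""

# Hypothetical source codes and labels; a real survey's codebook would define these.
EDUCATION_RECODE = {
    1: "low",     # no qualifications
    2: "low",     # lower secondary
    3: "medium",  # upper secondary
    4: "high",    # degree or above
}
MISSING = None  # codes not in the mapping are treated as missing, never guessed


def derive_education3(records):
    """Return (derived, audit): copies of the records with 'educ3' added,
    plus a count of how many records fell into each derived category."""
    derived = []
    audit = {}
    for rec in records:
        category = EDUCATION_RECODE.get(rec.get("educ_raw"), MISSING)
        derived.append({**rec, "educ3": category})  # original record is left untouched
        audit[category] = audit.get(category, 0) + 1
    return derived, audit


sample = [
    {"id": 1, "educ_raw": 1},
    {"id": 2, "educ_raw": 4},
    {"id": 3, "educ_raw": 9},  # undocumented code, becomes missing
    {"id": 4, "educ_raw": 3},
]
derived, audit = derive_education3(sample)
for row in derived:
    print(row)
print(audit)
```

Keeping the recode as an explicit, commented mapping, leaving the raw variable unchanged, and printing an audit count are small habits that make the construction traceable and easy for others to re-run.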

There ought to be good news with regard to variable construction in quantitative research, insofar as many social scientists have already put a great deal of effort into producing carefully constructed and tested measures. In most situations there is a range of suitable pre-existing variable constructions to choose from, and this is particularly true of ‘key measures’ in the social sciences. Throughout this series of papers we argue that clear documentation plays a central role in high-quality social research and provides a solid basis for replication and the incremental development of our knowledge base.

References

Blumer, H., 1956. Sociological Analysis and the “Variable”. American Sociological Review 21, 683-690.

Bulmer, M., Gibbs, J., Hyman, L., 2010. Social Measurement through Social Surveys: An Applied Approach. Ashgate, Farnham.

Bulmer, M., Gibbs, M.J., Hyman, L., 2012. Social Measurement through Social Surveys: An Applied Approach. Ashgate Publishing, Ltd., Farnham.

Burgess, R., 1986. Key Variables in Social Investigation. Routledge and Kegan Paul, London.

Dale, A., 2006. Quality Issues with Survey Research. International Journal of Social Research Methodology 9, 143-158.

Lambert, P., Gayle, V., Tan, L., Turner, K., Sinnott, R., Prandy, K., 2007. Data Curation Standards and Social Science Occupational Information Resources. The International Journal of Digital Curation 2, 73-91.

Long, J.S., 2009. The Workflow of Data Analysis Using Stata. Stata Press, College Station.

Marsh, C., 1982. The Survey Method: The Contribution of Surveys to Sociological Explanation. Allen and Unwin, London.

Stacey, M., 1969. Comparability in Social Research. Heinemann, London.

Treiman, D.J., 2009. Quantitative Data Analysis: Doing Social Research to Test Ideas. John Wiley & Sons, San Francisco.