A Brief Guide to Quantitative Data Collection

A Brief Guide to Quantitative Data Collection at GaWC, 1997-2001

P.J. Taylor

The purpose of this short introductory note on GaWC’s data collection is to provide users with an outline of the basic arguments behind GaWC’s data mission and to describe how we got to the position of measuring the world city network in 2001. At each step:

On reading this Guide, the relations between the different data collection exercises, and the subsequent analyses, should become clear.


1. Three Classic Evidential Blemishes

The best way to begin to show the need for inter-city data is through the problems experienced by authors using the world city literature in general texts. I highlight one small part of three very important books that discuss cities and globalization as part of a broader argument. Each selection consists of a quite surprising evidential blemish. These otherwise astute commentators on things global are each ‘let down’ when they come to use the world city literature to illustrate a geography of globalization.

  • Manuel Castells’ (1996 and 2000 (2nd ed)) Network Society is possibly the most influential text for providing a spatial framework for world city studies. Following Sassen’s (1991 and 2001 (2nd ed)) characterization of global cities as advanced financial and business service centers, Castells (1996, 415) describes world cities as the ‘most direct illustration’ of worldwide nodes and hubs in his space of flows. Although it was not part of Castells’ (1996) brief to engage in new data generation, his prime use of data to show his space of flows is incredibly broad-grained. The evidence he provides on these nodes and hubs is a set of worldwide information from Federal Express, originally analyzed by Michelson & Wheeler (1994, 382-3), that consists of one origin (USA) and just nine destinations, only three of which are actually cities. While we might go along with Castells’ conceptualization of a global space of flows, we can but note that the evidence he marshals is mightily unimpressive.
  • Peter Dicken’s (1998 (3rd ed)) Global Shift is the most influential geographical text on contemporary globalization. The value of this key textbook lies to a large extent in its bringing together reams of evidence to describe contemporary transformations in the world economy. But not so with world cities: here he provides minimal evidence. This consists of a diagram that is intended ‘to give an impression of a connected network of cities’ (p. 209) because, we are told, ‘the links shown are diagrammatic only’. He bases his diagram on John Friedmann’s (1986) sketch of a ‘world city hierarchy’ and produces some very odd linkages. For instance, the route from Dusseldorf to London goes first to Brussels and then on to Paris before finally reaching its destination. Why there should be such a three-step connection in this electronic communication age is not explained. But remember, these are only ‘diagrammatic links’: again mightily unimpressive fare, even for just giving an ‘impression’ of a world city network.
  • The Open University texts in the Understanding Cities series are the best textbooks available for urban studies, and the key book is Unsettling Cities (Allen, Massey and Pryke, 1999). But systematically marshaling empirical evidence is not one of their fortes. The book begins with a chapter on ‘cities of connection and disconnection’ (Amin and Graham, 1999) that encourages us to think ‘relationally’ about cities but provides evidence only in the form of a ‘virtual single office’ linking together just six cities (p. 11). A later chapter by Allen (1999) provides the most thorough recent discussion of power among world cities, but backs up its argument with figures (Figures 5.3 and 5.5) that are again drawn from Friedmann (1986). The bizarre outcome is that this discussion of power excludes some of the most rapidly globalizing cities of the 1990s – Moscow, Beijing, Shanghai – simply because it uses a ‘world city hierarchy’ (purportedly) describing the situation before the end of the Cold War. Once again, with respect to evidence, this book is mightily unimpressive.

Conclusion: even in the best books, there seems to be an inherent problem with saying something soundly empirical about inter-city relations at the global scale (Taylor, 1999) (see RB1, RB10).

2. The Culprit: State-istics (see PR1; RB1, RB2, RB10, RB15, RB29, RB33)

The common term for social data is ‘statistics’, a term that derives directly from the word ‘state’. This is, of course, no accident: large-scale data collection on human activities has its origins in state needs and continues to be dominated by states, hence my portrayal of it as state-istics.

Unlike the natural sciences, social science has little or no ‘big science’ where very large sums of money are committed to solving theoretical problems. Such funding enables natural scientists to concentrate on developing measurements specifically designed for their theoretical purposes. In social science, most data that is collected relates to small-scale cumulative scientific activity. To get an evidential handle on big issues, researchers normally rely on the statistics that are available, that is to say, already collected. Collection is usually carried out by a state agency for the particular needs of government policy, not, of course, for social science research. But the problem is much more than the possibility of having to use unsuitable data. Basing ‘big social science’ on state-istics means that the state defines the basic dimensions of leading edge ‘macro’ social research and therefore the framework within which most social research is conducted. This embedded statism within most very large-scale social data sets is a major reason why the information we want for describing inter-city relations is not available. Three characteristics of urban studies stem from embedded statism.

  • First, there is the dominance of attribute measures over relational measures in social research. Measurement can take one of two forms: attribute measures on objects or relational measures between objects. The needs of social science and the state diverge at this very starting point. All theory about human social activities is basically about relations between individuals, groups and other human collectivities. Therefore the data need is for relational measures of flows, connections, linkages and other less tangible relations. The prime concern of the state for data has always been accounting, taking stock, finding out numbers of phenomena within its territory or parts thereof. Thus the vast majority of statistics are lists of attributes by place, as any quick browse through a census volume will confirm. Cities are the most important places in which census counts are made, aggregated, and reported. In this process cities are effectively de-networked: they are state-istically treated as bounded sub-territories when the essence of all cities is their unbounded connections to other cities.
  • Second, there is no transnational scale in state-istics. There are international statistics, compiled by the UN from national censuses, which therefore project the attribute proclivity to the larger scale. These data are largely used for comparisons between countries, but there are some important relational data sets at this scale because states have an interest in traverses across their boundaries. However, such trade and migration statistics totally neglect cities. Thus, we can find information on certain relations between, say, France and the UK, but there is nothing on relations between London and Paris, or between Manchester and Lyon. This information default is made clear by thinking of ‘Main Street, World-Economy’, the multifarious connections linking London and New York. This massive concentration of flows of information across the North Atlantic is simply not picked up in any ‘official statistics’. There is no state that needs such information and therefore there is no publicly available information about the most important inter-city relation – the most significant geographical connection – in the world today.
  • Third, there is a great temptation to interpret rankings as hierarchies. Since data can be compiled from official statistics on cities to provide quantities of attributes – population totals, employment sector totals, headquarter totals, etc. – cities can be ordered by size in various ways that may look like an urban hierarchy. Of course, it is no such thing: hierarchies can only be defined as relations between objects; a mere ranking of cities says nothing about relations between cities. But since this is the only type of evidence available, such rankings have bolstered the original Friedmann hypothesis that world cities form a hierarchy. In a sense it is the easy way out of the data problem: attribute evidence is combined with a simple transfer of the traditional ‘national city hierarchies’ model up a scale to create a world city hierarchy. The widespread acceptance of such a patterning of world cities is a classic example of how misguided thinking can fill a huge evidential gap. Obviously rankings by attribute, with or without a hierarchical model, are no solution to the lack of information on inter-city relations.

In conclusion: for the large-scale study of the inter-relations between world cities there is no alternative but to generate your own data.


The Globalization and World Cities (GaWC) Study Group and Network was set up initially to contribute to solving the world city data problem. There are two strands to this work: qualitative studies that focus on a small number of cities to assess their relations (i.e. not to compare attributes of cities), and quantitative studies that attempt to measure the whole network (i.e. a global urban analysis). Here I focus on the latter. Normally this is presented as a logical progression from model to measurement, but in practice it was an iterative, trial and error process starting from quite modest beginnings. There are five main stages to reaching the point where we can measure the world city network.

  • Initial fumblings involved experiments with several methodologies detailed in Beaverstock et al. (2000). At first the most promising methodology for producing large-scale quantitative relational data for cities involved place content analysis of the financial pages of city newspapers (Taylor, 1997). Although it had the advantage of sources that could produce time series data, the method in fact produced rather general data that was too event orientated. Understanding world city network formation required measures that covered the everyday reproduction of the network (see PR3; RB1, RB2).
  • The London project had the task of simply finding out the business connections between London and other world cities. Using Sassen’s identification of advanced producer services as critical to identifying the ‘global city’, we selected London-located service firms and used their web sites to record other cities in which they had offices. An initial problem was simply to decide with which cities we would trace London’s connections. Since there were no rosters of world cities in the literature we had to produce our own. This was done using attribute measures of numbers, sizes and functions of offices within cities. A roster of 55 cities was created (Beaverstock et al., 1999) and the distribution of 72 London-located service firms across these cities was recorded. From this data we were able to compute measures of service office connections between London and the other 54 world cities. The three cities with the highest level of service connection to London were New York, Paris and Hong Kong neatly showing the city’s global range of connectivity across the three main globalization arenas (northern America, western Europe and Pacific Asia). (see PR2; DS4, DS5; RB3, RB4, RB5, RB6, RB7, RB9, RB11)
  • The experimental data set was constructed from the London project data. It was readily apparent that any business service firm with global pretensions had an office in London and therefore our ‘London-located firms’ were actually a reasonable sample of global service firms. We decided, therefore, to create a global service matrix involving the 55 world cities and the 46 London-located service firms with the widest spread of offices. Within the matrix we indicated the importance of an office on a scale from 0 (no presence) to 3 (headquarters). Thus we had a 55 (cities) x 46 (firms) matrix of office values. Each column of the matrix showed a firm’s distribution of offices across cities; each row showed a city’s service mix across firms. This enabled us to conduct the first multivariate analyses of world cities to show patterns of firms with similar global location strategies and cities with similar mixes of service firms (Taylor and Walker, 2001). (see PR7, PR8; DS6, DS7, DS8; RB13, RB16, RB17, RB21, RB25, RB30, RB31, RB34, RB35)
  • The interlocking network model was devised to make sense of the experimental data and analysis. What were we actually measuring and how did it hold together? It seemed reasonable to assume that a pair of cities sharing a lot of the same service firms had more flows than a pair of cities sharing few firms, but how could this be modeled? The key point in this assumption is that it is the firms that are creating the flows and therefore it is they who define the world city network. This requires a three-level network model: as well as a nodal level (city) and a network level, there needs to be a sub-nodal level for the firms as agents. This is precisely what the interlocking network model provides, with firms in the role of ‘interlockers’ between cities to produce the connectivities that define the network. Specification of this model was the breakthrough in this research (see Taylor, 2001). There were many references to inter-city model forms in the literature – urban systems, city networks, urban hierarchies – but there was no precise specification upon which measurements could be devised. In contrast, the interlocking model was specified as the service values matrix (firms x cities) from which formulae for connectivity measures were derived. (see RB23)
  • Large-scale customized data collection was carried out as required by the interlocking network model. In fact the new matrix was an improved version of the initial experimental data: much more comprehensive, covering 100 firms and 315 cities. Size is important because all connectivity measures are derived as aggregates across firms to iron out individual firm idiosyncrasies. But the key point is that now we have a conceptual basis upon which to probe inter-city relations quantitatively. (see PR10, PR19; DS11; RB43, RB48, RB50, RB55, RB56, RB58, RB61, RB77, RB88, RB89, RB97)

In conclusion: these data (Taylor et al. 2002a) form the basis of all our current work on the world city network, which constitutes a first comprehensive, empirical global urban analysis (Taylor et al. 2002b). The data will be posted when analyses have been completed at GaWC; enquiries for pre-posting access should be made to crogfam@yahoo.com.
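
As an illustration of how a service values matrix can yield connectivity measures, here is a minimal sketch in Python. This Guide does not reproduce the formulae, so the sketch assumes one simple form that follows the model's logic: the interlock between two cities through a firm is taken as the product of that firm's service values in the two cities, summed across firms, and a city's overall connectivity is the sum of its interlocks with all other cities. The firm names, the service values and the function names are all invented for illustration and are not GaWC data.

```python
# Hypothetical service values matrix: rows are cities, columns are firms.
# Values use the 0 (no presence) to 3 (headquarters) scale described above.
# All numbers are invented for illustration; they are NOT GaWC data.
CITIES = ["London", "New York", "Paris", "Hong Kong"]
FIRMS = ["Firm A", "Firm B", "Firm C"]
SERVICE_VALUES = [
    [3, 2, 2],  # London
    [2, 3, 1],  # New York
    [1, 2, 0],  # Paris
    [2, 0, 1],  # Hong Kong
]


def interlock(values, a, b):
    """Assumed interlock between cities a and b: for each firm, multiply its
    service values in the two cities, then sum the products across firms."""
    return sum(va * vb for va, vb in zip(values[a], values[b]))


def connectivity_matrix(values):
    """City-by-city matrix of interlocks; the diagonal is set to zero
    because a city's link to itself is not an inter-city relation."""
    n = len(values)
    return [
        [interlock(values, a, b) if a != b else 0 for b in range(n)]
        for a in range(n)
    ]


def network_connectivity(values):
    """Total connectivity of each city: the sum of its interlocks
    with every other city in the matrix."""
    return [sum(row) for row in connectivity_matrix(values)]


if __name__ == "__main__":
    # London and New York share all three hypothetical firms:
    # 3*2 + 2*3 + 2*1 = 14
    print("London-New York interlock:", interlock(SERVICE_VALUES, 0, 1))
    for city, total in zip(CITIES, network_connectivity(SERVICE_VALUES)):
        print(city, total)
```

Because every measure is an aggregate across firms, a single firm's unusual office pattern has less influence as more firms are added, which is the reason given above for why the size of the matrix matters.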


ALLEN, J. (1999) ‘Cities of power and influence’, in Allen et al. (eds) Unsettling Cities. London: Routledge

ALLEN, J., MASSEY, D. AND PRYKE, M. (eds) (1999) Unsettling Cities. London: Routledge

AMIN, A. AND GRAHAM, S. (1999) ‘Cities of connection and disconnection’, in Allen et al. (eds) Unsettling Cities. London: Routledge

BEAVERSTOCK, J.V., SMITH, R.G. AND TAYLOR, P.J. (1999) ‘A roster of world cities’, Cities, 16, 445-58

BEAVERSTOCK, J.V., SMITH, R.G., TAYLOR, P.J., WALKER, D.R.F. AND LORIMER, H. (2000) ‘Globalization and world cities: some measurement methodologies’, Applied Geography, 20, 43-63

CASTELLS, M. (1996, 2000) The Rise of the Network Society. Oxford: Blackwell

DICKEN, P. (1998) Global Shift. London: Paul Chapman

FRIEDMANN, J. (1986) ‘The world city hypothesis’, Development and Change, 17, 69-83

MICHELSON, R.L. AND WHEELER, J.O. (1994) ‘The flow of information in a global economy: the role of the American urban system in 1990’, Annals of the Association of American Geographers, 84, 87-107

SASSEN, S. (1991, 2001) The Global City. Princeton, NJ: Princeton University Press

TAYLOR, P.J. (1997) ‘Hierarchical tendencies amongst world cities: a global research proposal’, Cities, 14, 323-32

TAYLOR, P.J. (1999) ‘"So-called world cities": the evidential structure within a literature’, Environment and Planning A, 31, 1901-04

TAYLOR, P.J. (2001) ‘Specification of the world city network’, Geographical Analysis, 33, 181-94

TAYLOR, P.J. AND WALKER, D.R.F. (2001) ‘World cities: a first multivariate analysis of their service complexes’, Urban Studies, 38, 23-47

TAYLOR, P.J., CATALANO, G. AND WALKER, D.R.F. (2002a) ‘Measurement of the world city network’, Urban Studies, 39, 2367-76

TAYLOR, P.J., CATALANO, G. AND WALKER, D.R.F. (2002b) ‘Exploratory analysis of the world city network’, Urban Studies, 39, 2377-94