Classifying map information
Chapter 2: What Crime Maps Do and How They Do It
Generally, information on maps is classified in some way; data are not symbolized individually. For example, all burglaries are shown with the same symbol on a point map. It would be absurd to show each crime with its own symbol.
In effect, maps contain two levels of abstraction:
- The overall level of detail and the scale used to present the data.
- The way data are symbolized, because there is a continuum from highly detailed to extremely generalized in the symbolization process.
To some extent, the choice of scale controls the level of abstraction of the content because it is impractical to load a small-scale (large area) map with local detail. MacEachren (1994, p. 41) argues that for categorical information, "features that end up in the same category should be more similar to one another than features in different categories."
What does this mean for crime maps? A map of drug offenses might group related drug categories together. Generically related robberies could be put together in the same category and symbolized the same way on a general crime or violent crime map. If a map were specific to robberies, however, symbolization might be separated by commercial versus street robbery, by weapon type, or by time of day. This type of adjustment is intuitive and occurs naturally in the crime context, where data are typically sorted into categories as part of normal processing. But the situation becomes more intricate when moving from nominal and ordinal data to ratio-type data.
It is less obvious how to classify numerical data when several alternatives present themselves. Common mapping software packages offer options, including a default, for grouping numerical data in thematic maps but rarely explain how to choose among these approaches. Dividing up the data range in a way that best represents it involves the abstraction issue again.
Total abstraction would be represented by the use of one shade for all areas on the map. This says that there are data, but little or no specific information is supplied about them. At the other extreme, each area would have its own shade, and if city blocks were shown, the map would have thousands of shades. Obviously, neither of these alternatives is useful, and the solution lies somewhere between.
Greater accuracy dictates the use of more classes of data, although readers pay a price for this in terms of comprehension as the map moves along the continuum of abstraction toward reality and complexity. The underlying question is, What is this map being used for? MacEachren (1994, pp. 42-43) suggests that if we are in the visual thinking stages of exploration and confirmation, we will need more detail (more classes), but as we progress toward synthesis and presentation it becomes more important to show general trends rather than detail, hence fewer classes. Furthermore, limitations on human visual comprehension must also be taken into account: the limit is about six levels of color or gray scale shading in the context of a map.
Are there natural breakpoints in crime data? For example, in a robbery map of a city we could embed the State, regional, and national robbery rates as breakpoints. This might be informative but could get a political "thumbs down" if the local jurisdiction compares unfavorably. (Conversely, it could be a popular approach.) Choices available to cartographers in common desktop mapping packages are represented by the drop-down menus shown in figure 2.14.
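The idea of embedding reference rates as breakpoints can be sketched in a few lines of Python. This is only an illustration: the three rates below are invented placeholders (not actual State, regional, or national figures), and the names `reference_breaks` and `classify` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical reference rates (robberies per 100,000 residents), in
# ascending order. These are placeholder values, not real statistics.
reference_breaks = [120.0, 150.0, 180.0]


def classify(rate, breaks):
    """Return the class index for a rate, given ascending breakpoints.

    Class 0 lies below the first breakpoint; a rate equal to a
    breakpoint is placed in the class above it.
    """
    return bisect_right(breaks, rate)


# Assign each (hypothetical) area rate to one of four classes.
area_rates = {"Area A": 100.0, "Area B": 160.0, "Area C": 200.0}
for area, rate in area_rates.items():
    print(area, "-> class", classify(rate, reference_breaks))
```

Three breakpoints yield four classes, which stays within the range of shading levels a reader can comfortably distinguish.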
The choices available, and the relative ease of using them, invite experimentation. How will a particular database look when mapped in a particular way? What method conveys the crucial information with the least distortion and best visual impact? Good maps are likely to result from a working environment that encourages experiment because it is ultimately through trial and error that most learning is done. This is said not to invite a "shotgun" approach but, rather, to encourage the responsible testing of options under the assumption that alternative methods of representation are tested for a reason other than the sake of doing something different.
How Many Classes in a Map?
Use no more than six, and no fewer than four, classes (shading levels) of data in a choropleth map.
Each of the alternatives typically employed in data mapping is introduced here and illustrated in figures 2.15 and 2.16.
Table 2.1 summarizes the criteria for selecting methods to define class intervals for maps, providing a guide with respect to data distribution, ease of understanding, ease of computation, and other standards. (For a comprehensive discussion of issues relating to the determination of class intervals for maps, see Slocum, 1999, chapter 4.)
- Equal ranges or intervals. The data range (difference between maximum and minimum) is calculated and divided into equal increments so that the within-class ranges are the same, such as 1-3, 4-6, 7-9, and so on.
- Equal count (quantiles). Approximately the same number of observations is put in each class. The number of classes determines the technical definition of the map (quartile if there are four classes, quintile if there are five classes, and so forth). The term quantile is the generic label for data with observations divided into equal groups. This software option gives the user the opportunity to enter the number of classes desired. (This is the default in MapInfo®.)
- Equal area. Breakpoints between classes are based on equality of area rather than equality of range or an observation count. If areas in a choropleth map vary greatly in size, this type of map will differ from an equal count map based on the same data. If areas are roughly equal in size (such as city blocks), the result will be similar to an equal count presentation.
- Natural breaks. In this approach, gaps or depressions in the frequency distribution are used to establish boundaries between classes. This is the default in ArcView®, which employs a procedure known as Jenks' Optimization that ensures internal homogeneity within classes while maintaining heterogeneity among the classes. (For more details, see Dent, 1990, pp. 163-165, and Slocum, 1999,
- Standard deviation (SD). SD is a statistical measure of the spread of data around the mean, or average. In the literature of stocks and mutual funds, for example, SD is often used as a risk index, since it expresses the amount of price fluctuation over time. In the context of crime, SD can be a useful way of expressing extreme values of crime occurrence or portraying various social indicators. Generally, classes are defined above and below the average in units of 1 SD. The drawback is that this method assumes an underlying normal distribution, or bell-shaped curve, something of a rarity in social data.
- Custom. As the label suggests, this option allows users to determine class intervals according to their own criteria, such as regional or national norms and thresholds determined for policy reasons.
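Several of the methods listed above can be sketched in Python to show how the same data yield different classes. This is a minimal illustration under stated assumptions, not the procedure used by MapInfo or ArcView: the function names and the sample `counts` are hypothetical, the standard deviation is the population form (`statistics.pstdev`), and the natural breaks function is a brute-force search that minimizes within-class squared deviation, which only stands in for the idea behind Jenks' Optimization and is practical for small data sets. Equal area and custom classes are omitted.

```python
from itertools import combinations
from statistics import mean, pstdev


def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width across the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


def equal_count_classes(values, k):
    """Split the sorted values into k groups of approximately equal size."""
    ordered = sorted(values)
    n = len(ordered)
    # Group i covers sorted positions i*n//k up to (i+1)*n//k.
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


def sd_breaks(values, n_sd=2):
    """Class boundaries at whole standard deviations around the mean."""
    m, s = mean(values), pstdev(values)
    return [m + s * i for i in range(-n_sd, n_sd + 1)]


def natural_breaks(values, k):
    """Brute-force search for the k contiguous groups with the smallest
    total within-class squared deviation from the class mean."""
    ordered = sorted(values)

    def sse(group):
        m = mean(group)
        return sum((v - m) ** 2 for v in group)

    best_total, best_groups = None, None
    # Try every way of placing k-1 cut points between sorted values.
    for cuts in combinations(range(1, len(ordered)), k - 1):
        edges = (0, *cuts, len(ordered))
        groups = [ordered[a:b] for a, b in zip(edges, edges[1:])]
        total = sum(sse(g) for g in groups)
        if best_total is None or total < best_total:
            best_total, best_groups = total, groups
    return best_groups


# Hypothetical burglary counts for 12 areas (illustrative only).
counts = [2, 3, 4, 5, 20, 22, 24, 50, 52, 55, 90, 95]

print("Equal interval upper bounds:", equal_interval_breaks(counts, 4))
print("Equal count classes:", equal_count_classes(counts, 4))
print("SD boundaries:", sd_breaks(counts, 1))
print("Natural breaks classes:", natural_breaks(counts, 4))
```

On this sample, the equal count method places three areas in every class and therefore groups the count of 5 with 20 and 22, whereas the natural breaks search follows the gaps in the data and keeps 2, 3, 4, and 5 together. Comparing outputs in this way is one form of the responsible experimentation encouraged earlier.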