Chapter 8 Multivariate Data Exploration
In this chapter, I switch to a full multivariate exploration of spatial data, where the focus is on the potential interaction among multiple variables. This is distinct from a univariate (or even bivariate) analysis of multiple variables, where the properties of each distribution are considered in isolation. The goal is to discover potential pathways of interaction. For example, a bivariate analysis may reveal a strong correlation between lung cancer and socio-economic status (SES), but after controlling for smoking behavior (itself strongly correlated with SES), this relationship disappears.
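To make the confounding example above concrete, the following is a minimal Python sketch using simulated data only. The variable names (`smoking`, `ses`, `cancer`), the coefficients and the sample size are all assumptions chosen for illustration and do not come from any real data set. The sketch compares the raw correlation between SES and cancer with their partial correlation, computed here from the residuals that remain after each variable is regressed on smoking.

```python
import numpy as np

# Simulated (hypothetical) data: smoking drives both SES and cancer,
# and there is no direct link between SES and cancer.
rng = np.random.default_rng(0)
n = 5000
smoking = rng.normal(size=n)
ses = -0.8 * smoking + 0.6 * rng.normal(size=n)
cancer = 0.8 * smoking + 0.6 * rng.normal(size=n)


def residuals(y, x):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


# Bivariate view: SES and cancer appear strongly related.
raw_corr = np.corrcoef(ses, cancer)[0, 1]

# Controlling for smoking: correlate what is left of each variable
# once the linear effect of smoking has been removed.
partial_corr = np.corrcoef(
    residuals(ses, smoking), residuals(cancer, smoking)
)[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```

In this simulated setting the raw correlation is sizeable, whereas the partial correlation is close to zero, which is the pattern described in the example.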
The methods considered share the same objective, i.e., to represent relationships in higher dimensions on a two-dimensional screen (or piece of paper). Three techniques, the bubble chart, the three-dimensional scatter plot and conditional plots, are limited to the analysis of three to four variables. True multivariate analysis of more than four variables can be carried out by means of the parallel coordinate plot (PCP). Each is considered in turn. I continue to use the Oaxaca Development data set for illustrations.
As in the previous chapter, the methods covered are inherently non-spatial, but by means of linking and brushing with one or more maps, they can be spatialized.
Before focusing on the specific techniques, some special features of multivariate analysis are outlined in a brief discussion of the curse of dimensionality.