18.2 The Multivariate Spatial Autocorrelation Problem
Designing a spatial autocorrelation statistic in a multivariate setting is fraught with difficulty. The most common statistic, Moran’s I, is based on a cross-product association, which is in the same spirit as a bivariate correlation statistic. As a result, it is difficult to disentangle whether the correlation between multiple variables at adjoining locations is due to the correlation among the variables in situ (at the same location), or to a similarity that stems from being neighbors in space.
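To make the cross-product structure concrete, the standard univariate form of Moran’s I can be written as follows. The notation is introduced here for illustration only and is not taken from the text above: \(z_i\) is the variable in deviations from its mean, \(w_{ij}\) are the spatial weights, and \(S_0 = \sum_i \sum_j w_{ij}\): \[I = \frac{n}{S_0} \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}.\] The numerator has the same cross-product form as the numerator of a Pearson correlation coefficient, except that \(z_i\) is paired with values at neighboring locations \(j\) rather than with a second variable at the same location. Once a second variable enters the cross-product, the in-situ and the spatial sources of association are mixed in a single term.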
Early attempts at extending Moran’s I to multiple variables focused on principal components, as in the suggestion by Wartenberg (1985), and later work by Dray, Saïd, and Débias (2008). However, these proposals only dealt with a global statistic. A more local perspective along the same lines is presented in Lin (2020), although it is primarily a special case of a geographically weighted regression, or GWR (Fotheringham, Brunsdon, and Charlton 2002).
S.-I. Lee (2001) outlined a way to separate a bivariate Moran-like spatial correlation coefficient into a spatial part and a Pearson correlation coefficient. However, this approach relies on some fairly strong simplifying assumptions that may not be realistic in practice.
An alternative perspective is offered in Anselin (2019a). The central idea underlying this approach is to focus on the distance between observations in both attribute and geographical space. A multivariate local spatial autocorrelation statistic then assesses the match between those two concepts of distance.
More formally, the squared multi-attribute Euclidean distance between a pair of observations \(i, j\) on \(k\) variables is given as: \[d_{ij}^2 = || x_i - x_j ||^2 = \sum_{h=1}^k (x_{ih} - x_{jh})^2,\] with \(x_i\) and \(x_j\) as vectors of observations. In some expressions, the squared distance will be preferred; in others, the actual distance (\(d_{ij}\), its square root) will be used.
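The distance computation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the toy data are hypothetical, and the variables are assumed to be on comparable scales already (for example, standardized), which the formula itself does not enforce.

```python
import math


def squared_attribute_distance(x_i, x_j):
    """Squared Euclidean distance between two observations on k variables."""
    if len(x_i) != len(x_j):
        raise ValueError("both observations must have the same number of variables")
    return sum((a - b) ** 2 for a, b in zip(x_i, x_j))


def squared_distance_matrix(X):
    """n-by-n matrix of squared multi-attribute distances for n observations."""
    n = len(X)
    return [[squared_attribute_distance(X[i], X[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: three observations on k = 2 variables.
X = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]

d2 = squared_distance_matrix(X)                    # squared distances
d = [[math.sqrt(v) for v in row] for row in d2]    # actual distances

print(d2[0][1], d[0][1])  # squared and actual distance between observations 0 and 1
```

The resulting matrix is symmetric with zeros on the diagonal, so in practice only the upper triangle needs to be stored.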
In this approach, the overarching objective is to identify observations that are close in both multi-attribute space and in geographical space, i.e., those pairs of observations where the two types of distances match.
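One simple way to picture this objective is to flag pairs of observations that are near neighbors in both spaces. The sketch below does this with k-nearest neighbors. That operational definition of "close", the function names, and the toy data are all assumptions made for illustration; the sketch is not presented as the statistic from Anselin (2019a).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_nearest(points, i, k):
    """Indices of the k nearest other points to point i (ties broken by index)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (euclidean(points[i], points[j]), j))
    return set(others[:k])


def matching_pairs(coords, attrs, k):
    """Pairs (i, j) where j is among i's k nearest neighbors in both spaces."""
    pairs = []
    for i in range(len(coords)):
        common = k_nearest(coords, i, k) & k_nearest(attrs, i, k)
        pairs.extend((i, j) for j in sorted(common))
    return pairs


# Hypothetical toy data: four locations, one attribute each.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]  # geographical space
attrs = [(0.1,), (0.2,), (3.0,), (0.9,)]                   # attribute space

pairs = matching_pairs(coords, attrs, k=1)
print(pairs)
```

In this toy example, observation 3 has no match: its nearest geographical neighbor is observation 2, but its nearest neighbor in attribute space is observation 1, so the two types of distance disagree.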