Chapter 7 Univariate and Bivariate Data Exploration
In this and the next two chapters, I continue the review of data exploration, but now shift the focus to traditional non-spatial EDA methods. Through the use of linking and brushing, the connection with a spatial representation (a map) can always be made explicit. This idea of spatializing EDA is central to the perspective taken here.
In the current chapter, I focus on techniques to describe the distribution of one variable at a time (univariate) and the relationship between two variables (bivariate) by means of standard statistical graphs. The graphs covered are the histogram, box plot, scatter plot and scatter plot matrix, considered in turn. The chapter closes with a discussion of spatial heterogeneity, both for a single variable (through the averages chart) and for the relationship between two variables (by brushing the scatter plot).
To illustrate the methods covered in this and the next two chapters, I introduce a new sample data set with poverty indicators and census data for 570 municipalities in the state of Oaxaca in Mexico. The poverty indicators are from CONEVAL (the National Council for the Evaluation of Social Development Policy) for 2010 and 2020. The census variables cover 2000, 2010 and 2020 and are from INEGI (National Institute of Statistics and Geography).
The data are contained in the Oaxaca Development sample data set.