
Analysis of spatial relationships in regression residuals

In this blog post, I will discuss how to analyze spatial relationships in regression residuals, which are the differences between the observed and predicted values of a dependent variable. Spatial patterns in the residuals indicate whether the regression model adequately captures the spatial variation in the data, or whether some spatial structure remains unexplained by the model.

There are different methods to assess the spatial relationships in residuals, such as visual inspection, spatial autocorrelation tests, and geographically weighted regression. I will briefly describe each of these methods and provide some examples using R code.

Visual inspection is the simplest way to examine the spatial relationships in residuals. It involves plotting the residuals on a map and looking for any spatial patterns, such as clusters, trends, or outliers. For example, suppose we have a regression model that predicts the average income of counties in the US based on some demographic and economic variables. We can plot the residuals on a map using the tmap package in R:
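A minimal sketch of such a residual map follows. It does not use real county data: a 10 x 10 grid of square cells stands in for counties, and the `education` and `income` columns, the simulated west-to-east trend, and the model formula are all assumptions made up for illustration. The plotting call is written in tmap version 3 style; newer tmap versions may print deprecation messages for the `palette` and `midpoint` arguments.

```r
library(sf)
library(tmap)

set.seed(42)

# Hypothetical stand-in for county polygons: a 10 x 10 grid of square cells
# in a projected CRS (EPSG:3857 is an arbitrary choice for the sketch)
bbox <- st_bbox(c(xmin = 0, ymin = 0, xmax = 10, ymax = 10), crs = st_crs(3857))
grid <- st_make_grid(st_as_sfc(bbox), n = c(10, 10))
counties <- st_sf(id = seq_along(grid), geometry = grid)

# Simulated data: income depends on education plus a west-to-east trend
# that the regression model deliberately leaves out
xy <- st_coordinates(st_centroid(grid))
counties$education <- rnorm(nrow(counties))
counties$income <- 50 + 5 * counties$education + 2 * xy[, "X"] +
  rnorm(nrow(counties))

# Fit the (misspecified) model and attach its residuals to the polygons
model <- lm(income ~ education, data = st_drop_geometry(counties))
counties$residual <- residuals(model)

# Diverging palette centred on zero: positive residuals red, negative blue
residual_map <- tm_shape(counties) +
  tm_polygons("residual", palette = "-RdBu", midpoint = 0, title = "Residual")

print(residual_map)
```

Because the simulated trend is omitted from the model, the map in this sketch shows negative residuals in the west and positive residuals in the east, which is the kind of pattern visual inspection is meant to reveal.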

The resulting map shows that there are some counties with positive residuals (red), meaning that their actual income is higher than what the model predicts, and some counties with negative residuals (blue), meaning that their actual income is lower than what the model predicts. We can also see that there are some spatial patterns in the residuals, such as clusters of positive residuals in the Northeast and negative residuals in the South. This suggests that there are some regional factors that affect the income of counties that are not captured by the regression model.

Spatial autocorrelation tests are another way to analyze the spatial relationships in residuals. These tests measure whether the residuals are randomly distributed across space, or whether they are correlated with their neighboring residuals. A common test for spatial autocorrelation is Moran’s I, which usually falls between -1 and 1, although the exact bounds depend on the spatial weights. A positive Moran’s I indicates positive spatial autocorrelation, meaning that nearby residuals tend to have similar values. A negative Moran’s I indicates negative spatial autocorrelation, meaning that nearby residuals tend to have dissimilar values. A Moran’s I close to its expected value under spatial randomness, which is -1/(n - 1) for n observations and therefore near zero for large samples, indicates no detectable spatial autocorrelation. The test reports whether the observed value differs significantly from that expectation.
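For reference, the statistic can be written out explicitly. The notation here is my own choice rather than anything fixed by the post: $e_i$ is the residual at location $i$, $\bar{e}$ is the mean residual, $w_{ij}$ is the spatial weight between locations $i$ and $j$, and $n$ is the number of locations.

$$
I = \frac{n}{S_0} \cdot \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, (e_i - \bar{e})(e_j - \bar{e})}{\sum_{i=1}^{n} (e_i - \bar{e})^2},
\qquad S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}
$$

The numerator sums the products of deviations for neighboring pairs, so it is large and positive when neighbors fall on the same side of the mean. For an ordinary least squares model with an intercept, $\bar{e} = 0$, and with row-standardized weights $S_0 = n$, so the leading factor drops out.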

To perform a Moran’s I test on the residuals of our regression model, we need to define a spatial weights matrix that specifies how each county is connected to its neighbors. There are different ways to construct a spatial weights matrix, such as using distance thresholds or contiguity criteria. Here, I will use a queen contiguity criterion, which means that two counties are considered neighbors if they share a common border or vertex. We can use the spdep package in R to create a spatial weights matrix and perform a Moran’s I test:
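The steps above can be sketched as follows. As before, this is an illustration under stated assumptions rather than the analysis of real county data: the grid of cells, the `education` and `income` columns, and the simulated trend are hypothetical. The sketch runs `moran.test()` on the raw residual vector and also `lm.morantest()`, a variant in spdep intended for residuals of a fitted `lm` object.

```r
library(sf)
library(spdep)

set.seed(42)

# Hypothetical stand-in for county polygons: a 10 x 10 grid of square cells
bbox <- st_bbox(c(xmin = 0, ymin = 0, xmax = 10, ymax = 10), crs = st_crs(3857))
grid <- st_make_grid(st_as_sfc(bbox), n = c(10, 10))
counties <- st_sf(id = seq_along(grid), geometry = grid)

# Simulated data with a west-to-east trend that the model omits
xy <- st_coordinates(st_centroid(grid))
counties$education <- rnorm(nrow(counties))
counties$income <- 50 + 5 * counties$education + 2 * xy[, "X"] +
  rnorm(nrow(counties))
model <- lm(income ~ education, data = st_drop_geometry(counties))

# Queen contiguity: polygons sharing an edge or a single vertex are neighbours
nb <- poly2nb(counties, queen = TRUE)

# Row-standardised weights (style "W"): each polygon's weights sum to 1
lw <- nb2listw(nb, style = "W")

# Moran's I on the residual vector, treated as an ordinary variable
moran_raw <- moran.test(residuals(model), lw)

# Moran's I for regression residuals, taking the fitted model into account
moran_lm <- lm.morantest(model, lw)

print(moran_raw)
print(moran_lm)
```

With real data, some units may have no contiguous neighbours (islands, for example). In that case `nb2listw()` stops with an error unless `zero.policy = TRUE` is set, so it is worth checking `card(nb)` before building the weights.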
