Creating the spatial lag (spmat lag) of a variable with missing values

Lucie Piaser

Join Date: Mar 2018

Posts: 9
#1


07 Dec 2018, 03:10

Hello everyone,

I'm trying to create the spatial lag of a variable. This variable is the amount of public transfers observed for the 2458 Mexican municipalities. Unfortunately, the data are missing for 234 municipalities.
To do so, I first created the contiguity matrix with the command:

Code:

spmat contiguity matrixNr using mexico_mun_xy, id(id) normalize(row)

I get a 2458x2458 matrix.
I then used

Code:

spmat lag double transfer_lag matrixNr transfer

The new variable contains only missing values. I assume this is because 234 observations are missing.

My question: Is there any way Stata can handle missing values while creating a spatial lag?

The only solution I found is to create a new 2224x2224 contiguity matrix (dropping the municipalities for which the data were missing) and to delete the 234 missing observations from my public transfer database. Then I can get the spatial lag. However, proceeding this way doesn't seem really correct to me, and I'm not sure the resulting data are valid.

I hope someone can help me.
Thank you!

Lucie
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#2

07 Dec 2018, 11:42

Hi Lucie,
A spatial lag is a function of the same variable for all contiguous municipalities. If there is a missing value for one of these municipalities, then the spatial lag must be missing as well, because it is otherwise unclear how to treat the missing observation. Dropping all municipalities with missing observations is one way to deal with it, although this might lead to selection problems. Alternatively, you might want to think about an interpolation strategy to replace the missing values.
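
To make the rule concrete, here is a minimal Mata sketch on a made-up example. The 4-unit weight matrix W and the vector x are hypothetical and have nothing to do with the Mexican data. It keeps the lag only for units whose neighbours are all observed and sets it to missing otherwise.

```stata
mata:
// Hypothetical data: 4 units on a line (1-2-3-4), row-normalized contiguity weights
W = (0, 1, 0, 0 \ 0.5, 0, 0.5, 0 \ 0, 0.5, 0, 0.5 \ 0, 0, 1, 0)
// Unit 2 is not observed
x = (10 \ . \ 30 \ 40)

// 1 if the value is missing, 0 if it is observed
miss = (x :>= .)
// Number of missing neighbours of each unit
nbmiss = (W :> 0) * miss
// Weighted sum over neighbours, with missing values temporarily set to 0
lag = W * editmissing(x, 0)
// The lag is undefined as soon as one neighbour is missing
for (i = 1; i <= rows(lag); i++) {
    if (nbmiss[i] > 0) lag[i] = .
}
lag
end
```

In this toy example only units 2 and 4 have all of their neighbours observed, so only they get a lag (20 and 30); units 1 and 3 border the missing unit 2 and stay missing. As an aside, Stata evaluates 0 × missing as missing, so if the lag is computed as a plain matrix product (an assumption about the internals of spmat lag, not something confirmed here), a single missing value would be enough to make every entry missing.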

https://www.kripfganz.de/stata/
Lucie Piaser

Join Date: Mar 2018

Posts: 9
#3

10 Dec 2018, 04:22

Hi Sebastian,

Thanks for your reply! I'm still a little curious why Stata refuses to generate any of the spatial lags. It could calculate the spatial lag for municipalities whose neighboring municipalities all report the variable, but it doesn't...

I also tried assigning to each missing value the average amount of public transfers in the neighboring municipalities, which is basically what Stata does when creating a spatial lag. For the imputed observations, the amount of public transfers and its spatial lag are then equal. This method seems appropriate when only a few observations are missing, but I was wondering whether that is still the case with 234 missing values...
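
A rough Mata sketch of this neighbour-average imputation, on a made-up 4-unit example (W, x and all other names are hypothetical, not the actual municipality data). Each missing value is replaced by the weighted mean of its observed neighbours, and the lag is then computed from the filled-in variable.

```stata
mata:
// Hypothetical data: 4 units on a line (1-2-3-4), row-normalized contiguity weights
W = (0, 1, 0, 0 \ 0.5, 0, 0.5, 0 \ 0, 0.5, 0, 0.5 \ 0, 0, 1, 0)
// Unit 2 is not observed
x = (10 \ . \ 30 \ 40)

// 1 if the value is observed, 0 if it is missing
obs = (x :< .)
// Total weight that falls on observed neighbours
wobs = W * obs
// Weighted mean over observed neighbours (missing if no neighbour is observed)
nbar = (W * editmissing(x, 0)) :/ wobs

// Fill in the missing values only, leaving observed values untouched
ximp = x
for (i = 1; i <= rows(x); i++) {
    if (x[i] >= .) ximp[i] = nbar[i]
}

// Spatial lag of the filled-in variable
lagimp = W * ximp
(x, ximp, lagimp)
end
```

Here unit 2 is imputed as 20, the mean of its neighbours 10 and 30, and its lag is also 20, which is the equality described above. That equality holds only when all neighbours of the imputed unit are themselves observed. With clusters of adjacent missing municipalities, a single pass like this averages over the observed neighbours only, and a unit with no observed neighbour stays missing.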