  • Spatial regression results sensitive to normalization of weighting matrix

    Hi Statalist,

    I’m new to spatial regression and am using the xsmle command to regress sex ratios at birth across Indian states. My goal is to measure cultural spillover effects regarding son preference and sex selection throughout the country.

    I’m using both SDM and SAR models with a spatial lag of y, a temporal lag of y, and a spatial-temporal lag of y. I began with a simple inverse distance weights matrix, which seemed to work just fine. When I instead create my own weights matrix, the regression results vary wildly depending on whether and how I normalize it (row normalized versus spectral versus not normalized). The weights take the form w_ij = [Pop_i^a * Pop_j^b] / Dist_ij^c, where Pop_i is the population of state i, Pop_j is the population of state j, Dist_ij is the distance between the centroids of states i and j based on latitude and longitude coordinates, and the parameters a, b, and c are estimated using a gravity equation so that w_ij represents migrants coming from state j to state i. I am using the ppml command to estimate the gravity equation, then spmat to create the matrix. Unlike the idistance matrix, this one is asymmetric, since migration from state i to state j will differ from migration from state j to state i, and so a and b are not equal.
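
    For concreteness, here is a minimal sketch of the weights described above, written in plain Python rather than Stata. The populations, distances, and exponents a, b, c are made-up illustrative values, not estimates from the gravity equation, and the function name gravity_weights is hypothetical.

```python
# Hypothetical inputs: three states with made-up populations (millions)
# and made-up symmetric centroid distances (km). Not data from the post.
pop = [200.0, 110.0, 70.0]
dist = [
    [0.0, 900.0, 1400.0],
    [900.0, 0.0, 600.0],
    [1400.0, 600.0, 0.0],
]

# Hypothetical gravity exponents; in the post these are estimated with ppml.
a, b, c = 0.6, 0.9, 1.2


def gravity_weights(pop, dist, a, b, c):
    """Return W with w_ij = pop_i**a * pop_j**b / dist_ij**c and a zero diagonal."""
    n = len(pop)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = (pop[i] ** a) * (pop[j] ** b) / (dist[i][j] ** c)
    return W


W = gravity_weights(pop, dist, a, b, c)

# With a != b the matrix is asymmetric even though distances are symmetric:
# w_ij / w_ji = (pop_i / pop_j) ** (a - b).
ratio_01 = W[0][1] / W[1][0]
expected_ratio_01 = (pop[0] / pop[1]) ** (a - b)
print(ratio_01, expected_ratio_01)
```

    The two printed numbers agree, which shows that the asymmetry comes entirely from the gap between a and b.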

    Specifically, the problem is that depending on whether and how I normalize my weights matrix, the regression produces autoregressive parameter estimates that raise red flags:

    1) When I normalize the weights matrix, the parameter on the spatial lag is much larger than 1 in the SDM but not in the SAR model. Why/how can the spatial lag parameter be greater than 1? I don’t believe that there is an explosive process here, and this outcome is also inconsistent. When I don't normalize, the parameter is less than 1 unless I include a state-time trend in the model, so again, the results are not consistent.

    2) The parameters on the temporal lag and the spatial-temporal lag switch between positive and negative depending on the matrix normalization. While it is theoretically possible for these lags to be negative, the sensitivity of these outcomes to the weights matrix is concerning.

    I really appreciate any insights on this!

  • #2
    1) The coefficient of the spatially lagged dependent variable should not exceed 1 if the weights matrix is normalized appropriately. I encountered similarly strange behavior with xsmle before in a model that has both a spatial lag and a time lag. The authors of the command confirmed that there was a bug in their program for this case. While they have identified the bug, as far as I know they unfortunately have not released an update (yet). You might want to get in touch with them regarding this matter. (In the meantime, feel free to send me a private message or e-mail regarding a workaround. I have written a preliminary command that should work correctly for these models, but it is not yet ready for public release.) Note, however, that the bias correction procedure applied to these models can in some situations still yield an estimate slightly larger than 1, even after the bug is fixed.

    2) As a general comment on spatial weights matrix normalization: Unless there is a compelling theoretical reason for row standardization, I recommend always using spectral standardization instead. Row standardization alters the nature of the underlying spatial network, with possibly undesired consequences. Spectral standardization merely scales all elements of the weights matrix, and the corresponding coefficient accordingly, without changing the network structure.
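
    To illustrate the difference, here is a minimal sketch in plain Python (not Stata). It assumes that spectral standardization means dividing W by its largest eigenvalue in modulus, and it uses a made-up asymmetric 3x3 matrix. The helper names row_normalize and spectral_radius are hypothetical, and the power iteration relies on the matrix being nonnegative with strictly positive off-diagonal elements.

```python
# Hypothetical asymmetric, nonnegative weights matrix with a zero diagonal.
W = [
    [0.0, 4.0, 1.0],
    [2.0, 0.0, 3.0],
    [0.5, 6.0, 0.0],
]
n = len(W)


def row_normalize(W):
    """Divide each row by its own sum, so that every row sums to 1."""
    return [[w / sum(row) for w in row] for row in W]


def spectral_radius(W, iters=1000):
    """Approximate the largest eigenvalue modulus by power iteration.

    Assumes a nonnegative matrix with strictly positive off-diagonal
    elements (n >= 3), for which the iteration converges.
    """
    size = len(W)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iters):
        Wv = [sum(W[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = max(Wv)              # v is scaled so that max(v) == 1
        v = [x / lam for x in Wv]  # rescale to keep max(v) == 1
    return lam


tau = spectral_radius(W)
W_row = row_normalize(W)
W_spec = [[w / tau for w in row] for row in W]

# Relative strength of two links that sit in different rows of W.
rel_raw = W[0][1] / W[2][1]
rel_spec = W_spec[0][1] / W_spec[2][1]  # unchanged: one common scalar
rel_row = W_row[0][1] / W_row[2][1]     # changed: each row has its own divisor

print("row divisors:", [sum(row) for row in W])
print("spectral divisor:", tau)
print("relative weights (raw, spectral, row):", rel_raw, rel_spec, rel_row)
```

    Because W_spec = W / tau, the product rho * W equals (rho * tau) * W_spec, so spectral standardization only rescales the coefficient. Under row standardization no such single rescaling exists unless all row sums happen to be equal.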
    https://www.kripfganz.de/stata/
