Replicating Spatial Panel Regression (SDM) from R in Stata

Nate Tillern

Join Date: Jun 2017

Posts: 32
#1

12 Jan 2022, 07:17

I was running some SDM (and SAR for robustness) regressions in R using the SPLM package. To interpret the coefficients correctly I need to calculate direct/indirect impacts, which SPLM can no longer do because of changes in the SPDEP package it is built on. XSMLE in Stata can also run SDM regressions and can still calculate impacts, so I have attempted to replicate my R regressions in Stata. While I did not expect the results to be identical after the software switch, they are currently so wildly different (huge differences in the size, sign, and significance of the coefficient estimates) that I have clearly missed something, but I have been unable to identify what it is.

I have looked at the weight matrices from both Stata and R and they appear identical (compared their GAL text files side by side). According to documentation both SPLM and XSMLE should be spatially lagging variables in the same manner (multiplying the Nx1 variable vector by the NxN weight matrix) and both should be using maximum likelihood (ML) estimation for their regressions.
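One thing that could be checked at this point is whether the lagged regressors line up with the panel ordering. The following is a minimal base-R sketch, assuming a balanced panel stacked by period (all N units for the first period, then all N units for the second) and a row-standardised weight matrix. The toy matrix, the values, and the variable names are made up for illustration and are not taken from either package.

```r
# Toy row-standardised weight matrix for N = 3 units (hypothetical values)
W <- matrix(c(0,   1,   0,
              0.5, 0,   0.5,
              0,   1,   0), nrow = 3, byrow = TRUE)
N <- nrow(W)
n_periods <- 2

# Balanced panel stacked by period: units 1..N for period 1, then units 1..N for period 2
x <- c(1, 2, 3,   # period 1
       4, 5, 6)   # period 2

# Spatial lag of the stacked vector: block-diagonal matrix with W repeated once per period
W_panel <- kronecker(diag(n_periods), W)
Wx <- as.vector(W_panel %*% x)

# Same calculation done one period at a time, as a cross-check
Wx_by_period <- unlist(lapply(seq_len(n_periods), function(p) {
  idx <- ((p - 1) * N + 1):(p * N)
  as.vector(W %*% x[idx])
}))

print(Wx)
print(isTRUE(all.equal(Wx, Wx_by_period)))
```

If the data are instead sorted by unit and then by period, the Kronecker order flips to kronecker(W, diag(n_periods)). A sort-order mismatch between the two programs could therefore produce different lagged variables even when the weight matrices are identical.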

In R I was effectively running:

Code:

Model <- DepVar ~ ListOfExplanatoryVars + ListOfSpatiallyLaggedExplanatoryVars
reg <- spml(Model, DataSet, index = NULL, listw = WeightMatrix, lag = T, spatial.error = "none", model = "within", effect = "twoways", zero.policy = TRUE)

Here 'lag = T, spatial.error = "none"' together with the spatially lagged explanatory variables specifies an SDM, and 'model = "within", effect = "twoways"' should specify a fixed effects model with both individual and time fixed effects.

In Stata I have attempted to replicate this with:

Code:

xsmle DepVar ListOfExplanatoryVars, wmat(WeightMatrix) model(SDM) fe type(both) eff vcee(sim, nsim(99))

Here 'model(SDM)' tells Stata to spatially lag the explanatory variables and include those lags as regressors, along with the spatial lag of the dependent variable but no spatial lag of the errors, which makes it an SDM. 'fe type(both)' should specify a fixed effects model with two-way fixed effects. 'eff vcee(sim, nsim(99))' should only relate to the calculation of direct/indirect effects and not affect the coefficient estimates.
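For reference, the direct and indirect effects of an SDM are usually derived, following LeSage and Pace, from the matrix (I - rho W)^(-1) (beta I + theta W) for each regressor. The average direct effect is the mean of its diagonal and the average total effect is its mean row sum. The base-R sketch below uses made-up coefficient values and a toy weight matrix (all hypothetical, not output from SPLM or XSMLE), so that impacts reported by either program could be cross-checked by hand.

```r
# Toy row-standardised weight matrix (hypothetical values)
W <- matrix(c(0,   1,   0,
              0.5, 0,   0.5,
              0,   1,   0), nrow = 3, byrow = TRUE)
N <- nrow(W)

# Hypothetical SDM estimates for a single regressor
rho   <- 0.4   # coefficient on the spatial lag of the dependent variable
beta  <- 1.0   # coefficient on the regressor itself
theta <- 0.5   # coefficient on the spatial lag of the regressor

# N x N matrix of partial derivatives of y with respect to the regressor
S <- solve(diag(N) - rho * W) %*% (beta * diag(N) + theta * W)

direct   <- mean(diag(S))      # average own-unit effect
total    <- mean(rowSums(S))   # average effect summed over all units
indirect <- total - direct     # average spillover effect

# With a row-standardised W the total effect equals (beta + theta) / (1 - rho)
print(c(direct = direct, indirect = indirect, total = total))
```

Plugging the coefficients estimated by each program into this calculation would show whether the impacts differ only because the coefficients differ, or also because the weight matrices are normalised differently.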

I am hoping someone with more experience with XSMLE can identify something I have missed. Or perhaps someone who has experience replicating R results in Stata could provide some tips. I am sure that I need to provide more details but I wanted to avoid writing too much and including inconsequential information and can happily add whatever information would help.
Tags: panel data, R Replication, spatial durbin model, Spatial Panel Analysis, XSMLE
Felix Dornseifer

Join Date: Jan 2021

Posts: 3
#2

12 Jan 2023, 10:21

Hi Nate Tillern,
I'm sorry to see that nobody replied to your question. I have a similar problem, trying to replicate code that I wrote in Stata in R. In Stata I use spxtregress to estimate the following spatial model:

Code:

spxtregress tbranches ftshare $tcontrols $ccontrols, fe dvarlag(W) ivarlag(W:ftshare $tcontrols) errorlag(W)

W is my weighting matrix, and I include spatial lags of the US census tract level controls as well as of my independent variable ftshare. Similar to your case, the results in R are widely different in size, significance, and sign. The weighting matrices are not identical but highly similar.
This is my code in R:

Code:

model<- spml(spml_model_eq, data=master_illinois, index=c("fipstract", "year"), listw=queen_illinois, model="within", effect="twoways", lag=TRUE, spatial.error="b")

where the model equation includes the same variables as the model in Stata and is given by:

Code:

spml_model_eq<-tbranches~ftshare +log_cgdp + bbhhs + log_pop_t + hhi_tdep + log_acs_mhv_t + pct_vacancies_t + log_hinc_t + pct_elderly_t + pct_unempl_t + pct_bachelor_t + pct_white_t + pct_female_t

I know that the model in R also includes the spatial lags of my measures of GDP and broadband (bbhhs), but I doubt that this leads to such big differences in results. Do you have any idea how to replicate the code, or are you willing to share if and how you solved your issue? Although your Stata code is different, it might point me towards the main differences.