Dear Statalist members,
I would like to introduce spsiv, a new Stata command that generates synthetic instrumental variables (SIV) for spatial regression models with endogenous variables.
This command implements the aggregated IV method of Le Gallo & Páez (2013) and Fingleton (2023), producing instruments that are strongly correlated with the endogenous regressors while still satisfying the standard IV requirements. spsiv supports both cross-sectional and panel data and can be used together with commands such as spivreg, spivregress, xtdpd, and xtabond2.
SIV can also be used in conventional (non-spatial) regressions with endogenous regressors, provided the endogenous variable follows a given spatial correlation scheme. Fingleton (2023) also uses this specification.
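As a rough illustration of that last point, a synthetic instrument could enter an ordinary IV regression along the following lines. This is only a sketch under stated assumptions: siv_gini is a hypothetical placeholder for whatever variable spsiv creates for gini (the actual names should be checked in the spsiv help file), and treating gini as the endogenous regressor in a model of hrate from the homicide1990 data is an assumption made purely for the example, not something taken from the references.

```stata
* Sketch only: siv_gini is a HYPOTHETICAL name for the instrument that
* spsiv generates for gini; replace it with the real variable name.
local siv siv_gini

* Stop with an error if the placeholder has not been replaced by an existing variable
confirm variable `siv'

* Conventional 2SLS with gini treated as endogenous (an assumption for illustration)
ivregress 2sls hrate ln_population ln_pdensity (gini = `siv'), vce(robust)

* First-stage diagnostics to check the strength of the instrument
estat firststage
```

The sketch expects the data to be in memory and spsiv to have been run already. ivregress and estat firststage are official Stata commands; only the instrument name is made up.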
Thanks to Prof. Kit Baum, the command is already available on SSC and can be installed by running the command:
Code:
ssc install spsiv
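If the command may already be installed, a guarded version of the same step could look like the sketch below. It uses only standard Stata commands (capture, which, ssc, help); that the package ships a help file named spsiv is an assumption.

```stata
* Install spsiv from SSC only if Stata cannot already find it
capture which spsiv
if _rc ssc install spsiv

* Open the help file to review the syntax and options (assumes one is shipped)
help spsiv
```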
References:
- Fingleton, B. (2023). Estimating dynamic spatial panel data models with endogenous regressors using synthetic instruments. Journal of Geographical Systems, 25, Article 1. https://doi.org/10.1007/s10109-022-00397-3
- Le Gallo, J., & Páez, A. (2013). Using synthetic variables in instrumental variable estimation of spatial series models. Environment and Planning A, 45(9), 2227-2242.
Code:
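* Cross-sectional data
* Download the dataset and its linked shapefile into the working directory
copy https://www.stata-press.com/data/r19/homicide1990.dta ., replace
copy https://www.stata-press.com/data/r19/homicide1990_shp.dta ., replace
use homicide1990, clear
* Display the current Sp settings
spset
* Build the inverse-distance weights matrix m from the centroid coordinates
spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
* Generate synthetic instruments for the three variables
spsiv ln_population ln_pdensity gini, m(m) a(0.1)
* Panel data
copy https://www.stata-press.com/data/r19/homicide_1960_1990.dta ., replace
copy https://www.stata-press.com/data/r19/homicide_1960_1990_shp.dta ., replace
use homicide_1960_1990, clear
xtset _ID year
spset
* Build the weights matrix from a single year so it has one row per spatial unit
preserve
keep if year==1990
spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
restore
* Synthetic instruments for the 1990 cross-section only
spsiv ln_population ln_pdensity gini if year==1990, m(m) a(0.1)
* Synthetic instruments for the full panel
spsiv ln_population ln_pdensity gini, m(m) a(0.1)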
And the results:
Code:
. use homicide1990, clear
(S.Messner et al.(2000), U.S southern county homicide rates in 1990)
. spset
Sp dataset: homicide1990.dta
Linked shapefile: homicide1990_shp.dta
Data: Cross sectional
Spatial-unit ID: _ID
Coordinates: _CX, _CY (planar)
. spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
. spsiv ln_population ln_pdensity gini, m(m) a(0.1)
(S.Messner et al.(2000), U.S southern county homicide rates in 1990)
Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)  ln_population  ln_pdensity    gini
------------------------------------------------
Correlation          0.7498       0.7833  0.7985
------------------------------------------------
. copy https://www.stata-press.com/data/r19/homicide_1960_1990.dta ., replace
(file homicide_1960_1990.dta not found)
. copy https://www.stata-press.com/data/r19/homicide_1960_1990_shp.dta . , replace
(file homicide_1960_1990_shp.dta not found)
. use homicide_1960_1990, clear
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)
. xtset _ID year
Panel variable: _ID (strongly balanced)
Time variable: year, 1960 to 1990, but with gaps
Delta: 1 unit
. spset
Sp dataset: homicide_1960_1990.dta
Linked shapefile: homicide_1960_1990_shp.dta
Data: Panel
Spatial-unit ID: _ID
Time ID: year (see xtset)
Coordinates: _CX, _CY (planar)
. preserve
. keep if year==1990
(4,236 observations deleted)
. spmat idistance m _CX _CY, id(_ID) dfunction(dhaversine) replace
. restore
. spsiv ln_population ln_pdensity gini if year==1990, m(m) a(0.1)
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)
Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)  ln_population  ln_pdensity    gini
------------------------------------------------
Correlation          0.7498       0.7833  0.7985
------------------------------------------------
. spsiv ln_population ln_pdensity gini, m(m) a(0.1)
(S.Messner et al.(2000), U.S southern county homicide rate in 1960-1990)
Correlation between X and synthetic intrumental variables
------------------------------------------------
Variable (X)  ln_population  ln_pdensity    gini
------------------------------------------------
Correlation          0.7315       0.7789  0.8418
------------------------------------------------
