Hi everyone,
I am looking to examine the relationship between perceived neighbourhood cohesion (NSC_index) and life satisfaction (lfsato). In my paper I have run an OLS model (as a benchmark) and an FE model, and I now want to run a dynamic panel model using system GMM.
I have three key questions regarding system GMM, which I outline below, and would greatly appreciate any guidance.
I am running the system GMM model (using Stata 18) as follows:
Code:
* Declare the panel: person identifier and survey wave
xtset pidp wave
* One-step system GMM; collapse limits the number of GMM-type instruments
xtabond2 lfsato lag_lfsato NSC_index income i.age_group_destr age2 ///
    jbstat_simple edu_simple marriage_status tenure_dummy addrmov_dummy ///
    aidhh_dummy hhsize_simple nchild_simple physical_health mental_health ///
    i.wave i.gor_dv [pweight=l_indscus_lw], ///
    gmm(lag_lfsato income marriage_status physical_health mental_health, collapse) ///
    iv(NSC_index i.age_group_destr age2 jbstat_simple edu_simple tenure_dummy ///
       addrmov_dummy aidhh_dummy hhsize_simple nchild_simple i.wave i.gor_dv) ///
    nodiffsargan robust small
For the endogenous variables specified in gmm(), I have included lagged life satisfaction. Piper (2023) states that marital status, income and health are endogenous with life satisfaction, so I have included these as well. I also ran a pairwise correlation matrix and VIFs for all of my explanatory variables and life satisfaction, and found that mental health was quite highly correlated with life satisfaction, so I have included this variable too.
I have then included all of the other explanatory variables from my OLS and FE regressions as exogenous instruments specified in iv().
Q1. Is this a correct/valid way to decide which variables are endogenous/exogenous?
Running the above code in Stata generates the following output:
Code:
. xtabond2 lfsato laglfsato3 NSC_index fihhmngrs1_dv i.age_group_destr age2 jbstat_simple edu_si
> mple mastat_simple tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple scsf1_co
> mbined_r sf12mcs_dv i.wave i.gor_dv [pweight=l_indscus_lw], gmm (laglfsato3 fihhmngrs1_dv mast
> at_simple scsf1_combined_r sf12mcs_dv, collapse) iv( NSC_index i.age_group_destr age2 jbstat_s
> imple edu_simple tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple i.wave i.g
> or_dv) nodiffsargan robust small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
1b.age_group_destr dropped due to collinearity
7.age_group_destr dropped due to collinearity
1b.wave dropped due to collinearity
3.wave dropped due to collinearity
1b.gor_dv dropped due to collinearity
(sum of weights is 22647.4695)
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate robust weighting matrix for Hansen test.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: pidp                            Number of obs      =     23378
Time variable : wave                            Number of groups   =      6334
Number of instruments = 53                      Obs per group: min =         1
F(., 6333)    =         .                                      avg =      3.69
Prob > F      =         .                                      max =         4
-------------------------------------------------------------------------------------------------
                              |               Robust
                       lfsato | Coefficient  std. err.      t     P>|t|      [95% conf. interval]
------------------------------+------------------------------------------------------------------
                   laglfsato3 |     .083551   .0159013     5.25   0.000      .0523791    .1147229
                   NSC_index_ |    .1831389   .0231104     7.92   0.000      .1378347    .2284431
                fihhmngrs1_dv |    7.19e-06   5.31e-06     1.35   0.176     -3.22e-06    .0000176
                              |
              age_group_destr |
                        18-24 |    .1436355   .1315885     1.09   0.275     -.1143224    .4015935
                        25-34 |   -.0245395   .1061021    -0.23   0.817     -.2325355    .1834565
                        35-44 |   -.1553759   .0900446    -1.73   0.084     -.3318939     .021142
                        45-54 |   -.2298868   .0702192    -3.27   0.001     -.3675402   -.0922334
                        55-64 |   -.1802464   .0472028    -3.82   0.000     -.2727798    -.087713
                              |
                         age2 |    .0000259   .0000257     1.01   0.314     -.0000245    .0000763
                jbstat_simple |   -.0034113   .0083322    -0.41   0.682     -.0197452    .0129225
                   edu_simple |   -.0472176    .012247    -3.86   0.000     -.0712258   -.0232093
                mastat_simple |   -.0096751   .0375167    -0.26   0.797     -.0832206    .0638704
                 tenure_dummy |    .0607223    .028808     2.11   0.035      .0042489    .1171958
                addrmov_dummy |    .1015179   .0492164     2.06   0.039      .0050371    .1979986
                  aidhh_dummy |   -.1121506   .0495233    -2.26   0.024      -.209233   -.0150682
                hhsize_simple |    -.034727     .02944    -1.18   0.238     -.0924393    .0229853
                nchild_simple |     .017693   .0200045     0.88   0.376     -.0215226    .0569086
             scsf1_combined_r |    .1988844   .0238085     8.35   0.000      .1522117    .2455571
                   sf12mcs_dv |    .0488631   .0020779    23.52   0.000      .0447897    .0529364
                              |
                         wave |
                            2 |   -.0387835   .0257759    -1.50   0.132      -.089313    .0117459
                            4 |    .0262173   .0260069     1.01   0.313     -.0247652    .0771997
                            5 |    .1596027   .0265361     6.01   0.000       .107583    .2116224
                              |
                       gor_dv |
                   north west |    -.011139   .0671737    -0.17   0.868     -.1428223    .1205443
     yorkshire and the humber |    .0114451   .0709075     0.16   0.872     -.1275576    .1504478
                east midlands |    .0307968   .0670911     0.46   0.646     -.1007246    .1623181
                west midlands |    .0510337   .0690321     0.74   0.460     -.0842926      .18636
              east of england |    .0066081   .0640761     0.10   0.918     -.1190027     .132219
                       london |   -.1073267   .0741767    -1.45   0.148     -.2527381    .0380847
                   south east |    .0033076   .0631424     0.05   0.958      -.120473    .1270882
                   south west |    .0132228   .0638898     0.21   0.836     -.1120228    .1384685
                        wales |    .0324638   .0703395     0.46   0.644     -.1054255    .1703531
                     scotland |   -.0823215   .0720415    -1.14   0.253     -.2235473    .0589042
             northern ireland |    .0762113   .0860584     0.89   0.376     -.0924923    .2449148
                              |
                        _cons |    1.236654   .2329817     5.31   0.000      .7799315    1.693377
-------------------------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(NSC_index_ 1b.age_group_destr 2.age_group_destr 3.age_group_destr
    4.age_group_destr 5.age_group_destr 6.age_group_destr 7.age_group_destr
    age2 jbstat_simple edu_simple tenure_dummy addrmov_dummy aidhh_dummy
    hhsize_simple nchild_simple 1b.wave 2.wave 3.wave 4.wave 5.wave 1b.gor_dv
    2.gor_dv 3.gor_dv 4.gor_dv 5.gor_dv 6.gor_dv 7.gor_dv 8.gor_dv 9.gor_dv
    10.gor_dv 11.gor_dv 12.gor_dv)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/4).(laglfsato3 fihhmngrs1_dv mastat_simple scsf1_combined_r
    sf12mcs_dv) collapsed
Instruments for levels equation
  Standard
    NSC_index_ 1b.age_group_destr 2.age_group_destr 3.age_group_destr
    4.age_group_destr 5.age_group_destr 6.age_group_destr 7.age_group_destr
    age2 jbstat_simple edu_simple tenure_dummy addrmov_dummy aidhh_dummy
    hhsize_simple nchild_simple 1b.wave 2.wave 3.wave 4.wave 5.wave 1b.gor_dv
    2.gor_dv 3.gor_dv 4.gor_dv 5.gor_dv 6.gor_dv 7.gor_dv 8.gor_dv 9.gor_dv
    10.gor_dv 11.gor_dv 12.gor_dv
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.(laglfsato3 fihhmngrs1_dv mastat_simple scsf1_combined_r sf12mcs_dv)
    collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -24.57  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   0.69  Pr > z =  0.490
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(19)   =  41.38  Prob > chi2 =  0.002
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(19)   =  24.70  Prob > chi2 =  0.171
  (Robust, but weakened by many instruments.)
Q2. Does this seem correctly specified?
Q3. I am a bit concerned that the Sargan test is still significant. Should I try to reduce the number of exogenous instruments or lags in my model? Or is there an alternative way to address this issue?
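For context on Q3, here is my own tally of where the chi2(19) in both overidentification tests seems to come from. This is a sketch based on my reading of the output, assuming the degrees of freedom equal the number of instruments minus the number of estimated coefficients (including _cons); the grouping of coefficients below is my own count from the table, not something Stata reports.

```latex
% Degrees of freedom of the Sargan/Hansen tests, tallied from the output above.
% Assumption: df = (number of instruments) - (number of estimated coefficients).
% Coefficient count: 14 non-dummy regressors (laglfsato3 ... sf12mcs_dv),
% 5 age-group dummies, 3 wave dummies, 11 region dummies, and the constant.
\[
\text{df}
= \underbrace{53}_{\text{instruments}}
- \underbrace{(14 + 5 + 3 + 11 + 1)}_{\text{coefficients}}
= 53 - 34
= 19
\]
```

If that reading is right, each instrument I remove (for example through a tighter lag range) lowers the 19 overidentifying restrictions by one.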
Thank you in advance for any advice or guidance you may be able to provide. I am very new to statistics and have spent a lot of time reading the Stata documentation and the empirical literature on how best to use GMM, but would love some clarification on the above.
Paper cited: Piper, A. (2023). What Does Dynamic Panel Analysis Tell Us About Life Satisfaction? Review of Income and Wealth. doi:10.1111/roiw.12567.
Many thanks,
Emma