mixed model with endogenous variable

Lucie Piaser

Join Date: Mar 2018
Posts: 9

06 Mar 2019, 03:53

Hello everyone,

I have multilevel data (individuals nested within municipalities). My database contains information on individuals and on the municipality where they live. The table below shows an example of the structure of my data:

individual	age	sex	fear	municipality	homicide rate	gini	security spending
1	20	male	0.4	1	8	0.2	1000
2	25	male	0.2	1	8	0.2	1000
3	50	female	0.8	2	12	0.5	500
4	89	male	0.8	3	21	0.4	1200
5	75	male	0.4	3	21	0.4	1200
6	12	female	0.2	3	21	0.4	1200
7	54	female	0.1	4	17	0.3	3000
8	33	female	0.5	4	17	0.3	3000
9	60	female	0.7	4	17	0.3	3000

Overall, there are 740 different municipalities in my dataset, with a minimum of 19 and a maximum of 1700 individuals per municipality.

My main goal is to estimate the effect of a municipality's income inequality on the fear of its residents. However, I suspect that the Gini coefficient is endogenous. As my data are hierarchical, I estimate my model using the mixed command (Stata 14) and a manual 2SLS procedure.

I first regress the Gini coefficient on the instrumental variables and the municipality-level controls, and then store the predicted Gini:

Code:

reg gini IV1 IV2 homicide_rate security_spending, cluster(municipality)

Code:

predict gini_est

My first question is: since my observations are at the individual level, the first-stage results are potentially biased, because some municipalities contribute 19 observations and others 1700. Is adding the cluster(municipality) option enough to correct this problem?

I then use the predicted Gini to estimate the mixed model as follows:
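
Because gini and the municipality controls are constant within a municipality, the individual-level first stage implicitly weights each municipality by its number of respondents; clustering changes the standard errors, not that weighting. A minimal sketch of one way to check how much this matters, assuming IV1 and IV2 are also measured at the municipality level (the post does not say), is to re-run the first stage on one row per municipality and compare the coefficients. The names one_per_muni and gini_est_eq are hypothetical.

```stata
* Assumes the individual-level dataset described in this post is in memory.
* Flag exactly one observation per municipality (hypothetical variable name).
egen byte one_per_muni = tag(municipality)

* First stage as posted: each municipality counts once per respondent.
regress gini IV1 IV2 homicide_rate security_spending, vce(cluster municipality)

* Same regression on one row per municipality: each municipality counts once.
regress gini IV1 IV2 homicide_rate security_spending if one_per_muni, vce(robust)

* predict fills in fitted values for every individual, not only the tagged rows.
predict gini_est_eq
```

If the two sets of first-stage coefficients are close, the unequal municipality sizes are probably not driving the result; if they differ, which weighting is appropriate depends on the population the estimate is meant to describe.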

Code:

mixed fear gini_est homicide_rate security_spending age sex || municipality: , vce(cluster municipality)

I added the vce(cluster municipality) option to obtain robust clustered standard errors.

My second question is: does it seem correct to estimate my model this way, or is there something I could improve?

Thank you very much,
Lucie

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

07 Mar 2019, 14:29

Doing instrumental variables manually can run into problems. Another option would be reghdfe, which allows for endogenous variables and multiple panel dimensions. However, it only does fixed effects. Alternatively, you could use SEM/GSEM to model this explicitly. Clustered standard errors fix problems with the standard errors but generally do not fix problems with the consistency of the betas. If you're really worried about the varying number of observations per municipality, you might consider a weighted estimator.
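
To illustrate the first point: in a manual second stage, the reported standard errors treat gini_est as data rather than as an estimate, and the first stage in post #1 leaves out age and sex even though the second stage includes them. A minimal sketch of a one-step benchmark, using Stata's built-in ivregress with the variable names from the original post, might look like the following. It is not a multilevel model: the random intercept is replaced by cluster-robust standard errors, so treat it as a point of comparison rather than a substitute for mixed.

```stata
* Assumes the dataset from post #1 is in memory, with sex stored as a
* numeric variable (as it must be for the mixed command posted there).
* One-step 2SLS: gini is instrumented by IV1 and IV2; all other regressors
* enter the first stage automatically.
ivregress 2sls fear homicide_rate security_spending age sex ///
    (gini = IV1 IV2), vce(cluster municipality)

* First-stage diagnostics for instrument strength.
estat firststage
```

Comparing the gini coefficient and its standard error here with the manual second stage gives a rough sense of how much the hand-rolled procedure is understating the uncertainty.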
Lucie Piaser

Join Date: Mar 2018

Posts: 9
#3

08 Mar 2019, 02:09

Thanks for your suggestions. GSEM was my first choice, but it often failed to converge, which is why I turned to 2SLS. I probably need to look into this possibility again. I had never heard of reghdfe, so I will take a look at it.