Hello Statalisters,
I am currently working on my bachelor's thesis in development economics, where I am investigating how the sub-indices of the Global Gender Gap Index are affected by the magnitude and structure of the female labor market. My sample consists of 82 countries over 11 years (2006-2016). At first we considered a fixed effects estimation, and our control variables were selected on the basis that fixed effects allows correlation between the regressors and time-constant unobserved variables.
The problem is that some sub-indices have a large proportion of values equal to 1 (meaning equality). As much as 10% of all observations are equal to 1 in one of the sub-indices. Therefore I tried to find a way to estimate the model with a logit link to provide a better fit. The two models estimated are the following (control variables excluded to keep it readable):
No major differences can be observed, other than LFP being significant in the GEE model and not in the FE model. My question is: are these results even comparable? My main worry is that, since I chose my control variables with a fixed effects estimation in mind, the GEE approach isn't reliable because of omitted time-constant variables. Is there any way to control for this in the GEE model, or should I simply go back to fixed effects and live with a worse model specification?
Regards,
Victor Fingal
Code:
xtreg HEASUB LFP LFPsq IND INDsq SER SERsq, fe vce(robust)
xtgee HEASUB LFP IND SER, vce(robust) link(logit) family(binomial) corr(exc)