Hi everyone,
I am working with 2007 EU-SILC microdata to compute several inequality measures on equivalized disposable household income (i.e. eq_disp_HHincome).
I am interested in computing standard errors for Gini coefficients.
I am aware that survey design information is missing in EU-SILC, and I have already applied Goedemé's do-files to reconstruct the survey design, to the extent possible, as suggested in Zardo Trindade and Goedemé (2016):
I have thus declared my survey design characteristics in Stata as follows:
I am using Stephen P. Jenkins's svylorenz command (September 2015) to compute Gini coefficients and their standard errors while taking the full survey design into account.
For a correct computation of the standard error I am also using the subpop option.
However, I run into problems when computing the Gini for Bulgaria, with Stata returning the following error message:
A simple tabulation shows that the flag variable used for the subpopulation (flagBG_2007_HHincome) is not empty:
I then started exploring the results for other countries in the dataset and noticed that the following two specifications of the command produce different Gini estimates:
My understanding was that the subpopulation option should only have an impact on the standard errors and not on the estimated coefficient.
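To illustrate that expectation, here is a minimal sketch using Stata's official svy: mean as a stand-in for the Gini (an assumption made only for illustration, since mean is not the estimator in question). It assumes the svyset declaration has already been run. With an official svy estimator, the two forms would normally be expected to return the same point estimate, while the standard errors may differ because the if form drops the other observations from the design information.

```stata
* Sketch only: mean stands in for the Gini to illustrate the expectation.
* Assumes svyset psu1 [pweight=RB050], strata(strata1) is already declared.

* Subpopulation form: all observations are kept for variance estimation
svy, subpop(flagBE_2007_HHincome): mean eq_disp_HHincome

* Restriction form: observations outside the subsample are dropped first
svy: mean eq_disp_HHincome if flagBE_2007_HHincome == 1
```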
Would you have any ideas on where I am going wrong?
Thanks a lot,
Luca
Zardo Trindade, L. and Goedemé, T. (2016) Notes on updating the EU-SILC UDB sample design variables 2012-2014, CSB Working Paper 16/02, Antwerp: Herman Deleeck Centre for Social Policy, University of Antwerp
svyset psu1 [pweight=RB050], strata(strata1)
Code:
svylorenz eq_disp_HHincome, subpop(flagBG_2007_HHincome)

Warning: eq_disp_HHincome has 3346 values < 0. Not used in calculations
Warning: eq_disp_HHincome has 1824 values = 0. Used in calculations
no observations in subpop() subpopulation
    subpop() = 1 indicates observation in subpopulation
    subpop() = 0 indicates observation not in subpopulation
r(461);
Code:
tab flagBG_2007_HHincome

flagBG_2007 |
  _HHincome |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |    586,464       97.99       97.99
          1 |     12,052        2.01      100.00
------------+-----------------------------------
      Total |    598,516      100.00

sum eq_disp_HHincome if flagBG_2007_HHincome == 1, de

              Equivalized Disposable HH income
-------------------------------------------------------------
      Percentiles      Smallest
 1%     61.97576        -182.02
 5%      338.025        -101.75
10%     533.4556        -101.75       Obs              12,052
25%     904.8649        -101.75       Sum of Wgt.      12,052

50%     1405.393                      Mean           1629.077
                        Largest       Std. Dev.      1472.137
75%     2018.672       40487.44
90%      2804.99       40487.44       Variance        2167186
95%     3429.388       40487.44       Skewness       12.05897
99%      5886.59       40487.44       Kurtosis       289.1333
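A further check that might help narrow this down is sketched below. The tabulation counts flagged rows in the raw data, whereas survey estimation omits observations with missing weight, PSU or strata values, and svylorenz reports that it does not use negative incomes. Whether any of this explains the error is only an assumption. The sketch counts how many flagged Bulgarian observations would remain once those cases are excluded.

```stata
* Sketch only: count flagged observations that have complete design
* information and a non-negative, non-missing income value.
count if flagBG_2007_HHincome == 1             ///
    & !missing(RB050, psu1, strata1)           ///
    & !missing(eq_disp_HHincome)               ///
    & eq_disp_HHincome >= 0
```

A count of zero would suggest the subpopulation is being emptied before estimation. A positive count would point elsewhere.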
Code:
svylorenz0 eq_disp_HHincome if flagBE_2007_HHincome == 1
svylorenz0 eq_disp_HHincome, subpop(flagBE_2007_HHincome)