  • Subpop issue with svylorenz in EU SILC

    Hi everyone,
    I am working with 2007 EU SILC microdata to compute several inequality measures on equivalized disposable household income (i.e. eq_disp_HHincome).
    I am interested in computing standard errors for Gini coefficients.
    I am aware that survey design information is missing in EU SILC and have already applied Goedemé's do-files to reconstruct, to the extent possible, the survey design, as suggested in:
    Zardo Trindade, L. and Goedemé, T. (2016) Notes on updating the EU-SILC UDB sample design variables 2012-2014, CSB Working Paper 16/02, Antwerp: Herman Deleeck Centre for Social Policy, University of Antwerp
    I have thus declared my survey design characteristics in Stata as follows:
    svyset psu1 [pweight=RB050], strata(strata1)
    I am using Jenkins's svylorenz command (Stephen P. Jenkins, September 2015) to compute Gini coefficients and their standard errors while taking the full survey design into account.
    For a correct computation of the standard errors I am also using the subpop option.
    However, I run into problems when computing the Gini for Bulgaria, with Stata returning the following error message:
    Code:
    svylorenz eq_disp_HHincome, subpop(flagBG_2007_HHincome)
     
    Warning: eq_disp_HHincome has 3346 values < 0. Not used in calculations
     
    Warning: eq_disp_HHincome has 1824 values = 0. Used in calculations
    no observations in subpop() subpopulation
    subpop() = 1 indicates observation in subpopulation
    subpop() = 0 indicates observation not in subpopulation
    r(461);
    A simple tabulation shows that the flag variable used for the subpopulation (flagBG_2007_HHincome) is not empty:
    Code:
    tab flagBG_2007_HHincome
    
    flagBG_2007 |
      _HHincome |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |    586,464       97.99       97.99
              1 |     12,052        2.01      100.00
    ------------+-----------------------------------
          Total |    598,516      100.00
    
    
    
    sum  eq_disp_HHincome if flagBG_2007_HHincome == 1, de
    
                  Equivalized Disposable HH income
    -------------------------------------------------------------
          Percentiles      Smallest
     1%     61.97576        -182.02
     5%      338.025        -101.75
    10%     533.4556        -101.75       Obs              12,052
    25%     904.8649        -101.75       Sum of Wgt.      12,052
    
    50%     1405.393                      Mean           1629.077
                            Largest       Std. Dev.      1472.137
    75%     2018.672       40487.44
    90%      2804.99       40487.44       Variance        2167186
    95%     3429.388       40487.44       Skewness       12.05897
    99%      5886.59       40487.44       Kurtosis       289.1333
    I then started exploring the results for other countries in the dataset and noticed that the following two specifications of the command produce different Gini estimates:
    Code:
    svylorenz0 eq_disp_HHincome if flagBE_2007_HHincome == 1
    svylorenz0 eq_disp_HHincome, subpop(flagBE_2007_HHincome)
    My understanding was that the subpop option should only affect the standard errors and not the point estimate.
    Would you have any ideas on where I am going wrong?
    Thanks a lot,
    Luca

  • #2
    To be honest, I have no idea what is going on. First, check you have the latest version (from SSC)

    Code:
    . which svylorenz
    d:\home\stephenj\ado\stbplus\s\svylorenz.ado
    *! 3.1.0 SPJ September 2015. Fixed bug in SE calculations for Lorenz and shares (thanks to Ben Jann) 
    *! 3.0.0 SPJ September 2006.  Added Generalized Lorenz estimation; fixed bug with subpop()
    *! version 2.1.0 Stephen P. Jenkins, Nov 2005
    *!   Variance estimation for Gini, income shares,
    *!   cumulative shares (Lorenz curve ordinates) and generalized Lorenz ordinates
    The following seems to work OK -- or at least as you expected it to:

    Code:
    . sysuse auto , clear
    (1978 Automobile Data)
    
    . svyset, srs
    
          pweight: <none>
              VCE: linearized
      Single unit: missing
         Strata 1: <one>
             SU 1: <observations>
            FPC 1: <zero>
    
    . svylorenz mpg, subpop(foreign) ngp(2)
    
    
    Quantile group shares, cumulative shares (Lorenz ordinates),
    generalized Lorenz ordinates, and Gini
     
    Number of strata =          1               Number of obs    =           74
    Number of PSUs   =         74               Population size  =        74.00
                                                Subpop. no. obs  =           22
                                                Subpop. size     =        22.00
                                                Design df        =           73
     
    ---------------------------------------------------------------------------
      Group  |             Linearized
      share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
    ---------+-----------------------------------------------------------------
        1    |   0.401835   0.017067   23.544    0.000       .368384    .435286
        2    |   0.598165   0.017067   35.048    0.000       .564714    .631616
    ---------+-----------------------------------------------------------------
      Cumul. |
      share  |
        1    |   0.401835   0.017067   23.544    0.000       .368384    .435286
        2    |   1.000000
    ---------+-----------------------------------------------------------------
      Gen.   |
      Lorenz |
        1    |      9.955      0.646   15.421    0.000         8.689     11.220
        2    |     24.773      1.387   17.867    0.000        22.055     27.490
    ---------+-----------------------------------------------------------------
      Gini   |  0.1433695  .022994    6.235    0.000        .0983021   .1884369
    ---------------------------------------------------------------------------
    
    . svylorenz mpg if foreign == 1, ngp(2)
    
    
    Quantile group shares, cumulative shares (Lorenz ordinates),
    generalized Lorenz ordinates, and Gini
     
    Number of strata =          1               Number of obs    =           22
    Number of PSUs   =         22               Population size  =        22.00
                                                Design df        =           21
     
    ---------------------------------------------------------------------------
      Group  |             Linearized
      share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
    ---------+-----------------------------------------------------------------
        1    |   0.401835   0.017350   23.160    0.000       .367829    .435841
        2    |   0.598165   0.017350   34.476    0.000       .564159    .632171
    ---------+-----------------------------------------------------------------
      Cumul. |
      share  |
        1    |   0.401835   0.017350   23.160    0.000       .367829    .435841
        2    |   1.000000
    ---------+-----------------------------------------------------------------
      Gen.   |
      Lorenz |
        1    |      9.955      0.656   15.169    0.000         8.668     11.241
        2    |     24.773      1.410   17.575    0.000        22.010     27.535
    ---------+-----------------------------------------------------------------
      Gini   |  0.1433695  .02184674    6.563    0.000      .1005507   .1861883
    ---------------------------------------------------------------------------
    Maybe there is something odd going on because of the negative and zero income values and how they interact with the country flag? What if you define a different country flag that identifies observations in a given country that have income values > 0?
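
    A minimal sketch of that alternative flag, assuming the variable names from the first post. The name flagBG_2007_pos is hypothetical and not part of the original data. It is coded 0/1 with no missing values, so that subpop() and if can be compared on exactly the same set of observations:

    ```stata
    * Hypothetical flag: 1 = Bulgaria 2007 with strictly positive income, 0 otherwise.
    * !missing() is needed because Stata treats missing as larger than any number.
    generate byte flagBG_2007_pos = (flagBG_2007_HHincome == 1 & ///
        eq_disp_HHincome > 0 & !missing(eq_disp_HHincome))

    * Check the coding before using the flag
    tabulate flagBG_2007_pos, missing

    * Compare the two ways of restricting the estimation to this group
    svylorenz eq_disp_HHincome, subpop(flagBG_2007_pos)
    svylorenz eq_disp_HHincome if flagBG_2007_pos == 1
    ```

    If the two Gini estimates still differ with this flag, the zero and negative values within the country itself are probably not the explanation.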



    • #3
      Thanks a lot for your reply. I uninstalled svylorenz and reinstalled it, but the problem persists.
      The output below suggests that I am using the latest version:
      Code:
      which svylorenz
      c:\ado\plus\s\svylorenz.ado
      *! 3.1.0 SPJ September 2015. Fixed bug in SE calculations for Lorenz and shares (thanks to Ben Jann)  
      *! 3.0.0 SPJ September 2006.  Added Generalized Lorenz estimation; fixed bug with subpop()
      *! version 2.1.0 Stephen P. Jenkins, Nov 2005
      *!   Variance estimation for Gini, income shares,
      *!   cumulative shares (Lorenz curve ordinates) and generalized Lorenz ordinates
      I have also tried a country flag variable restricted to strictly positive income values, and tried other countries.
      Below are some tabulations for France (all income values) that might help illustrate the problem further:
      Code:
      svyset psu1 [pweight=RB050], strata(strata1) //This refers to individuals
      
      gen flagFR_2007_HHincome = 0 if YEAR == 2007 & eq_disp_HHincome !=.
      (1,259,847 missing values generated)
      
      replace flagFR_2007_HHincome = 1 if YEAR == 2007 & COUNTRY == "FR" & eq_disp_HHincome !=.
      (25,907 real changes made)
      
      tab flagFR_2007_HHincome YEAR, m
      
      flagFR_200 |               YEAR
      7_HHincome |      2007       2012       2016 |     Total
      -----------+---------------------------------+----------
               0 |   572,609          0          0 |   572,609 
               1 |    25,907          0          0 |    25,907 
               . |       534    612,946    646,367 | 1,259,847 
      -----------+---------------------------------+----------
           Total |   599,050    612,946    646,367 | 1,858,363 
      
      
                 |       flagFR_2007_HHincome
         COUNTRY |         0          1          . |     Total
      -----------+---------------------------------+----------
              AT |    16,684          0     26,959 |    43,643 
              BE |    15,493          0     27,724 |    43,217 
              BG |    12,052          0     32,777 |    44,829 
              CH |    15,951          0     35,344 |    51,295 
              CY |    10,630          0     24,615 |    35,245 
              CZ |    23,059          0     39,202 |    62,261 
              DE |    31,709          0     54,637 |    86,346 
              DK |    14,887          0     26,993 |    41,880 
              EE |    14,372          0     29,450 |    43,822 
              EL |    14,793          0     57,963 |    72,756 
              ES |    34,586          0     70,002 |   104,588 
              FI |    27,454          0     51,353 |    78,807 
              FR |         0     25,907     55,181 |    81,088 
              HR |         0          0     34,881 |    34,881 
              HU |    22,297          0     47,236 |    69,533 
              IE |    13,691          0     25,077 |    38,768 
              IS |     8,651          0      8,985 |    17,636 
              IT |    52,772          0     95,681 |   148,453 
              LT |    12,777          0     23,582 |    36,359 
              LU |    10,419          0     26,316 |    36,735 
              LV |    11,209          0     29,068 |    40,277 
              MT |    10,249          0     22,672 |    32,921 
              NL |    25,905          0     54,520 |    80,425 
              NO |    15,140          0     32,432 |    47,572 
              PL |    42,852          0     69,732 |   112,584 
              PT |    11,691          0     42,530 |    54,221 
              RO |    19,790          0     35,250 |    55,040 
              RS |         0          0     17,720 |    17,720 
              SE |    18,126          0     30,663 |    48,789 
              SI |    28,570          0     53,701 |    82,271 
              SK |    14,858          0     31,976 |    46,834 
              UK |    21,942          0     45,625 |    67,567 
      -----------+---------------------------------+----------
           Total |   572,609     25,907  1,259,847 | 1,858,363 
      
      
      sum eq_disp_HHincome if flagFR_2007_HHincome ==1, d
      
                    Equivalized Disposable HH income
      -------------------------------------------------------------
            Percentiles      Smallest
       1%       4632.8       -39686.5
       5%         7752       -39686.5
      10%       9289.5       -39686.5       Obs              25,907
      25%        12424      -18351.54       Sum of Wgt.      25,907
      
      50%        16843                      Mean            18946.7
                              Largest       Std. Dev.      11838.67
      75%     22732.67         179950
      90%     30384.67       520958.5       Variance       1.40e+08
      95%      36151.2       520958.5       Skewness       11.03533
      99%     54942.78       520958.5       Kurtosis       390.9694
      
      
      svylorenz eq_disp_HHincome, subpop(flagFR_2007_HHincome)
       
      Warning: eq_disp_HHincome has 3346 values < 0. Not used in calculations
       
      Warning: eq_disp_HHincome has 1824 values = 0. Used in calculations
      
      
      Quantile group shares, cumulative shares (Lorenz ordinates), 
      generalized Lorenz ordinates, and Gini
       
      Number of strata =         22               Number of obs    =        25868
      Number of PSUs   =       9011               Population size  =  59800320.81
                                                  Subpop. no. obs  =        25868
                                                  Subpop. size     =  59800320.81
                                                  Design df        =         8989
       
      ---------------------------------------------------------------------------
        Group  |             Linearized
        share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
      ---------+-----------------------------------------------------------------
          1    |   0.000923   0.000605    1.524    0.127     -.0002637   .0021097
          2    |   0.010777   0.001227    8.782    0.000       .008372   .0131827
          3    |   0.054891   0.000698   78.669    0.000      .0535233   .0562585
          4    |   0.102495   0.001010  101.476    0.000       .100516    .104475
          5    |   0.134011   0.001667   80.388    0.000       .130744    .137279
          6    |   0.151368   0.001953   77.488    0.000        .14754    .155197
          7    |   0.148233   0.001527   97.057    0.000        .14524    .151226
          8    |   0.136214   0.000821  165.919    0.000       .134605    .137823
          9    |   0.129085   0.000988  130.600    0.000       .127148    .131022
          10   |   0.132002   0.004145   31.846    0.000       .123878    .140126
      ---------+-----------------------------------------------------------------
        Cumul. |
        share  |
          1    |   0.000923   0.000605    1.524    0.127     -.0002637   .0021097
          2    |   0.011700   0.001825    6.410    0.000      .0081226   .0152782
          3    |   0.066591   0.002378   28.005    0.000      .0619309   .0712517
          4    |   0.169086   0.002028   83.390    0.000       .165112    .173061
          5    |   0.303098   0.001851  163.757    0.000        .29947    .306725
          6    |   0.454466   0.002910  156.192    0.000       .448763    .460169
          7    |   0.602699   0.004097  147.095    0.000       .594668     .61073
          8    |   0.738913   0.004540  162.744    0.000       .730014    .747812
          9    |   0.867998   0.004145  209.410    0.000       .859874    .876122
          10   |   1.000000
      ---------+-----------------------------------------------------------------
        Gen.   |
        Lorenz |
          1    |     17.005     11.170    1.522    0.128        -4.888     38.898
          2    |    215.561     33.862    6.366    0.000       149.191    281.930
          3    |   1226.838     45.265   27.103    0.000      1138.120   1315.557
          4    |   3115.149     41.809   74.510    0.000      3033.205   3197.092
          5    |   5584.091     41.466  134.666    0.000      5502.819   5665.363
          6    |   8372.811     59.367  141.034    0.000      8256.453   8489.169
          7    |  11103.769     81.609  136.061    0.000     10943.818  11263.719
          8    |  13613.294     95.311  142.830    0.000     13426.488  13800.100
          9    |  15991.478    101.790  157.103    0.000     15791.973  16190.982
          10   |  18423.409    127.389  144.623    0.000     18173.730  18673.087
      ---------+-----------------------------------------------------------------
        Gini   |  3.7150118  .22427759   16.564    0.000      3.275436   4.154588
      ---------------------------------------------------------------------------
      
      
      svylorenz eq_disp_HHincome if flagFR_2007_HHincome == 1
       
      Warning: eq_disp_HHincome has 39 values < 0. Not used in calculations
      
      
      Quantile group shares, cumulative shares (Lorenz ordinates), 
      generalized Lorenz ordinates, and Gini
       
      Number of strata =         22               Number of obs    =        25868
      Number of PSUs   =       9011               Population size  =  59800320.81
                                                  Design df        =         8989
       
      ---------------------------------------------------------------------------
        Group  |             Linearized
        share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
      ---------+-----------------------------------------------------------------
          1    |   0.039145   0.000562   69.615    0.000      .0380428    .040247
          2    |   0.055923   0.000487  114.943    0.000      .0549698    .056877
          3    |   0.066275   0.000457  145.003    0.000      .0653797   .0671713
          4    |   0.075325   0.000463  162.577    0.000      .0744174   .0762336
          5    |   0.084344   0.000486  173.721    0.000      .0833924   .0852956
          6    |   0.093930   0.000490  191.544    0.000      .0929688    .094891
          7    |   0.105121   0.000541  194.378    0.000       .104061    .106181
          8    |   0.119221   0.000608  196.237    0.000        .11803    .120412
          9    |   0.141988   0.000798  177.843    0.000       .140423    .143553
          10   |   0.218727   0.002932   74.597    0.000        .21298    .224473
      ---------+-----------------------------------------------------------------
        Cumul. |
        share  |
          1    |   0.039145   0.000562   69.615    0.000      .0380428    .040247
          2    |   0.095068   0.000948  100.250    0.000      .0932097    .096927
          3    |   0.161344   0.001302  123.920    0.000       .158792    .163896
          4    |   0.236669   0.001641  144.229    0.000       .233453    .239885
          5    |   0.321013   0.001967  163.226    0.000       .317159    .324868
          6    |   0.414943   0.002285  181.569    0.000       .410464    .419422
          7    |   0.520064   0.002589  200.905    0.000       .514991    .525138
          8    |   0.639285   0.002828  226.080    0.000       .633743    .644827
          9    |   0.781273   0.002932  266.455    0.000       .775527     .78702
          10   |   1.000000
      ---------+-----------------------------------------------------------------
        Gen.   |
        Lorenz |
          1    |    721.183     11.085   65.061    0.000       699.457    742.908
          2    |   1751.483     19.306   90.724    0.000      1713.645   1789.321
          3    |   2972.503     27.009  110.055    0.000      2919.566   3025.441
          4    |   4360.256     34.487  126.433    0.000      4292.663   4427.849
          5    |   5914.160     42.829  138.087    0.000      5830.216   5998.104
          6    |   7644.669     50.539  151.263    0.000      7545.615   7743.723
          7    |   9581.359     59.807  160.206    0.000      9464.140   9698.578
          8    |  11777.812     70.454  167.171    0.000     11639.725  11915.898
          9    |  14393.718     84.410  170.521    0.000     14228.276  14559.159
          10   |  18423.409    127.389  144.623    0.000     18173.730  18673.087
      ---------+-----------------------------------------------------------------
        Gini   |  0.2641288  .00336974   78.383    0.000      .2575242   .2707333
      ---------------------------------------------------------------------------



      • #4
        As I said, I really don't know what is going on here. My suspicion remains that it is something to do with differential treatment of negative and zero values across different syntax statements. What's particularly strange now in your results for FR is
        Code:
         
         svylorenz eq_disp_HHincome, subpop(flagFR_2007_HHincome)
        yields a Gini of 3.7150118 whereas
        Code:
         
         svylorenz eq_disp_HHincome if flagFR_2007_HHincome == 1
        yields a Gini of 0.2641288. And we know you can't get a Gini of more than 1 unless negative values are included. Also notice that the two calls to svylorenz lead to different messages back regarding the numbers of zeros and negatives.
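
        As a small worked illustration of that last point, using the unweighted mean-difference form of the Gini (a textbook formula, not necessarily the estimator svylorenz implements) and three made-up incomes:

        ```latex
        G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{2 n^{2} \bar{x}}

        % Made-up incomes x = (-4, 1, 5): n = 3, \bar{x} = 2/3
        \sum_{i}\sum_{j} |x_i - x_j| = 2\,(5 + 9 + 4) = 36
        \qquad\Longrightarrow\qquad
        G = \frac{36}{2 \cdot 9 \cdot \tfrac{2}{3}} = 3

        % With all x_i \ge 0 and \bar{x} > 0 the same formula is bounded:
        0 \le G \le \frac{n-1}{n} < 1
        ```

        Negative incomes pull the mean in the denominator towards zero while the absolute differences in the numerator stay large, which is how values above 1 arise.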

        What do you get by running the following?

        Code:
        ineqdeco eq_disp_HHincome  if flagFR_2007_HHincome == 1
        ineqdec0 eq_disp_HHincome  if flagFR_2007_HHincome == 1
        
        ineqdeco eq_disp_HHincome  [aw = RB050] if flagFR_2007_HHincome == 1
        ineqdec0 eq_disp_HHincome  [aw = RB050] if flagFR_2007_HHincome == 1
        Also what happens if you ignore your complex survey design and instead start with (say)
        Code:
         
         svyset , srs
        and then re-run.
        Or e.g.
        Code:
         
         svyset psu1 [pweight=RB050]
        Etc. I.e. is it something weird about the relationship between zeros and negatives across PSUs and strata?
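
        One way to look at that relationship directly is sketched below. It assumes the variable names used earlier in the thread; inc_sign is a hypothetical helper variable introduced only for this check:

        ```stata
        * Hypothetical helper: -1 for negative, 0 for zero, 1 for positive income
        generate byte inc_sign = sign(eq_disp_HHincome) if !missing(eq_disp_HHincome)

        * Distribution of negative / zero / positive incomes across strata (France)
        tabulate strata1 inc_sign if flagFR_2007_HHincome == 1

        * List any strata left with a single sampling unit
        svyset psu1 [pweight=RB050], strata(strata1)
        svydescribe if flagFR_2007_HHincome == 1, single
        ```

        This only describes the data; it does not by itself show which of these features, if any, svylorenz is sensitive to.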


        I've never struck the problems you are observing in my testing of svylorenz -- that's simply an observation -- which explains why I am puzzled. Please note that other commitments mean that I may not be able to get back to this issue quickly.



        • #5
          PS A more fundamental question: why are you treating each country as a sub-population? In EU-SILC data, each country's data is an independent sample. Surely using if country == XX is fine in this context, where for convenience you have pooled the data for all the countries?
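
          A rough sketch of that approach, assuming the YEAR and COUNTRY variables shown in the tabulations earlier in the thread (COUNTRY being a string variable) and the survey design already declared with svyset:

          ```stata
          * Collect the country codes that have 2007 income data
          levelsof COUNTRY if YEAR == 2007 & !missing(eq_disp_HHincome), local(countries)

          * Run svylorenz separately for each country, restricting with if
          foreach c of local countries {
              display as text _newline "Country: `c'"
              svylorenz eq_disp_HHincome if YEAR == 2007 & COUNTRY == "`c'"
          }
          ```

          Each call then uses only that country's observations, PSUs and strata, which matches the idea of each country being its own independent sample.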



          • #6
            Thanks a lot for all your timely replies.
            You make a very good point about using if instead of subpop, which is definitely not needed here.
            When using if, the results are in line with expectations.
            However, subpop and if keep behaving in two very different ways.
            As requested, the results are below:

            Code:
            svyset psu1 [pweight=RB050], strata(strata1)
            
                  pweight: RB050
                      VCE: linearized
              Single unit: missing
                 Strata 1: strata1
                     SU 1: psu1
                    FPC 1: <zero>
            
            ineqdeco eq_disp_HHincome  if flagFR_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 39 values < 0. Not used in calculations
             
            Percentile ratios
            
            ----------------------------------------------------------
              All obs |    p90/p10     p90/p50     p10/p50     p75/p25
            ----------+-----------------------------------------------
                      |      3.262       1.803       0.553       1.827
            ----------------------------------------------------------
              
            Generalized Entropy indices GE(a), where a = income difference
             sensitivity parameter, and Gini coefficient
            
            ----------------------------------------------------------------------
              All obs |     GE(-1)       GE(0)       GE(1)       GE(2)        Gini
            ----------+-----------------------------------------------------------
                      |    0.16079     0.12290     0.13067     0.19311     0.26833
            ----------------------------------------------------------------------
               
            Atkinson indices, A(e), where e > 0 is the inequality aversion parameter
            
            ----------------------------------------------
              All obs |     A(0.5)        A(1)        A(2)
            ----------+-----------------------------------
                      |    0.06062     0.11564     0.24333
            ----------------------------------------------
            
            . ineqdec0 eq_disp_HHincome  if flagFR_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 39 values < 0. Used in calculations
             
            Percentile ratios
            
            ----------------------------------------------------------
              All obs |    p90/p10     p90/p50     p10/p50     p75/p25
            ----------+-----------------------------------------------
                      |      3.271       1.804       0.552       1.830
            ----------------------------------------------------------
              
            Generalized Entropy index GE(2), and Gini coefficient
            
            ----------------------------------
              All obs |      GE(2)        Gini
            ----------+-----------------------
                      |    0.19521     0.27008
            ----------------------------------
            
            ineqdeco eq_disp_HHincome  [aw = RB050] if flagFR_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 39 values < 0. Not used in calculations
             
            Percentile ratios
            
            ----------------------------------------------------------
              All obs |    p90/p10     p90/p50     p10/p50     p75/p25
            ----------+-----------------------------------------------
                      |      3.195       1.784       0.558       1.788
            ----------------------------------------------------------
              
            Generalized Entropy indices GE(a), where a = income difference
             sensitivity parameter, and Gini coefficient
            
            ----------------------------------------------------------------------
              All obs |     GE(-1)       GE(0)       GE(1)       GE(2)        Gini
            ----------+-----------------------------------------------------------
                      |    0.15123     0.11891     0.12599     0.17923     0.26413
            ----------------------------------------------------------------------
               
            Atkinson indices, A(e), where e > 0 is the inequality aversion parameter
            
            ----------------------------------------------
              All obs |     A(0.5)        A(1)        A(2)
            ----------+-----------------------------------
                      |    0.05870     0.11212     0.23223
            ----------------------------------------------
            
            ineqdec0 eq_disp_HHincome  [aw = RB050] if flagFR_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 39 values < 0. Used in calculations
             
            Percentile ratios
            
            ----------------------------------------------------------
              All obs |    p90/p10     p90/p50     p10/p50     p75/p25
            ----------+-----------------------------------------------
                      |      3.220       1.786       0.555       1.790
            ----------------------------------------------------------
              
            Generalized Entropy index GE(2), and Gini coefficient
            
            ----------------------------------
              All obs |      GE(2)        Gini
            ----------+-----------------------
                      |    0.18202     0.26649
            ----------------------------------
            As far as I can tell they seem okay. The results below explore what happens to svylorenz when using a simple random sampling design (svyset, srs):

            Code:
            .  svyset , srs
            
                  pweight: <none>
                      VCE: linearized
              Single unit: missing
                 Strata 1: <one>
                     SU 1: <observations>
                    FPC 1: <zero>
            
            . svylorenz eq_disp_HHincome if flagFR_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 39 values < 0. Not used in calculations
            
            
            Quantile group shares, cumulative shares (Lorenz ordinates), 
            generalized Lorenz ordinates, and Gini
             
            Number of strata =          1               Number of obs    =        25868
            Number of PSUs   =      25868               Population size  =     25868.00
                                                        Design df        =        25867
             
            ---------------------------------------------------------------------------
              Group  |             Linearized
              share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
            ---------+-----------------------------------------------------------------
                1    |   0.038641   0.000265  145.825    0.000      .0381219   .0391606
                2    |   0.054983   0.000249  221.226    0.000      .0544958     .05547
                3    |   0.065583   0.000250  262.070    0.000      .0650924   .0660733
                4    |   0.074733   0.000264  283.361    0.000      .0742159   .0752498
                5    |   0.084097   0.000281  299.740    0.000      .0835468   .0846466
                6    |   0.093531   0.000292  320.685    0.000      .0929593   .0941026
                7    |   0.105070   0.000321  327.255    0.000       .104441    .105699
                8    |   0.120127   0.000372  323.048    0.000       .119398    .120856
                9    |   0.143829   0.000461  311.806    0.000       .142925    .144733
                10   |   0.219407   0.001933  113.523    0.000       .215619    .223195
            ---------+-----------------------------------------------------------------
              Cumul. |
              share  |
                1    |   0.038641   0.000265  145.825    0.000      .0381219   .0391606
                2    |   0.093624   0.000465  201.524    0.000      .0927136   .0945347
                3    |   0.159207   0.000665  239.296    0.000       .157903    .160511
                4    |   0.233940   0.000873  267.823    0.000       .232228    .235652
                5    |   0.318037   0.001090  291.905    0.000       .315901    .320172
                6    |   0.411567   0.001307  314.841    0.000       .409005     .41413
                7    |   0.516637   0.001528  338.159    0.000       .513643    .519632
                8    |   0.636764   0.001741  365.832    0.000       .633353    .640176
                9    |   0.780593   0.001933  403.885    0.000       .776805    .784381
                10   |   1.000000
            ---------+-----------------------------------------------------------------
              Gen.   |
              Lorenz |
                1    |    733.603      5.008  146.494    0.000       723.788    743.418
                2    |   1777.450      8.689  204.570    0.000      1760.420   1794.480
                3    |   3022.537     12.272  246.298    0.000      2998.485   3046.589
                4    |   4441.336     16.092  275.989    0.000      4409.796   4472.877
                5    |   6037.907     20.265  297.952    0.000      5998.189   6077.625
                6    |   7813.588     24.498  318.946    0.000      7765.572   7861.604
                7    |   9808.336     29.520  332.258    0.000      9750.478   9866.195
                8    |  12088.937     35.811  337.574    0.000     12018.749  12159.126
                9    |  14819.523     43.734  338.853    0.000     14733.806  14905.241
                10   |  18984.950     73.360  258.793    0.000     18841.167  19128.732
            ---------+-----------------------------------------------------------------
              Gini   |  0.2683277  .00201945  132.871    0.000      .2643696   .2722857
            ---------------------------------------------------------------------------
            
            . svylorenz eq_disp_HHincome, subpop(flagFR_2007_HHincome)
             
            Warning: eq_disp_HHincome has 3346 values < 0. Not used in calculations
             
            Warning: eq_disp_HHincome has 1824 values = 0. Used in calculations
            
            
            Quantile group shares, cumulative shares (Lorenz ordinates), 
            generalized Lorenz ordinates, and Gini
             
            Number of strata =          1               Number of obs    =       597461
            Number of PSUs   =     597461               Population size  =    597461.00
                                                        Subpop. no. obs  =        25868
                                                        Subpop. size     =     25868.00
                                                        Design df        =       597460
             
            ---------------------------------------------------------------------------
              Group  |             Linearized
              share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
            ---------+-----------------------------------------------------------------
                1    |   0.000407   0.000107    3.808    0.000      .0001975   .0006164
                2    |   0.002329   0.000221   10.542    0.000       .001896    .002762
                3    |   0.010262   0.000319   32.148    0.000      .0096367    .010888
                4    |   0.047340   0.000325  145.620    0.000      .0467033   .0479776
                5    |   0.106103   0.000417  254.647    0.000       .105286     .10692
                6    |   0.166584   0.000656  254.004    0.000       .165298    .167869
                7    |   0.192329   0.000767  250.814    0.000       .190826    .193832
                8    |   0.182106   0.000676  269.424    0.000       .180781     .18343
                9    |   0.164192   0.000593  277.039    0.000        .16303    .165353
                10   |   0.128348   0.002078   61.770    0.000       .124276    .132421
            ---------+-----------------------------------------------------------------
              Cumul. |
              share  |
                1    |   0.000407   0.000107    3.808    0.000      .0001975   .0006164
                2    |   0.002736   0.000326    8.391    0.000      .0020969    .003375
                3    |   0.012998   0.000640   20.319    0.000      .0117445   .0142521
                4    |   0.060339   0.000914   66.044    0.000      .0585481   .0621294
                5    |   0.166442   0.001071  155.390    0.000       .164342    .168541
                6    |   0.333025   0.001215  274.110    0.000       .330644    .335407
                7    |   0.525354   0.001545  340.122    0.000       .522327    .528382
                8    |   0.707460   0.001899  372.460    0.000       .703737    .711183
                9    |   0.871652   0.002078  419.499    0.000       .867579    .875724
                10   |   1.000000
            ---------+-----------------------------------------------------------------
              Gen.   |
              Lorenz |
                1    |      7.726      2.029    3.808    0.000         3.749     11.702
                2    |     51.942      6.192    8.388    0.000        39.805     64.079
                3    |    246.772     12.153   20.306    0.000       222.953    270.591
                4    |   1145.528     17.301   66.213    0.000      1111.619   1179.436
                5    |   3159.886     20.120  157.050    0.000      3120.451   3199.321
                6    |   6322.471     22.668  278.913    0.000      6278.042   6366.900
                7    |   9973.826     30.065  331.746    0.000      9914.901  10032.752
                8    |  13431.091     40.566  331.096    0.000     13351.584  13510.598
                9    |  16548.260     50.711  326.328    0.000     16448.870  16647.651
                10   |  18984.950     73.358  258.798    0.000     18841.170  19128.729
            ---------+-----------------------------------------------------------------
              Gini   |  8.6508827  .31082878   27.832    0.000      8.041669   9.260096
            ---------------------------------------------------------------------------
            
            . svylorenz eq_disp_HHincome if flagBG_2007_HHincome == 1
             
            Warning: eq_disp_HHincome has 5 values < 0. Not used in calculations
            
            
            Quantile group shares, cumulative shares (Lorenz ordinates), 
            generalized Lorenz ordinates, and Gini
             
            Number of strata =          1               Number of obs    =        12047
            Number of PSUs   =      12047               Population size  =     12047.00
                                                        Design df        =        12046
             
            ---------------------------------------------------------------------------
              Group  |             Linearized
              share  |   Estimate   Std. Err.     z      P>|z|     [95% Conf. Interval]
            ---------+-----------------------------------------------------------------
                1    |   0.019843   0.000434   45.699    0.000      .0189916   .0206937
                2    |   0.041841   0.000487   85.870    0.000      .0408863   .0427964
                3    |   0.055653   0.000496  112.286    0.000      .0546816   .0566245
                4    |   0.067542   0.000537  125.888    0.000      .0664903   .0685934
                5    |   0.079860   0.000599  133.218    0.000      .0786852   .0810351
                6    |   0.092521   0.000649  142.497    0.000      .0912484   .0937936
                7    |   0.106205   0.000722  147.082    0.000        .10479     .10762
                8    |   0.124701   0.000851  146.619    0.000       .123034    .126368
                9    |   0.153029   0.001010  151.452    0.000       .151048    .155009
                10   |   0.258805   0.004464   57.981    0.000       .250056    .267553
            ---------+-----------------------------------------------------------------
              Cumul. |
              share  |
                1    |   0.019843   0.000434   45.699    0.000      .0189916   .0206937
                2    |   0.061684   0.000839   73.518    0.000      .0600396   .0633285
                3    |   0.117337   0.001231   95.308    0.000       .114924     .11975
                4    |   0.184879   0.001652  111.931    0.000       .181642    .188116
                5    |   0.264739   0.002123  124.697    0.000       .260578      .2689
                6    |   0.357260   0.002630  135.819    0.000       .352105    .362416
                7    |   0.463465   0.003187  145.405    0.000       .457218    .469712
                8    |   0.588166   0.003792  155.115    0.000       .580735    .595598
                9    |   0.741195   0.004464  166.054    0.000       .732447    .749944
                10   |   1.000000
            ---------+-----------------------------------------------------------------
              Gen.   |
              Lorenz |
                1    |     32.340      0.727   44.488    0.000        30.915     33.764
                2    |    100.532      1.405   71.550    0.000        97.779    103.286
                3    |    191.236      2.024   94.479    0.000       187.268    195.203
                4    |    301.315      2.654  113.529    0.000       296.113    306.517
                5    |    431.471      3.351  128.748    0.000       424.902    438.039
                6    |    582.261      4.036  144.277    0.000       574.351    590.171
                7    |    755.354      4.784  157.876    0.000       745.976    764.731
                8    |    958.591      5.765  166.266    0.000       947.291    969.891
                9    |   1207.997      6.888  175.375    0.000      1194.497   1221.498
                10   |   1629.796     13.411  121.523    0.000      1603.511   1656.082
            ---------+-----------------------------------------------------------------
              Gini   |  0.3488819  .00424563   82.174    0.000      .3405606   .3572032
            ---------------------------------------------------------------------------
            
            . svylorenz eq_disp_HHincome, subpop(flagBG_2007_HHincome)
             
            Warning: eq_disp_HHincome has 3346 values < 0. Not used in calculations
             
            Warning: eq_disp_HHincome has 1824 values = 0. Used in calculations
            no observations in subpop() subpopulation
            subpop() = 1 indicates observation in subpopulation
            subpop() = 0 indicates observation not in subpopulation
            r(461);
            It seems that the problem with the if qualifier and the subpop() option persists.



            • #7
              Thanks, but you haven't yet shown that there is a "problem" in the sense of a program bug. You've shown that you're getting some results that you don't expect. There might be a bug, of course, but it hasn't been identified yet (and naturally I hope it doesn't exist!).

              Going forward, I think you should work only with the data for one country in any further bug-hunting checks. So, e.g. for France, work with a dataset in which you keep the observations for France only. Any code in which you 'select' a country via the subpop() option is going to divert you. That is not the right way to go about checking potential bugs. Currently I am of the view that, if there is a potential issue, it's to do with different samples being used to estimate the Gini.
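
              A minimal sketch of that set-up, assuming a hypothetical string variable country that holds the country code (your file may identify countries differently). The svyset line repeats the declaration quoted in the first post:

```stata
* Assumption: `country' is a hypothetical string variable with the country code;
* substitute whatever identifies the country in your file.
keep if country == "FR"

* Re-declare the survey design on the single-country data (variables as in post #1).
svyset psu1 [pweight = RB050], strata(strata1)

* No subpop() and no if qualifier: the data in memory are now the French sample only.
svylorenz eq_disp_HHincome
```

              Working from a saved copy of the single-country data keeps every later check on the same set of observations.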

              Put differently, I think I see in your output that (i) svylorenz in combination with a (pseudo-code) if country == "xx" qualifier and (ii) my other programs for estimating the Gini are giving the same Gini point estimate when applied to the same sample of observations (and the same weight is used).

              Next, investigate for your chosen country the overlap between missing and valid values for all the relevant variables such as the weight, PSU, and strata, and for income you also need to distinguish between negative/zero/positive/missing values. [You also need to be clearer about whether each country flag variable is selecting all observations for a given country, or all observations in that country with non-missing income values (or something else).] These various cross-tabulations can be used to define a series of "test" samples for a given country. Application of the same estimation command to each of the different samples may help you identify whether there really is a 'problem' or help explain what is unexpected.
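
              One possible way to run these checks is sketched below. It assumes the data in memory already hold a single country (France here), and the helper variables inc_status, miss_design and test1 are names invented for illustration:

```stata
* Assumption: the data in memory hold one country only (France in this sketch).

* Display the survey design currently declared (svyset with no arguments).
svyset

* Classify income as negative / zero / positive / missing.
generate byte inc_status = cond(missing(eq_disp_HHincome), 4, ///
    cond(eq_disp_HHincome < 0, 1, cond(eq_disp_HHincome == 0, 2, 3)))
label define inc_status 1 "negative" 2 "zero" 3 "positive" 4 "missing"
label values inc_status inc_status

* Flag observations with any missing design information.
generate byte miss_design = missing(RB050) | missing(psu1) | missing(strata1)

* Cross-tabulate against the country flag; the missing option keeps
* missing values of the flag visible in the table.
tabulate inc_status flagFR_2007_HHincome, missing
tabulate miss_design flagFR_2007_HHincome, missing

* One candidate "test" sample: zero or positive income, complete design info.
generate byte test1 = inrange(inc_status, 2, 3) & !miss_design
svylorenz eq_disp_HHincome if test1 == 1
```

              Further test samples can be defined the same way (for example, positive incomes only) and the same command re-run on each, so that any change in the estimates can be traced to a specific group of observations.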

              The subpop() option is intended for providing (in your context) estimates for some within-country group, e.g. women, or some particular region, or age group. [To repeat: the option is not for selecting a country with an independent sample from a dataset in which data from many countries each with an independent sample have been pooled for convenience.]
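
              A sketch of that intended use, assuming the data in memory hold one country only and that female is a hypothetical 0/1 indicator you have created yourself (it is not a variable from this thread):

```stata
* Assumptions: single-country data in memory; `female' is a hypothetical
* indicator coded 1 = in the subpopulation, 0 = not in the subpopulation.
assert inlist(female, 0, 1)

* Survey design as declared in post #1.
svyset psu1 [pweight = RB050], strata(strata1)

* Estimates for women, with standard errors that use the full national sample design.
svylorenz eq_disp_HHincome, subpop(female)
```

              The assert line stops the run early if the indicator contains anything other than 0 or 1, which is the coding that the error message shown earlier in the thread describes.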

              When using svylorenz to estimate a Gini, I recommend you use the "ngp(2)" option in order to get more concise output. Also, when seeking clues, use the knowledge that my various programs will provide the same estimate for the Gini if applied to the same sample of observations (and the same weight is used).
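
              A sketch of such a comparison follows. It assumes single-country data in memory and uses ineqdec0 (from SSC) as one example of another program; that choice, the aweight specification and the helper variable same_sample are assumptions made for illustration, so check the help file before relying on them:

```stata
* Assumption: single-country data in memory, design declared as in post #1.
svyset psu1 [pweight = RB050], strata(strata1)

* Fix one common sample: non-negative, non-missing income and complete
* weight and design information. `same_sample' is a name invented here.
generate byte same_sample = eq_disp_HHincome >= 0 & ///
    !missing(eq_disp_HHincome, RB050, psu1, strata1)

* Concise svylorenz output: two quantile groups plus the Gini.
svylorenz eq_disp_HHincome if same_sample == 1, ngp(2)

* Assumption: ineqdec0 is installed (ssc install ineqdeco) and accepts aweights.
* It retains zero incomes, so the samples match once negatives are excluded.
ineqdec0 eq_disp_HHincome [aweight = RB050] if same_sample == 1
display "Gini from ineqdec0 = " r(gini)
```

              If the two point estimates differ, the first thing to compare is the number of observations each command reports using.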
