  • What's the rationale behind egen_inequal excluding zeros?

    I'm using the user-written egen_inequal package to compute Gini coefficients, like so:
    Code:
     egen land_gini = gini(owned_hectares), by(village_2003)
    It automatically excludes observations with 0 hectares and computes the Gini within the group of non-zero observations.

    I fail to see why this would *ever* be desirable, and in the case of land holdings in hectares in particular I am convinced that the zeros should be included.

    Why does it behave this way, and how can I change that behavior in egen_inequal?

    PS: If Zurab Sajaia or @Michael Lokshin don't want to work on the code anymore, I am also happy to dive into the ado file.

  • #2
    You can view the source of the .ado file by typing: viewsource _ggini.ado
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Hi Simon,
      I think the main reason is that most inequality measures are not designed to handle zero or negative values, particularly when the data contain a large number of zeros.
      If you decide to go through the Gini ado file, one option that can handle negatives and zeros is the covariance formula for the Gini.
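      As a sketch of what the covariance formula usually refers to (this is the standard textbook form, not something taken from the egen_inequal source; the symbols y, F and \bar{y} are notation chosen here), the Gini can be written in terms of the covariance between the variable and its fractional rank:
      Code:
       G = \frac{2\,\operatorname{cov}\bigl(y, F(y)\bigr)}{\bar{y}}
      Here F(y) is the cumulative distribution of y (rank divided by N) and \bar{y} is the mean. Zeros receive a rank like any other value, so they enter the calculation. The formula remains computable with negative values provided the mean is positive, though the result is then no longer bounded by 1.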
      Another option is "rifvar", an extension to egen that you can install using "ssc install oaxaca_rif".
      You could do something like:
      Code:
       
       egen land_gini_rif = rifvar(owned_hectares), by(village_2003) gini
       egen land_gini = mean(land_gini_rif), by(village_2003)
      HTH


      • #4
        -ineqdec0- (SSC) estimates Gini coefficients and allows for zero or negative values. [Cf. -ineqdeco- (SSC), which also calculates Gini coefficients and other indices but works only with positive values.] Combine -ineqdec0- with -statsby-, something like:

        Code:
        statsby gini = r(gini), by(village_2003) saving(my_ginis.dta, replace):   ineqdec0 owned_hectares
        Then merge the data in my_ginis.dta back onto your original data file, using village_2003 as the merge key.
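        A minimal sketch of that merge step, assuming the original data are still in memory after the statsby call and that my_ginis.dta holds exactly one row per village_2003:
        Code:
        * attach the village-level Gini to every observation in that village
        merge m:1 village_2003 using my_ginis.dta
        * inspect the match results, then drop the bookkeeping variable
        tabulate _merge
        drop _merge
        The m:1 form is used because many observations share one village_2003 value, while the saved file has a single row per village.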


        • #5
          Thanks for all your useful answers.


          • #6
            Hi Stephen, I just ran into the same problem as Simon, so I was glad to learn that ineqdec0 can calculate the Gini including zeros. Your command works perfectly when I run it without statsby (i.e., I can see the Gini indices in the Stata output), but when I try to export the r(gini) values using your command above, it creates a matrix in which all Gini values are missing. Any idea how I need to tweak your command? I looked at the statsby help, and my call looks right to me.


            • #7
              Leah: sorry, but I don't understand the problem in your case. (Your reference to "matrix" confuses me.) The following code does what I expect:

              Code:
              sysuse auto, clear
              statsby gini = r(gini), by(foreign) saving(junk.dta, replace):   ineqdec0 mpg
              use junk.dta
              describe
              list
              resulting in:

              Code:
              . sysuse auto, clear
              (1978 automobile data)
              
              . statsby gini = r(gini), by(foreign) saving(junk.dta, replace):   ineqdec0 mpg
              (running ineqdec0 on estimation sample)
              
                    Command: ineqdec0 mpg
                       gini: r(gini)
                         By: foreign
              
              Statsby groups
              ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
              ..
              
              . use junk.dta
              (statsby: ineqdec0)
              
              . describe
              
              Contains data from junk.dta
               Observations:             2                  statsby: ineqdec0
                  Variables:             2                  14 Sep 2022 09:18
              -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              Variable      Storage   Display    Value
                  name         type    format    label      Variable label
              -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              foreign         byte    %8.0g      origin     Car origin
              gini            float   %9.0g                 r(gini)
              -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              Sorted by: foreign
              
              . list
              
                   +---------------------+
                   |  foreign       gini |
                   |---------------------|
                1. | Domestic   .1301388 |
                2. |  Foreign   .1433695 |
                   +---------------------+
              
              . 
              end of do-file
