  • Computing Between-Group Gini Coefficient

    Hi there,

    I am currently attempting to compute a Gini coefficient that represents the inequality in average house prices between regions in England and Wales. Below is an example of my dataset, which holds the average house price for each region from 1995-2020 as a time series:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str16 region long(overallaverage totalsales) int year float regionid
    "EAST ANGLIA"     60737  36655 1995 1
    "EAST ANGLIA"     61966  44687 1996 1
    "EAST ANGLIA"     67454  51297 1997 1
    "EAST ANGLIA"     72643  49297 1998 1
    "EAST ANGLIA"     79702  56847 1999 1
    "EAST ANGLIA"     91973  52129 2000 1
    "EAST ANGLIA"    105473  58426 2001 1
    "EAST ANGLIA"    128118  59517 2002 1
    "EAST ANGLIA"    149170  54478 2003 1
    "EAST ANGLIA"    167584  54479 2004 1
    "EAST ANGLIA"    176912  48598 2005 1
    "EAST ANGLIA"    187761  61496 2006 1
    "EAST ANGLIA"    201768  57993 2007 1
    "EAST ANGLIA"    199497  30304 2008 1
    "EAST ANGLIA"    187905  33283 2009 1
    "EAST ANGLIA"    206096  33283 2010 1
    "EAST ANGLIA"    200764  34144 2011 1
    "EAST ANGLIA"    201124  33692 2012 1
    "EAST ANGLIA"    209853  39861 2013 1
    "EAST ANGLIA"    223322  46242 2014 1
    "EAST ANGLIA"    238962  44069 2015 1
    "EAST ANGLIA"    253156  44684 2016 1
    "EAST ANGLIA"    271376  43476 2017 1
    "EAST ANGLIA"    277075  41451 2018 1
    "EAST ANGLIA"    282633  40227 2019 1
    "EAST ANGLIA"    289293  23538 2020 1
    "EAST MIDLANDS"   54255  64603 1995 2
    "EAST MIDLANDS"   56269  76765 1996 2
    "EAST MIDLANDS"   60179  86470 1997 2
    "EAST MIDLANDS"   64263  84901 1998 2
    "EAST MIDLANDS"   69745  96911 1999 2
    "EAST MIDLANDS"   76292  95969 2000 2
    "EAST MIDLANDS"   85167 106683 2001 2
    "EAST MIDLANDS"  102393 114864 2002 2
    "EAST MIDLANDS"  125247 104085 2003 2
    "EAST MIDLANDS"  144797 102597 2004 2
    "EAST MIDLANDS"  153368  87508 2005 2
    "EAST MIDLANDS"  159609 108821 2006 2
    "EAST MIDLANDS"  168525 104295 2007 2
    "EAST MIDLANDS"  162888  54698 2008 2
    "EAST MIDLANDS"  157956  52169 2009 2
    "EAST MIDLANDS"  164865  54088 2010 2
    "EAST MIDLANDS"  161071  54002 2011 2
    "EAST MIDLANDS"  162646  54726 2012 2
    "EAST MIDLANDS"  166646  65346 2013 2
    "EAST MIDLANDS"  176548  78185 2014 2
    "EAST MIDLANDS"  187722  78550 2015 2
    "EAST MIDLANDS"  196948  82227 2016 2
    "EAST MIDLANDS"  208645  80694 2017 2
    "EAST MIDLANDS"  219028  79083 2018 2
    "EAST MIDLANDS"  223673  75031 2019 2
    "EAST MIDLANDS"  230411  43358 2020 2
    "GREATER LONDON"  97721 108784 1995 4
    "GREATER LONDON" 105464 135392 1996 4
    "GREATER LONDON" 119904 157651 1997 4
    "GREATER LONDON" 135136 148826 1998 4
    "GREATER LONDON" 158461 172168 1999 4
    "GREATER LONDON" 188779 152265 2000 4
    "GREATER LONDON" 204972 165597 2001 4
    "GREATER LONDON" 233361 177044 2002 4
    "GREATER LONDON" 251418 152479 2003 4
    "GREATER LONDON" 275701 159535 2004 4
    "GREATER LONDON" 290385 138584 2005 4
    "GREATER LONDON" 316156 173570 2006 4
    "GREATER LONDON" 352895 167588 2007 4
    "GREATER LONDON" 362072  82180 2008 4
    "GREATER LONDON" 362784  76168 2009 4
    "GREATER LONDON" 408359  92772 2010 4
    "GREATER LONDON" 421333  90785 2011 4
    "GREATER LONDON" 437671  94396 2012 4
    "GREATER LONDON" 473861 111306 2013 4
    "GREATER LONDON" 525342 118586 2014 4
    "GREATER LONDON" 545445 113290 2015 4
    "GREATER LONDON" 586035 103264 2016 4
    "GREATER LONDON" 620228  93334 2017 4
    "GREATER LONDON" 621524  87394 2018 4
    "GREATER LONDON" 628488  83210 2019 4
    "GREATER LONDON" 666467  51382 2020 4
    "NORTH"           50669  31634 1995 5
    "NORTH"           52598  39190 1996 5
    "NORTH"           55759  42667 1997 5
    "NORTH"           58323  43296 1998 5
    "NORTH"           61492  46808 1999 5
    "NORTH"           64642  48708 2000 5
    "NORTH"           70008  54378 2001 5
    "NORTH"           80752  60910 2002 5
    "NORTH"           99596  60822 2003 5
    "NORTH"          121049  57981 2004 5
    "NORTH"          131902  50194 2005 5
    "NORTH"          142929  59976 2006 5
    "NORTH"          150843  59983 2007 5
    "NORTH"          151776  30075 2008 5
    "NORTH"          145836  29693 2009 5
    "NORTH"          149366  31903 2010 5
    "NORTH"          141576  33002 2011 5
    "NORTH"          143808  32053 2012 5
    "NORTH"          145962  37061 2013 5
    "NORTH"          152961  43430 2014 5
    "NORTH"          157739  44777 2015 5
    "NORTH"          161128  45573 2016 5
    end
    I first used the command ineqdeco as follows:

    Code:
    ineqdeco overallaverage if year<=2000, by(regionid)
    I would repeat this to cover 5-year intervals; however, I believe the Gini coefficient results show the inequality within each of the subgroups rather than the between-group inequality. I then found the command ineqdecgini and ran the following code:

    Code:
    ineqdecgini overallaverage if year<=2000, by(regionid)
    Once again, I've repeated this for 5-year intervals to cover the timespan of my dataset. This command gives a "Gini-Between" result, which I believe is the Gini coefficient between regions for the specified time period. Am I interpreting this correctly, and is there a formula for how this result is computed?

    Any help or further insight would be greatly appreciated.

    Many Thanks,
    Michael
    Last edited by Michael Chapman; 28 Mar 2021, 06:25. Reason: ineqdeco ineqdecgini

  • #2
    Suppose we have a population of units (UK households or UK individuals, say) and we know the year of observation and the region each unit lives in. For a given year, 'between-region inequality' would conventionally be defined as the inequality that would arise were each unit attributed the average house price of the region to which the unit belongs. This concept is not the same as what your commands are producing. [Look at the formulae in the -ineqdeco- help-file.] I don't know what you're actually trying to achieve.
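
    For reference, the conventional definition above can be written out for the Gini coefficient. The notation is introduced here purely for illustration and is not taken from the help-files: there are K regions, region k contains n_k units and has average house price \mu_k, and N is the total number of units.

    \[
    G_B = \frac{1}{2\mu} \sum_{j=1}^{K} \sum_{k=1}^{K} \frac{n_j}{N} \, \frac{n_k}{N} \, \lvert \mu_j - \mu_k \rvert ,
    \qquad
    \mu = \sum_{k=1}^{K} \frac{n_k}{N} \, \mu_k
    \]

    Giving every region the same weight corresponds to setting n_k / N = 1/K for all k, which is generally not the same quantity.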

    Your use of the commands ignores the different number of units in each region; i.e. it gives equal weight to each region. If you had information about the number of units (whatever the relevant definition is for your analysis), there'd probably be a way of weighting the data using that information.
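
    As a rough sketch of what such weighting might look like, the lines below treat totalsales as the number of units in each region-year. That is an assumption made only for illustration; sales counts may not be the relevant unit for the analysis. With one observation per region in a single year, the weighted Gini of the regional averages then corresponds to the between-region concept described above for that year. It also assumes that -ineqdeco- accepts frequency weights, which is worth confirming in its help-file.

    Code:
    * Sketch only, not a definitive recipe.
    * Assumption: totalsales stands in for the number of units per region-year.
    * Requires -ineqdeco- (ssc install ineqdeco).
    * One year at a time, so each region contributes exactly one observation.
    ineqdeco overallaverage [fweight = totalsales] if year == 2000

    Running the same line without the weight gives the equal-weight version for that year, so comparing the two shows how much the weighting matters.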
