Interpreting inequality decomposition results

Hanna Szymborska

Join Date: Jul 2017

Posts: 1
#1

06 Jul 2017, 11:27

Hello,

I am trying to apply inequality decomposition techniques to the US Survey of Consumer Finances. I am using the packages ineqfac (following Shorrocks 1982) and ineqrbd (following Fields 2003).

I have four questions:

1) What is the correct interpretation of binary variables in decomposition analysis? E.g.

ineqrbd logY age educ Dhhsex blackhisp Dmarried kids Dse Dunemp wageinc bussefarminc intdivinc kginc ssretinc transfothinc debtpay [fw=fwt] if year==1989, noconstant noregr i2

Regression-based decomposition of inequality in logY
---------------------------------------------------------------------------
Decomp. | 100*s_f S_f 100*m_f/m I2_f I2_f/I2(total)
---------+-----------------------------------------------------------------
residual | 10.1209 0.0004 1.9958 25.6742 5826.4009
age | -11.4448 -0.0005 38.1775 0.0659 14.9631
educ | 66.9315 0.0029 53.6437 0.0348 7.9077
Dhhsex | 14.6494 0.0006 -2.0987 1.2746 289.2482
blackhisp| -8.9524 -0.0004 1.3629 1.9308 438.1765
Dmarried | -11.6565 -0.0005 1.9960 0.6976 158.3160
kids | 9.1816 0.0004 4.3095 0.8291 188.1444
Dse | 0.3671 0.0000 0.0635 4.0128 910.6409
Dunemp | 8.3495 0.0004 -1.3938 1.2579 285.4603
wageinc | 21.5828 0.0010 2.1922 1.2276 278.5796
bussefarminc| -0.0430 -0.0000 -0.0022 563.7818 1.28e+05
intdivinc| -1.2066 -0.0001 -0.0659 72.3670 1.64e+04
kginc | 0.5308 0.0000 0.0133 582.2035 1.32e+05
ssretinc | -0.6790 -0.0000 -0.4728 3.6536 829.1441
transfothinc| 0.0037 0.0000 0.0025 132.0923 3.00e+04
debtpay | 2.2650 0.0001 0.2764 3.8934 883.5458
---------+-----------------------------------------------------------------
Total | 100.0000 0.0044 100.0000 0.0044 1.0000
---------------------------------------------------------------------------

What would be the correct interpretation of the contribution of a binary variable such as household sex (Dhhsex)?

2) Why are the results so sensitive to taking the log of the dependent variable? Is the log transformation preferred?

ineqrbd income age educ Dhhsex blackhisp Dmarried kids Dse Dunemp wageinc bussefarminc intdivinc kginc ssretinc transfothinc debtpay [fw=fwt] if year==1989, noconstant noregr

Regression-based decomposition of inequality in income
---------------------------------------------------------------------------
Decomp. | 100*s_f S_f 100*m_f/m CV_f CV_f/CV(total)
---------+-----------------------------------------------------------------
residual | 30.8970 1.2925 -0.8500 -273.3506 -65.3455
age | 0.0014 0.0001 2.0031 0.3629 0.0867
educ | -0.3620 -0.0151 -50.5724 -0.2640 -0.0631
Dhhsex | -0.0192 -0.0008 0.5979 1.5984 0.3821
blackhisp | 0.0140 0.0006 -0.4948 -1.9605 -0.4687
Dmarried | -0.2489 -0.0104 9.9806 1.1798 0.2820
kids | -0.0453 -0.0019 -8.4874 -1.2887 -0.3081
Dse | 0.2869 0.0120 3.8114 2.8257 0.6755
Dunemp | -0.0973 -0.0041 3.8557 1.5863 0.3792
wageinc | 5.8093 0.2430 45.6681 1.5716 0.3757
bussefarminc | 0.2985 0.0125 0.4484 33.6926 8.0543
intdivinc | 7.6039 0.3181 7.9572 12.0565 2.8821
kginc | 30.1608 1.2617 6.2378 34.1966 8.1748
ssretinc | 0.2871 0.0120 9.5187 2.7098 0.6478
transfothinc | 0.2246 0.0094 1.1503 16.2856 3.8931
debtpay | 25.1892 1.0537 69.1754 2.7936 0.6678
---------+-----------------------------------------------------------------
Total | 100.0000 4.1832 100.0000 4.1832 1.0000
---------------------------------------------------------------------------

3) If squared terms are included, what would be their interpretation?

ineqrbd logY age age2 educ Dhhsex blackhisp Dmarried kids kids2 Dse Dunemp wageinc bussefarminc intdivinc kginc ssretinc transfothinc debtpay [fw=fwt] if year==1989, noconstant noregr i2

Regression-based decomposition of inequality in logY
---------------------------------------------------------------------------
Decomp. | 100*s_f S_f 100*m_f/m I2_f I2_f/I2(total)
---------+-----------------------------------------------------------------
residual | 21.6590 0.0010 0.8227 74.3799 1.69e+04
age | -41.5669 -0.0018 138.6583 0.0659 14.9631
age2 | 62.9746 0.0028 -67.4030 0.2414 54.7799
educ | 31.1174 0.0014 24.9398 0.0348 7.9077
Dhhsex | 6.9642 0.0003 -0.9977 1.2746 289.2482
blackhisp| 0.6091 0.0000 -0.0927 1.9308 438.1765
Dmarried | -1.2042 -0.0001 0.2062 0.6976 158.3160
kids | 5.6279 0.0002 2.6416 0.8291 188.1444
kids2 | -2.3810 -0.0001 -1.1313 1.9253 436.9110
Dse | 0.5107 0.0000 0.0883 4.0128 910.6409
Dunemp | -1.8055 -0.0001 0.3014 1.2579 285.4603
wageinc | 14.8888 0.0007 1.5123 1.2276 278.5796
bussefarminc| 0.1004 0.0000 0.0050 563.7818 1.28e+05
intdivinc| 0.1786 0.0000 0.0098 72.3670 1.64e+04
kginc | 0.6638 0.0000 0.0167 582.2035 1.32e+05
ssretinc | 0.3625 0.0000 0.2524 3.6536 829.1441
transfothinc| 0.0202 0.0000 0.0140 132.0923 3.00e+04
debtpay | 1.2804 0.0001 0.1562 3.8934 883.5458
---------+-----------------------------------------------------------------
Total | 100.0000 0.0044 100.0000 0.0044 1.0000
---------------------------------------------------------------------------

Since the results change quite a bit, does it make sense to include squared terms in the decomposition?

And finally:

4) I have obtained drastically different results for factor decomposition using ineqrbd and ineqfac. As can be seen above, ineqrbd returns a larger contribution for wageinc (wage income) than for, e.g., business income (bussefarminc). But this result is reversed with ineqfac:

ineqfac wageinc bussefarminc intdivinc kginc ssretinc transfothinc [fw=fwt] if year==1989, i2

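Inequality decomposition by factor components
---------------------------------------------------------------------------
Factor | 100*s_f S_f 100*m_f/m I2_f I2_f/I2(Total)
---------+-----------------------------------------------------------------
wageinc | 6.8806 0.7312 65.3736 1.2350 0.1162
bussefarminc | 67.5214 7.1760 11.0296 567.5948 53.4071
intdivinc | 5.0297 0.5345 6.4171 72.6795 6.8387
kginc | 18.8564 2.0040 5.4555 584.7025 55.0169
ssretinc | 0.1547 0.0164 8.1420 3.6715 0.3455
transfothinc | 1.5572 0.1655 3.5822 132.6103 12.4778
---------+-----------------------------------------------------------------
Total | 100.0000 10.6277 100.0000 10.6277 1.0000
---------------------------------------------------------------------------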

I tried to read up on this, but it is still not clear to me why the results are so different. I noted that when using income instead of log income in ineqrbd (see point 2), the results of ineqrbd and ineqfac are more consistent.

Thank you for your help!
Tamaryn Friderichs

Join Date: Sep 2019

Posts: 7
#2

11 Sep 2019, 02:17

Hi, did you get any help on this? I have exactly the same queries. Many thanks!
Chris Elbers

Join Date: Jun 2021

Posts: 1
#3

10 Jun 2021, 07:59

Let me try to answer the questions in this post.

Starting with questions 2 and 3: that you find different regression
results should not come as a surprise. If you change the dependent
variable or change a specification of a regression equation you can
expect to get different results. In general, statisticians would favour
regressions with normally distributed residuals and error terms with
constant variance (i.e., homoskedasticity). One could try to find a
transformation of the dependent variable and functional forms for the
independent variables to achieve that. But one shouldn't expect
regression results to be insensitive to such transformations.
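
To see how much the transformation alone can move the shares, here is a
minimal Python sketch (not ineqrbd itself). The education and income
figures are made up for illustration, and the helper names are my own.
It uses the share formula s_f = b * cov(x,y) / var(y) discussed under
question 1 below, with a single regressor and a constant, once for
income in levels and once for log income.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Population covariance (divide by n); cov(u, u) is the variance.
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def share(x, y):
    # s_f = b * cov(x, y) / var(y), with b the OLS slope of y on x.
    b = cov(x, y) / cov(x, x)
    return b * cov(x, y) / cov(y, y)

# Hypothetical data: years of education and a right-skewed income.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
income = [12_000, 15_000, 22_000, 18_000, 30_000, 45_000, 260_000, 70_000]
log_income = [math.log(v) for v in income]

s_level = share(educ, income)    # share attributed to educ, income in levels
s_log = share(educ, log_income)  # share attributed to educ, log income

print(f"levels: educ {100 * s_level:.1f}%, residual {100 * (1 - s_level):.1f}%")
print(f"logs:   educ {100 * s_log:.1f}%, residual {100 * (1 - s_log):.1f}%")
```

In this toy sample one very high income dominates the variance in
levels, so most of the inequality ends up in the residual. Taking logs
compresses that observation and the share attributed to education rises
markedly. Same data and same regressor, but a different decomposition.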

As for log transformation of income, that is almost universally done
in income regressions. Likewise, adding a squared independent variable
to the regression is intended to take possible nonlinearities into
account, and is also very common practice.

My answer to question 1:

The contribution to inequality of 'y' (say, log income), attributed to 'x', as computed by ineqrbd (not using
the Fields option) equals

s_f = b * cov(x,y) / var(y),

where b is the regression coefficient for x from the underlying
regression. Now let x be a dummy variable, defined as being a female
respondent (x=1) or not (x=0). Then the covariance above amounts to

cov(x,y) = share(female respondents) times ([average y of female respondents] - [average y of all respondents]).
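
The identity follows from cov(x,y) = E[xy] - E[x]E[y]: for a 0/1
variable, E[xy] is the female share times the female average of y. A
quick numerical check in Python, with made-up numbers and the population
covariance (dividing by n), could look like this:

```python
def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Population covariance (divide by n).
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

# Hypothetical sample: x = 1 for female respondents, y = log income.
x = [1, 1, 1, 0, 0, 0, 0, 0]
y = [9.2, 9.8, 10.1, 9.9, 10.4, 10.6, 11.0, 11.3]

share_female = mean(x)
mean_y_female = mean([yi for xi, yi in zip(x, y) if xi == 1])
mean_y_all = mean(y)

lhs = cov(x, y)
rhs = share_female * (mean_y_female - mean_y_all)
print(lhs, rhs)  # the two numbers agree up to rounding error
```

With the sample covariance (dividing by n-1) both cov(x,y) and var(y)
pick up the same factor n/(n-1), so s_f itself is unaffected.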

Take, for example, the case where b < 0 (so, other things equal, being
female tends to go with lower y). Then if also average y of females is
lower than the overall average, both b and cov(x,y) are negative so
that s_f is positive: a positive share of y-inequality can be
attributed to y-differences between females and non-females.

On the other hand, let average female y be higher than overall average y, while
still b < 0. Then s_f is negative and a negative share of inequality
can be attributed to 'femality'. This makes sense. Consider adding a
respondent to the survey with average characteristics for the
variables other than x. Then if this respondent is female, including
her reduces (since b < 0) average y among female respondents more than
it reduces the overall average of y. The result is a reduction of
y-inequality.
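
The second case can be made concrete with a small Python sketch. All
numbers are invented: the women in this toy sample have more education
(z), so their average y is above the overall average even though the
coefficient b on the female dummy is negative. The two-regressor OLS is
written out by hand, and the variable names are mine, not ineqrbd's.

```python
def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Population covariance (divide by n); cov(u, u) is the variance.
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

# Hypothetical data: x = 1 for female, z = years of education, y = log income.
x = [1, 1, 1, 1, 0, 0, 0, 0]
z = [16, 18, 14, 16, 10, 12, 8, 12]
y = [10.95, 11.25, 10.52, 10.88, 10.03, 10.37, 9.64, 10.36]

# OLS of y on x and z with a constant, solved from the normal equations.
det = cov(x, x) * cov(z, z) - cov(x, z) ** 2
b = (cov(x, y) * cov(z, z) - cov(z, y) * cov(x, z)) / det  # coefficient on x
c = (cov(z, y) * cov(x, x) - cov(x, y) * cov(x, z)) / det  # coefficient on z
a = mean(y) - b * mean(x) - c * mean(z)
resid = [yi - a - b * xi - c * zi for xi, zi, yi in zip(x, z, y)]

# Shares s_f = coefficient * cov(factor, y) / var(y); the residual enters
# with coefficient 1.
var_y = cov(y, y)
s_x = b * cov(x, y) / var_y
s_z = c * cov(z, y) / var_y
s_resid = cov(resid, y) / var_y

print(f"b = {b:.3f}, cov(x,y) = {cov(x, y):.3f}")
print(f"s_x = {s_x:.3f}, s_z = {s_z:.3f}, s_resid = {s_resid:.3f}")
print(f"sum of shares = {s_x + s_z + s_resid:.6f}")
```

Here b is negative while cov(x,y) is positive, so s_x comes out
negative, and the three shares still add up to one.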

So to the posted question, or the more general question 'What is the
correct interpretation of categorical variables in decomposition
analysis?' I would answer: 'the degree to which average income
differences across the different categories contribute to or detract
from overall inequality, over and above the contributions of the other
factors that are considered'.