Unconditional Quantile Regression-Counterfactual decomposition analysis

Samira Choudhury

Join Date: Sep 2016

Posts: 32
#1

02 May 2019, 11:14

Dear STATA pros,

For my research, using the India NSS data, I'm observing the gap in consumption of major food groups by religion and employing RIF QR counterfactual decomposition methods.

A study by Srinivasan, Zanello, and Shankar (2013), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3729423/, assesses rural-urban differentials in HAZ scores by first estimating the distributions of HAZ scores separately for rural and urban children in each country using kernel smoothing techniques. I would like to generate figures like their Figures 1 and 2 (page 9), which show the cumulative distribution functions of urban and rural HAZ scores in Bangladesh and Nepal, as well as aggregate the results of the QR-CD analysis.

My question is: how do you implement this in Stata? What is the command?

I have plotted the cumulative distribution functions for hindu and non-hindu milk consumption but I am struggling with how to plot the "counterfactual" curve on the same graph.

**plotting cumulative distribution function
cumul milk if hindu==1, gen(x)
cumul milk if hindu==0, gen(y)
stack x milk y milk, into(c milk) wide clear
line x y milk, sort

Thank you for your help.
Samira.

FernandoRios

Join Date: Apr 2014
Posts: 2471

02 May 2019, 11:39

Hi Samira,
there are a few ways to get what you have in mind, and some of them involve UQR procedures. It all depends on what you consider to be your counterfactual.
Here is a small example:

Code:

use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
** Say that you want to do this by Gender
** First you need the propensity score: the probability of being female given the covariates
probit female c.(educ exper tenure)##c.(educ exper tenure)
predict prfem
drop if prfem==.

sum educ exper tenure if female==0
** two different counterfactuals:
** Women with characteristics that look like men
sum educ exper tenure if female==1 [w=(1-prfem)/prfem]
** Men with characteristics that look like women
sum educ exper tenure if female==0 [w=prfem/(1-prfem)]
sum educ exper tenure if female==1

** Percentiles of log wages: men, women, and men reweighted to have women's characteristics
pctile men_wage=lnwage if female==0, n(100)
pctile women_wage=lnwage if female==1, n(100)
pctile men_x_with_women_b=lnwage if female==0 [aw=prfem/(1-prfem)], n(100)
** pctile stores the 99 percentiles in the first 99 observations, so _n is the percentile
gen n=_n
replace n=. if n>100
two line n men_wage, sort || line n women_wage, sort || line n men_x_with_women_b, sort ///
    legend(order(1 "Men" 2 "Women" 3 "Counterfactual")) xtitle(Log Wages) ytitle(Percentile)
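
For reference, the weights in the example above are the usual reweighting factors built from a propensity score. A short sketch of the reasoning, where the notation is introduced here for illustration only: $D$ is the group indicator (`female` in the example) and $p(X)=\Pr(D=1\mid X)$ is the fitted probability (`prfem`). By Bayes' rule, the factor that makes group $D=1$ resemble group $D=0$ in its covariates is

```latex
\psi_{1\to 0}(X)
  = \frac{dF(X \mid D=0)}{dF(X \mid D=1)}
  = \frac{\Pr(D=0 \mid X)}{\Pr(D=1 \mid X)} \cdot \frac{\Pr(D=1)}{\Pr(D=0)}
  = \frac{1-p(X)}{p(X)} \cdot \frac{\Pr(D=1)}{\Pr(D=0)}
```

The last ratio is a constant, so it only rescales the weights and does not change weighted means or percentiles. That is why `(1-prfem)/prfem` (and its inverse, for the opposite counterfactual) is enough in the commands above.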

Regarding a more formal way to derive the decomposition, perhaps you want to take a look at "oaxaca_rif", which you can install using "ssc install oaxaca_rif"
HTH

Samira Choudhury
#3

02 May 2019, 11:55

Hi,

Thank you so much for your response. Yes, I have used the oaxaca_rif command to decompose milk consumption differences between Hindu and non-Hindu households at quantiles 10, 25, 50, 75 and 90, but I am still not sure how to replicate Figure 1 of Srinivasan et al. (2013). Also, in that paper, how would you further decompose the covariate and coefficient effects into the contributions of individual covariates, as shown in Tables 4 and 5?

For instance, running the following command will show the decomposition for the lowest quantile.

oaxaca_rif milk logmpce hh_size hheduc femhh rural agri [aw=hhwt], cluster (FSU_Serial_No) by(hindu) wgt(1) rif(q(10))

But I don't know what command they have used for further decomposition of the covariate and coefficient effects of individual characteristics?

Many thanks for your help,
Samira.

FernandoRios
#4

02 May 2019, 12:08

I'm glad you are finding the program useful.
I'm not sure, however, why you do not see the desired results. oaxaca_rif should automatically give you the detailed decomposition.
Can you share the exact results you are obtaining?
Thank you

Samira Choudhury
#5

02 May 2019, 12:20

oaxaca_rif pcfruitvegg logmpce hh_size hheduc femhh rural agri [aw=hhwt], cluster (FSU_Serial_No) by(hindu) wgt(1) rif(q(10))
No Reweighted Strategy Choosen
Estimating Standard RIF-OAXACA using RIF:q(10)
Model : Oaxaca-Blinder RIF-decomposition
Type : Standard
RIF : q(10)
Scale : 1
Group 1: hindu= 0 N of obs 1 = 32022
Group c: x2*b1 N of obs C = .
Group 2: hindu= 1 N of obs 2 = 13353

------------------------------------------------------------------------------
 pcfruitvegg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
overall      |
     group_1 |   100.2119   1.973394    50.78   0.000     96.34407    104.0796
     group_2 |   66.04037   2.416401    27.33   0.000     61.30432    70.77643
  difference |   34.17148   3.098817    11.03   0.000     28.09791    40.24505
   explained |   21.46223   2.478677     8.66   0.000     16.60411    26.32035
 unexplained |   12.70925   4.160562     3.05   0.002       4.5547     20.8638
-------------+----------------------------------------------------------------
explained    |
     logmpce |   24.56972   2.803273     8.76   0.000     19.07541    30.06404
     hh_size |   -.329022   .1980491    -1.66   0.097    -.7171912    .0591471
      hheduc |   2.964485    1.68645     1.76   0.079    -.3408963    6.269865
       femhh |   .0035343    .020232     0.17   0.861    -.0361197    .0431883
       rural |  -2.861022   1.317567    -2.17   0.030    -5.443406   -.2786387
      agrihh |  -2.099375   .5005418    -4.19   0.000    -3.080419   -1.118332
-------------+----------------------------------------------------------------
unexplained  |
     logmpce |  -97.91773   47.87332    -2.05   0.041    -191.7477   -4.087752
     hh_size |  -3.547644   7.259225    -0.49   0.625    -17.77546    10.68018
      hheduc |   10.14495    3.61851     2.80   0.005     3.052796     17.2371
       femhh |   .8236559   1.277081     0.64   0.519    -1.679377    3.326689
       rural |  -6.098612   6.621972    -0.92   0.357    -19.07744    6.880214
      agrihh |  -7.444178   3.552301    -2.10   0.036    -14.40656   -.4817967
       _cons |   116.9414   54.00685     2.17   0.030     11.08992    222.7929
------------------------------------------------------------------------------

Samira Choudhury
#6

02 May 2019, 12:24

Thank you for your response. As you can see here, it shows the covariate effect of the 10th quantile for the "explained" part and coefficient effect for the "unexplained" part for each of the explanatory variables. But in that paper, in Table 4, they show the covariate effect and coefficient effect of these explanatory variables for both explained and unexplained parts. Hope I make sense!

Thank you,
Samira.

FernandoRios
#7

02 May 2019, 12:42

I see.
OK. To understand what those components are, I would suggest reading the paper by Firpo, Fortin, and Lemieux (2018): https://www.mdpi.com/2225-1146/6/2/28/pdf-vor
and the paper that explains the command, with a couple of examples, Rios-Avila (2019): http://www.levyinstitute.org/pubs/wp_927.pdf

In a nutshell, these components, which are referred to as unexplained in the paper you cite, are what FFL (2018) call the specification error and the reweighting error. To obtain those results, you need to use a syntax similar to the following:

Code:

oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(50)) rwlogit(educ exper tenure)

This uses the reweighted RIF decomposition, rather than just the Oaxaca RIF decomposition.
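
Applied to the milk example from #3, the reweighted call could be looped over the quantiles of interest, roughly as in the sketch below. It is untested and rests on assumptions: the variable names are copied from the earlier posts, the reweighting logit simply reuses the outcome covariates (an arbitrary choice), and the `[aw=]` and `cluster()` parts of the original call are assumed to combine with `rwlogit()` without changes.

```stata
* Untested sketch; variable names are taken from post #3 in this thread.
* Assumption: the reweighting logit uses the same covariates as the outcome model.
local xvars logmpce hh_size hheduc femhh rural agri

foreach q of numlist 10 25 50 75 90 {
    * Reweighted RIF decomposition at the q-th quantile
    oaxaca_rif milk `xvars' [aw=hhwt], by(hindu) wgt(1) ///
        rif(q(`q')) rwlogit(`xvars') cluster(FSU_Serial_No)
    * Keep each quantile's results so they can be tabulated together later
    estimates store rif_q`q'
}
```

Storing the results per quantile makes it easier to put Q10 to Q90 side by side in one table afterwards.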
HTH
Fernando

Samira Choudhury
#8

03 May 2019, 01:48

Thank you so much Fernando!

Samira.

Samira Choudhury
#9

03 Jul 2019, 10:03

Hi Fernando,

In the World Development paper by Cavatorta et al. (2015) https://www.sciencedirect.com/scienc...05750X15001655, which explains cross-state disparities in child nutrition in rural India, Figure 1 plots density functions. Do you know the Stata code to plot this figure (using the oaxaca_rif command), that is, the Tamil Nadu distribution versus the Bihar distribution along with the counterfactual?

I'm doing a RIF counterfactual decomposition analysis using the NSS data for quantiles Q10-Q90 and it would be nice to demonstrate the results graphically.

Many thanks for your help,
Samira.

FernandoRios
#10

03 Jul 2019, 10:46

Hi Samira,
Unfortunately, there is no command that can produce those figures automatically. I tried to write one, but because graphs usually require a lot of formatting detail, I abandoned the idea. In any case, below is some code that can be used to replicate the density figures.

Code:

webuse cattaneo2, clear
* treatment mbsmoke. Decomposition using Reweighted option
oaxaca_rif bweight prenatal1 mmarried mage fbaby, by(mbsmoke) w(0) rwprobit(mmarried c.mage##c.mage fbaby medu) rif(q(50))
** this probit is internally estimated when using oaxaca_rif with the rwprobit option
probit mbsmoke  mmarried c.mage##c.mage fbaby medu
predict pr

** this is equivalent to using w(0) in the oaxaca_rif command
gen wc1=(1-pr)/pr if mbsmoke==1
** this is equivalent to using w(1) in the oaxaca_rif command
gen wc2=pr/(1-pr) if mbsmoke==0

** Equivalent to figure 1: smokers reweighted to have the characteristics of non-smokers
two kdensity bweight if mbsmoke==0 || kdensity bweight if mbsmoke==1 || ///
    kdensity bweight if mbsmoke==1 [aw=wc1], ///
    legend(order(1 "Non Smokers" 2 "Smokers" 3 "Counterfactual: smokers with non-smoker characteristics"))
** Alternative counterfactual: non-smokers reweighted to have the characteristics of smokers
two kdensity bweight if mbsmoke==0 || kdensity bweight if mbsmoke==1 || ///
    kdensity bweight if mbsmoke==0 [aw=wc2], ///
    legend(order(1 "Non Smokers" 2 "Smokers" 3 "Counterfactual: non-smokers with smoker characteristics"))
** Some people also like to use CDFs for comparing the distributions
cumul bweight if mbsmoke==0, gen(cdf0)
cumul bweight if mbsmoke==1, gen(cdf1)
cumul bweight if mbsmoke==1 [aw=wc1], gen(cdfc1)
cumul bweight if mbsmoke==0 [aw=wc2], gen(cdfc2)
two line cdf0 bweight, sort || line cdf1 bweight, sort || line cdfc1 bweight, sort ///
    legend(order(1 "Non Smokers" 2 "Smokers" 3 "Counterfactual: smokers with non-smoker characteristics"))
two line cdf0 bweight, sort || line cdf1 bweight, sort || line cdfc2 bweight, sort ///
    legend(order(1 "Non Smokers" 2 "Smokers" 3 "Counterfactual: non-smokers with smoker characteristics"))
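
The same steps could be carried over to the milk example from earlier in the thread, roughly as sketched below. This is untested and assumes the variable names used in the earlier posts (`milk`, `hindu`, `hhwt`, and the covariates from #3). Multiplying the sampling weight by the reweighting factor is one common way to combine the two, not something taken from the example above.

```stata
* Untested sketch; variable names are assumed from earlier posts in this thread.
* Probability of being a Hindu household given the covariates
logit hindu logmpce hh_size hheduc femhh rural agri [pw=hhwt]
predict double pr_hindu, pr

* Non-Hindu households reweighted to resemble Hindu households in their covariates,
* combined with the household sampling weight
gen double w_cf = hhwt * pr_hindu/(1 - pr_hindu) if hindu==0

* Weighted CDFs for each group and for the counterfactual
cumul milk if hindu==1 [aw=hhwt], gen(cdf_hindu)
cumul milk if hindu==0 [aw=hhwt], gen(cdf_nonhindu)
cumul milk if hindu==0 [aw=w_cf], gen(cdf_cf)

twoway line cdf_hindu milk, sort || line cdf_nonhindu milk, sort || ///
    line cdf_cf milk, sort ///
    legend(order(1 "Hindu" 2 "Non-Hindu" 3 "Non-Hindu with Hindu characteristics")) ///
    xtitle("Milk consumption") ytitle("Cumulative probability")
```

Swapping the roles (Hindu households weighted by `(1 - pr_hindu)/pr_hindu`) would give the alternative counterfactual.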

HTH
Fernando

Last edited by FernandoRios; 03 Jul 2019, 10:51.

Samira Choudhury
#11

04 Jul 2019, 07:17

Hi Fernando,

Thank you so much!! This is extremely helpful. Sorry to bother you again but I have just one last question. Hope you can help me out.

I have used your command to generate the reweighted RIF decomposition:

oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(50)) rwlogit(educ exper tenure)

For my analysis, I need to calculate the contribution of individual characteristics to caste differences in vegetable consumption (similar to Table 4 in Srinivasan et al. (2013)). I want to double-check that I am calculating the contributions correctly. Below I paste the Stata output for one of the covariates, log per capita expenditure, at the lowest quantile (Q10).

	(1)	(2)	(3)	(4)	(5)	(6)	(7)
VARIABLES	Overall	Explained	Pure_explained	Specif_err	Unexplained	Pure_Unexplained	Reweight_err

logmpce			29.611***	-28.232		-68.024	-1.242
			(3.168)	(39.698)		(71.985)	(2.221)

Group_1	108.014***
	(1.420)
Group_c	81.623***
	(2.991)
Group_2	74.248***
	(1.589)
Tdifference	33.766***
	(2.131)
ToT_Explained	26.391***
	(3.289)
ToT_Unexplained	7.375*
	(4.425)
Total		26.391***	7.375*
		(3.289)	(4.425)
Pure_explained		26.246***
		(4.559)
Specif_err		0.144
		(4.052)
Reweight_err			4.192
			(2.909)
Pure_Unexplained			3.183
			(4.005)

For the covariate effect, the contribution of log per capita expenditure is (29.611/26.246)*100 = 112.8%.
For the coefficient effect, the contribution of log per capita expenditure is (-68.024/3.183)*100 = -2137%.

Is this correct?

Thank you so much for your help.
Samira.

FernandoRios
#12

04 Jul 2019, 09:46

Hi Samira
I think your interpretation is technically correct. But as you already point out, the relative shares are very large (over 2000% in absolute value). It would perhaps be better to report the detailed decomposition in absolute terms rather than relative ones, which makes the results easier to interpret.
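
As a quick check, the aggregate components in the table in #11 add up as expected. The figures below are copied from that table; the small discrepancy in the second line is rounding.

```latex
\begin{aligned}
\text{Tdifference}      &= 108.014 - 74.248 = 33.766 \\
\text{ToT\_Explained}   &= 108.014 - 81.623 = 26.391 \approx 26.246 + 0.144
                           \quad (\text{Pure\_explained} + \text{Specif\_err}) \\
\text{ToT\_Unexplained} &= 81.623 - 74.248 = 7.375 = 4.192 + 3.183
                           \quad (\text{Reweight\_err} + \text{Pure\_Unexplained}) \\[4pt]
\text{share of logmpce in Pure\_explained}   &= \frac{29.611}{26.246} \times 100 \approx 112.8\% \\
\text{share of logmpce in Pure\_Unexplained} &= \frac{-68.024}{3.183} \times 100 \approx -2137\%
\end{aligned}
```

The second share is so large mainly because its denominator, the pure unexplained effect of 3.183, is small.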
Fernando

Samira Choudhury
#13

06 Jul 2019, 04:49

Thank you so much for your help Fernando. Is there any reason as to why I'm getting such large relative shares?

FernandoRios
#14

06 Jul 2019, 05:10

There is no particular reason, nor is it a sign of a problem.
It simply means that within the coefficient and composition components some characteristics are counterbalancing each other.
For example, it is possible that, say, education has a positive contribution to the gap, whereas another characteristic, say experience, has a negative contribution, so that in total they almost cancel each other out. In a case like this, you will find those extremely large relative shares.
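
The output in #5 gives a concrete instance. Within the unexplained component at Q10, logmpce contributes -97.92 and the constant contributes 116.94, while the total unexplained gap is only 12.71. Expressed as shares of that total (figures rounded from the output in #5):

```latex
\frac{-97.918}{12.709} \times 100 \approx -770\%,
\qquad
\frac{116.941}{12.709} \times 100 \approx 920\%
```

The two terms largely offset each other, so each one looks very large relative to the small net effect.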
HTH
Fernando

Samira Choudhury
#15

06 Jul 2019, 07:21

Thanks a lot for the explanation !!