Recentered Influence Function (RIF) regression and decomposition

Ties Siebinga

Join Date: May 2021

Posts: 16
#16

02 Jun 2021, 10:22

Dear Prof. Fernando Rios,

Removing $xdemographic still returns the same error message.

I will submit a request to re-install rifhdreg - I hope this fixes the issue.

Thanks,
Ties
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#17

15 Jun 2021, 06:33

Dear Prof. Fernando Rios,

I can confirm the issue was a result of not having the latest RIF command. Plotting the unconditional quantile regression coefficients now works very smoothly with the qregplot command - thank you.

I am struggling to find an interpretation of the unconditional quantile regression coefficients. How would you interpret these?

Many thanks and kind regards,
Ties Siebinga
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2470
#18

15 Jun 2021, 08:12

You may find this useful. Concentrate on the section on UQR. While my descriptions use fixed effects, the same principles apply if no fixed effects are included.
https://osf.io/preprints/socarxiv/znj38/
Best wishes
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#19

16 Jun 2021, 02:35

Dear Prof. Fernando Rios,

Thank you for providing this paper. Upon reading it I have another question. For my research I am interested in the impact of cancer on earnings across the income distribution, i.e. are low earners affected differently than high earners? Initially I thought that UQR was the preferred method to investigate this. However, the paper states that UQR is not appropriate for investigating treatment effects, as this is an example of a large change in the distribution of x. I now believe the quantile treatment effect (QTE) is the parameter of interest, and I have read that it is possible to estimate this using the rifhdreg command. How can you do this?

I have panel data and the treatment effect is represented by the interaction of patient and post. This is an unconditional quantile regression but this does not estimate the QTE?

egen dd=group(Patient Year)
rifhdreg gross_income i.Patient##i.Post i.Year $xdemographic, over(dd) rif(q(20)) cluster(person_ID)

Hoping to gain some insight in how to apply the rifhdreg to obtain the quantile treatment effect.

Many thanks,
Ties
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#20

16 Jun 2021, 03:05

The proposed QTE estimator can be implemented with the Stata community-contributed command –rifhdreg– utilizing the option -over-, to define the treatment variable, and the options -rwprobit- or -rwlogit- for the estimation of propensity scores.

Because I am generating dd using Patient and Year (to exploit the panel structure?), rifhdreg believes I have more than 2 groups, and so the estimator cannot be used. I have 9 time periods and 2 groups (patient and control); as a result the command believes I have 18 groups. Please ignore my previous statement "This is an unconditional quantile regression but this does not estimate the QTE?" I believe the above does actually estimate the QTE due to the "over" option. My question, though, is how to use the QTE with panel data. Alternatively, I can ignore the panel structure and just use over(Patient), but this is not optimal.

Many thanks,
Ties
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#21

16 Jun 2021, 03:18

Apologies for yet another message. I have found that the issue with Stata "detecting more than 2 groups" occurs only when the inverse probability weights are used. The purpose of IPW is to control for differences in the distribution of characteristics across the two groups. Only after controlling for these differences can the treatment effect be estimated. This is the same issue as above, with Stata believing there are 18 groups instead of 2.
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2470
#22

16 Jun 2021, 04:42

Hi Ties
OK, so you are on the right track. The command rifhdreg with the over() option is what could allow you to estimate treatment effects. The example I have in that paper covers something similar to what I think you are doing. The difference is that my treatment variable is motherhood.
Now, the "treatment" variable should be used within the "over" option, so ideally it would be a binary variable. In that case, rwlogit or rwprobit can be used.
It seems, however, that you have something more similar to a DID design. This is a bit more tricky, especially given the recent debate regarding the benefits and limitations of two-way fixed effects.
Setting that problem aside, I suggest the following:
1. Identify your treatment group and use it as the over() variable
2. Estimate the model as
rifhdreg y i.treated (other controls), over(treated) rif( q(50)) rwlogit(other controls)

This will estimate the reweighted treatment effect at the 50th quantile. It is similar to what teffects does. But, as you already mention, it will not take the panel information into account.

3. For panel data, what I suggest in my paper is to add the fixed effects to the control list, but not to the "reweighted" section.

rifhdreg y i.treated (other controls), over(treated) rif( q(50)) rwlogit(other controls) abs(id )

I would estimate both (with and without absorbing ID) and see how sensitive the results are. Do not forget that in both cases bootstrap standard errors (with cluster) are needed. Also check that the propensity score is well defined (that is, that you don't have predicted probabilities that are too small or too large).

However, there is no clear consensus on how panel data should be treated when estimating distributional treatment effects. A paper by Callaway and Li (2018) https://onlinelibrary.wiley.com/doi/full/10.3982/QE935 proposes a strategy I have yet to implement on my own.

4. If all your patients are treated at the same time, you could consider using -rifhdreg- for a small 2x2 DD, and then use that to assess the short- and long-term effects of the treatment. Doing this, you may be able to use a variable with 4 values in "over" and use rwmlogit(). This would estimate something similar to an ATE. But if you want something closer to an ATT, you may want to create your own weights, without using the rwlogit or rwmlogit option.

egen trtps=group(treated post)

** some way to create IPW
mlogit trtps $controls
predict p*
gen ipw=1*(trtps==1)+p2/p1 * (trtps ==2)+....
This will allow you to get something like ATT with the following code

rifhdreg y i.treated##i.post (other controls) [aw=ipw], over(trtps) rif( q(50))

I can't say more since I do not know the specifics of your research question, but hopefully this helps you understand what can be done.
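
The propensity-score check mentioned above can be sketched as follows for the binary-treatment case. This is a minimal illustration on simulated data, not output from the thread: the variable names (x1, treated, ps) and the 0.05/0.95 cut-offs are assumptions chosen for the example, and rwlogit() fits its own model internally, so this only mimics that step by hand.

```stata
* Simulated data; every name below is hypothetical
clear
set seed 2021
set obs 1000
gen x1 = rnormal()
gen byte treated = runiform() < invlogit(-0.5 + 0.8*x1)

* Propensity score from a logit of treatment on the controls
logit treated x1
predict double ps, pr

* Compare the distribution of the score across the two groups
summarize ps if treated == 1, detail
summarize ps if treated == 0, detail

* Count observations with extreme predicted probabilities
* (the 0.05 and 0.95 thresholds are arbitrary illustration values)
count if ps < 0.05 | ps > 0.95
```

A large count in the last line, or score distributions that barely overlap, would suggest the reweighted estimates rest on a few heavily weighted observations.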

F
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#23

16 Jun 2021, 05:33

Dear Prof. Fernando Rios,

Thank you for this message - it is really valuable information!!

I will do exactly as you recommend, estimating with and without individual-level fixed effects. With regard to bootstrapping, I have read that this should be done as a prefix. Should the cluster be at the patient-group level or the individual ID level? The work of Bertrand et al. suggests clustering at the level at which treatment occurs, in this case the patient group; however, it seems more logical to cluster at the individual level. Applying this:

bootstrap, cluster(Patient/ID): rifhdreg y i.treated (other controls), over(treated) rif( q(50)) rwlogit(other controls) abs(id)

My patients are all treated at the same time. You mention that it is possible to use DiD with 4 values in "over" and use rwmlogit(). I assume the 4 values correspond to patient pre-treatment, patient post-treatment, control pre-treatment and control post-treatment? I am more interested in estimating the ATT (as opposed to the ATE), so I will need to make my own weights as you recommend. I have a question regarding your suggested code to make these weights, in particular the following line:

gen ipw=1*(trtps==1)+p2/p1 * (trtps ==2)+....

I was wondering what the "p2/p1" refers to and how to continue this code for trtps==3 and trtps==4.

Again many thanks for the very insightful information that you provide!!

Thanks,
Ties
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2470
#24

16 Jun 2021, 06:28

Hi Ties,
Those "p's" are the predicted probabilities of a person being in trt group K. And yes, trtps refers to the 4 states: pre-control, post-control, pre-treatment, post-treatment. But you'll need to verify this.

As of right now, rifhdreg estimates ATE where the IPW are
gen ipw = (trtps==1)/p1 + (trtps==2)/p2+(trtps==3)/p3+(trtps==4)/p4
So, to get IPW that weights for an ATT, I think you just need to divide everything by P1.
This assumes that trtps==1 is the treated group at time 0. You may want to do this for the treated group in the post period too, in which case the IPW needs to be adjusted.

Regarding clustering, there is a big literature. One option is indeed to cluster at the "treatment group level", but that gives you 2 groups, and having so few clusters is a problem. Since you have panel data, I would suggest clustering at the patient ID level instead. That is what is currently done in many of the DID procedures.

For the bootstrap, if you use your own IPW, you will have to bootstrap BOTH the IPW procedure and the rifhdreg procedure.
Since that would take time, my suggestion is to skip the bootstrap for now, until you settle on a specification and the story you will write, and then focus on the bootstrap (writing the bootstrap can be problematic).
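
One way to bootstrap both steps together is to wrap them in a small program and hand that program to the bootstrap prefix. The sketch below is an assumption-laden illustration on simulated data: the program name rif_ipw_boot, the variables (person_id, x1, y) and the number of replications are all made up, and it uses the ATE-type weights written above. It also assumes rifhdreg is installed.

```stata
* Simulated two-period panel; all names are hypothetical
clear
set seed 2021
set obs 500
gen person_id = _n
gen byte treated = runiform() < 0.4
gen x1 = rnormal() + 0.5*treated
expand 2
bysort person_id: gen byte post = _n - 1
gen y = 1 + 0.5*x1 + 0.3*treated*post + rnormal()
egen trtps = group(treated post)

* Wrapper: re-estimates the weights AND the RIF regression on each resample
capture program drop rif_ipw_boot
program define rif_ipw_boot, eclass
    capture drop p1 p2 p3 p4 ipw
    mlogit trtps x1
    predict double p1 p2 p3 p4, pr
    * ATE-type weights; swap in another formula here if needed
    gen double ipw = (trtps==1)/p1 + (trtps==2)/p2 + (trtps==3)/p3 + (trtps==4)/p4
    rifhdreg y i.treated##i.post x1 [aw=ipw], over(trtps) rif(q(50))
end

* Resample whole individuals so both periods of a person stay together
bootstrap _b, reps(50) cluster(person_id) seed(2021): rif_ipw_boot
```

Because the weights are generated inside the program, each bootstrap replication uses weights estimated on its own resample. If individual fixed effects are absorbed, bootstrap's idcluster() option may be needed so that resampled individuals get distinct identifiers.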

Best
Fernando
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#25

16 Jun 2021, 07:50

Dear Prof. Fernando Rios,

Many thanks for your response.

I am slightly confused. You say "to get IPW that weights for an ATT, I think you just need to divide everything by P1." I don't know what you mean by this, as in your code you divide each trtps indicator by its predicted probability: gen ipw = (trtps==1)/p1 + (trtps==2)/p2+(trtps==3)/p3+(trtps==4)/p4

At the moment I have the following code

egen trtps=group(treated post)

***IPW for ATT**
mlogit trtps $controls
predict p*
gen ipw = (trtps==1)/p1 + (trtps==2)/p2+(trtps==3)/p3+(trtps==4)/p4

rifhdreg y i.treated##i.post (other controls) [aw=ipw], over(trtps) rif( q(50)) abs(id)

I am actually surprised that I do not need to directly specify the att option in the rifhdreg command above. I do not understand how constructing my own weights enables the rifhdreg command to estimate the ATT as opposed to the standard ATE. Also, to confirm, the quantile treatment effect on the treated is provided by the interaction of treated with post?

Do you know where I can find the formal assumptions which need to hold for consistent/unbiased estimates of the quantile treatment effect on the treated? Reading the paper from Callaway and Li that you linked above, they mention an extension of the basic common trend assumption of DiD, namely that the idea of "parallel trends" must hold for the entire distribution instead of just holding on the mean. Is the method identified by Callaway and Li at all similar to that above? I am interpreting the above as the quantile treatment effect on the treated with DiD using 2 time periods and 2 groups; is this correct?

Also, "mlogit trtps $controls" requires a lot of iterations and reports that the likelihood is not concave. I guess I will have to specify the maximum number of allowed iterations.

Thanks again,
Ties
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#26

16 Jun 2021, 08:15

trtps==1 control pre, trtps==2 control post, ==3 patient pre and ==4 patient post.

The iterations for mlogit have now converged and I have run the code above. Including fixed effects at the individual level results in an error, "insufficient observations", which is quite strange as there are ~10,000 individuals, although I guess the above does estimate at a specific quantile. Without the individual-level fixed effects there is no problem. My only uncertainties concern the estimation of the IPW and how it is used to provide the ATT.

Thanks,
Ties
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2470
#27

16 Jun 2021, 09:15

My apologies.
I actually meant multiply it by p`k', where p`k' is the predicted probability of either the treated group before treatment or the treated group after treatment.

So it would be:

gen ipw = (trtps==1) + (trtps==2)*p1/p2+(trtps==3)*p1/p3+(trtps==4)*p1/p4

if you want the treatment effect for group 1 (the idea is to weight all other groups so they look like the group of interest).
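
The weighting idea above can be sketched end to end on simulated data. This is only an illustration under stated assumptions: the variables (x1, y, treated, post) and the local macro ref are hypothetical, the group numbering produced by egen group() should be verified with a tabulation, and rifhdreg must already be installed.

```stata
* Simulated 2x2 data; all names are hypothetical
clear
set seed 2021
set obs 4000
gen byte treated = runiform() < 0.4
gen byte post    = runiform() < 0.5
gen x1 = rnormal() + 0.5*treated
gen y  = 1 + 0.5*x1 + 0.3*treated*post + rnormal()

* Four cells; check which number egen assigns to which cell
egen trtps = group(treated post)
tabulate trtps treated
tabulate trtps post

* Predicted probability of belonging to each cell given the controls
mlogit trtps x1
predict double p1 p2 p3 p4, pr

* Reweight every cell towards the reference cell: weight = p_ref / p_k
local ref 1
gen double ipw = .
forvalues k = 1/4 {
    replace ipw = p`ref'/p`k' if trtps == `k'
}

* The reference cell itself keeps a weight of exactly 1
summarize ipw if trtps == `ref'

* RIF regression at the median using the hand-made weights
rifhdreg y i.treated##i.post x1 [aw=ipw], over(trtps) rif(q(50))
```

Changing the local macro ref to another cell number reweights all groups towards that cell instead, which is the only line that needs to change.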
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#28

16 Jun 2021, 13:08

Hi Fernando,

Ah I see, thank you. Why would you want the treated group before treatment? No treatment effect is expected pre-treatment, so should you not only be interested in the treatment effect for the patients post-treatment? This is group 4, so if I am correct it corresponds to the following:

gen ipw = (trtps==1)*p4/p1 + (trtps==2)*p4/p2+(trtps==3)*p4/p3+(trtps==4)

Just to clarify, using this IPW with the above provides the quantile treatment effect on the treated (i.e. the quantile equivalent of the ATT)?

Thanks,
Ties
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2470
#29

16 Jun 2021, 13:19

If you are using the same individuals, the effect using the treated group before the treatment and the treated group after the treatment should be the same.
Other than that, yes, I believe that identifies the ATT.
Comment
Ties Siebinga

Join Date: May 2021

Posts: 16
#30

17 Jun 2021, 01:31

Dear Prof. Fernando Rios,

Ok perfect - many thanks for all your help so far!!

My last question is with regards to the assumptions that need to hold for the estimation of the quantile treatment effect - do you know where I can find these?

Many thanks,
Ties
Comment
