Quantile regression with factor variables and clustered errors in Stata 14.2

Barbora Sedova

Join Date: Apr 2017

Posts: 63
#1

18 Jan 2019, 07:52

Dear all,

is it possible to conduct a quantile regression with factor variables and clustered standard errors in Stata 14.2?

xi: qreg2 is not working for me
Tags: None
Barbora Sedova

Join Date: Apr 2017

Posts: 63
#2

18 Jan 2019, 08:04

Also, I should point out that I do not have any time variation.
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2531
#3

18 Jan 2019, 08:26

Why are qreg and qreg2 not working for you?
Can you provide more information?
Comment
Barbora Sedova

Join Date: Apr 2017

Posts: 63
#4

18 Jan 2019, 08:29

I think qreg does not allow for clustering and qreg2 does not allow for factor variables?
I tried to run the following regression with both commands:

qreg2 Yln c.T_ei i.STATEID vce(cluster DISTRICT)
factor variables and time-series operators not allowed

xi: qreg2 Yln c.T_ei i.STATEID vce(cluster DISTRICT)
i.STATEID _ISTATEID_1-33 (naturally coded; _ISTATEID_1 omitted)
factor variables and time-series operators not allowed

qreg Yln c.T_ei i.STATEID vce(cluster DISTRICT)
variable vce not found

Any idea what I am doing wrong?
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2531
#5

18 Jan 2019, 08:32

Drop the “c.” in front of T_ei and you should be fine.
Comment
Barbora Sedova

Join Date: Apr 2017

Posts: 63
#6

18 Jan 2019, 08:42

I tried that one too and it does not work:

qreg Yln T_ei i.STATEID, vce(cluster DISTRICT)
option vce( cluster DISTRICT) is not allowed

qreg2 Yln T_ei i.STATEID , vce(cluster DISTRICT)
factor variables and time-series operators not allowed

xi: qreg2 Yln T_ei i.STATEID, vce(cluster DISTRICT)
i.STATEID _ISTATEID_1-33 (naturally coded; _ISTATEID_1 omitted)
option vce() not allowed

Maybe the problem is that I am using Stata 14.2?
Comment
Barbora Sedova

Join Date: Apr 2017

Posts: 63
#7

18 Jan 2019, 08:46

Ok this turns out to work: xi: qreg2 Yln T_ei STATEID, cluster( DISTRICT) !!
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3063
#8

18 Jan 2019, 08:59

Glad you sorted it out; the problem was that you were using the wrong option for the clustering.

Best wishes,

Joao
Comment
Barbora Sedova

Join Date: Apr 2017

Posts: 63
#9

18 Jan 2019, 09:20

Thanks a lot!
Also, I would like to estimate the model for 5 quantiles.
This, however, does not work: xi: qreg2 Yln T_ei STATEID, cluster( DISTRICT) q(.2 .4 .6 .8 1)

I suppose I need to do it one by one, no?

I am also wondering whether anyone knows how to report the results and export them as LaTeX code?
Many thanks!
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3063
#10

18 Jan 2019, 14:46

Yes, you need to do them one by one.

Best wishes,

Joao
Comment
Sven-Kristjan Bormann

Join Date: Jul 2018

Posts: 310
#11

19 Jan 2019, 07:22

You can estimate quantile regressions simultaneously for different quantiles using -sqreg- instead of -qreg-. However, -sqreg- cannot deal with clustered standard errors on its own.
For your example, the code would look something like this:

Code:

bs, cluster(DISTRICT) reps(50): sqreg Yln T_ei STATEID, q(.2 .4 .6 .8) reps(10)

This approach might take longer depending on your dataset, because the estimations are repeated many times. In the above setting, the estimation will be repeated 500 times.
Out of coding laziness, I prefer -sqreg- over the normal -qreg- or -qreg2-, because I normally run my estimations for the deciles of a distribution. -sqreg- tends to be a bit faster in these settings.
If there are not too many observations per cluster, then the difference between clustered and non-clustered standard errors can be small for large datasets.

Another remark: You cannot specify q(.2 .4 .6 .8 1), only q(.2 .4 .6 .8), because each quantile must lie strictly between 0 and 1.
Note also that you do not estimate the difference between the quintiles but, roughly speaking, the cumulative effect. Your estimation for the 40% quantile will include the observations for the 20% quantile as well.

When it comes to LaTeX code, there are some packages to generate LaTeX code from estimation results. Just run -net search latex- and pick one of them according to your needs. The -estout- package by Ben Jann can also write LaTeX code.
The other alternative is to export the results to Excel and then use the Excel2LaTeX add-in to convert the Excel tables into LaTeX.
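
For the one-by-one route confirmed in #10, a loop combined with -estout- might look something like the sketch below. It assumes that -qreg2- and -estout- are both installed from SSC, that the variable names are the ones used earlier in this thread, and that qreg_quintiles.tex is just a placeholder file name. It has not been run on the poster's data, which are not shown here.

```stata
* Assumptions: qreg2 and estout are installed
* (ssc install qreg2, ssc install estout).
* Run from a do-file, because /// line continuation is used below.
eststo clear
foreach q in 20 40 60 80 {
    * One clustered quantile regression per quantile; xi: expands i.STATEID
    quietly xi: qreg2 Yln T_ei i.STATEID, cluster(DISTRICT) quantile(.`q')
    eststo q`q'
}
* One column per quantile; keep() restricts the table to the coefficient on T_ei.
* qreg_quintiles.tex is a placeholder file name.
esttab q20 q40 q60 q80 using qreg_quintiles.tex, se keep(T_ei) ///
    mtitles("q20" "q40" "q60" "q80") booktabs replace
```

The booktabs option assumes the LaTeX document loads the booktabs package; dropping it should give a plain tabular instead.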
Comment
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#12

19 Jan 2019, 07:46

Barbora Sedova , you are violating pretty much all guidelines on how you should post on Statalist. You are not showing a sample of your data with -dataex-, you are not showing which command you executed and what exactly Stata returned to you as output (but instead you are declaring that something "does not work" or that something "has worked").

The guidelines on how to post on Statalist are there for a reason. When one does not follow them, he/she feeds garbage into the Statalist system, and through the iron law "garbage in, garbage out" he/she most likely will obtain in return garbage from Statalist.

I think that is what might have happened in this thread: we might have verified the iron law "garbage in, garbage out".

1. What you declared that "works" in #7 (xi: qreg2 Yln T_ei STATEID, cluster( DISTRICT)) is not doing what you want it to do. This is not expanding STATEID into a set of dummy variables; it is including STATEID linearly in your regression.

2. I do not know what the cryptic remark in #2 that "you do not have time variation" is supposed to mean, and it is not clear at all what your data look like. E.g., is STATEID nested within DISTRICT or the other way round? At what level are your observations (individual, state, district, etc.) and what are the variables included?
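
To illustrate point 1, here is a minimal sketch of the call with the state dummies actually expanded. It assumes -qreg2- is installed from SSC and that STATEID and DISTRICT are numeric identifiers, as the commands quoted in #4 and #6 suggest. The output in #6 indicates that xi: does expand i.STATEID and that only vce() was rejected, so switching to cluster() should be enough, but this has not been verified on the actual data.

```stata
* Assumption: qreg2 is installed (ssc install qreg2).
* With xi:, the i. prefix expands STATEID into _ISTATEID_* dummies;
* without it, STATEID enters as a single continuous regressor.
xi: qreg2 Yln T_ei i.STATEID, cluster(DISTRICT) quantile(.5)

* Check which dummy variables xi: created.
describe _ISTATEID_*
```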
Comment
Jackie Kleynhans

Join Date: Mar 2017

Posts: 6
#13

22 Apr 2021, 10:48

Originally posted by Sven-Kristjan Bormann in #11

Sven-Kristjan Bormann, thank you for your solution. I was sitting with a similar issue and was just about to give up. Your suggestion worked perfectly. Just a question on methodology: how is clustering dealt with within the bootstrap function? For example, if I use svy, it uses Taylor-linearised variance estimation and I can mention that in my methods section. How would I describe it in this instance?
Comment
