Quantile regression with clustered errors

Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#1

23 Mar 2016, 17:41

Good evening,

I would like to ask a question about quantile regression with clustered standard errors. I have read the paper by Parente and Santos Silva, and I am using the command qreg2 in Stata to analyze a set of countries over a time span of 20 years. My understanding is that this methodology is the closest technique to panel data estimation using quantile regression and that, because I cluster the standard errors by country, it is similar to a fixed effects estimation with panel data. Am I right? I would really appreciate it if anyone could explain the methodology to me in simple words. I have been asked whether this kind of analysis implies a pooling of regressions that are time series in nature, and I do not know how to answer this question.

Thanks in advance

Kind regards
Tags: None
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#2

24 Mar 2016, 01:32

Dear Marcos,

Thank you for your interest in our work. I am afraid you are not right: what you are estimating is the quantile regression equivalent of pooled OLS with clustered standard errors.

Best regards,

Joao
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#3

24 Mar 2016, 02:24

Thank you for your answer,

So can you explain what it means that the standard errors are clustered by a group variable? What is the difference between general quantile regression using commands qreg or sqreg, and your command qreg2?

The thing is that I would like to understand the methodology in order to explain it properly in my paper.

Thanks in advance
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#4

24 Mar 2016, 11:45

Marcos,

As you will see, the estimates obtained with the 3 commands are the same. The difference is that -qreg2- allows you to compute "clustered standard errors".
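
The point about identical estimates can be checked with a small sketch like the one below. It is only an illustration on simulated data: the variable names (y, x1, countryid, year), the sample size, and the parameter values are all assumptions, not anything from the original question. The -qreg2- call is guarded because -qreg2- is user-written and may not be installed.

```stata
* Simulated panel; all variable names and parameter values are illustrative.
clear
set seed 2016
set obs 400
gen int countryid = ceil(_n/20)                 // 20 countries, 20 years each
bysort countryid: gen int year = _n
bysort countryid (year): gen double a = rnormal() if _n == 1
bysort countryid (year): replace a = a[1]       // country-level shock shared over time
gen double x1 = rnormal()
gen double y  = 1 + 0.5*x1 + a + rnormal()

qreg y x1, quantile(.5)                         // standard errors assume independent observations
scalar b_qreg = _b[x1]

sqreg y x1, quantiles(.5) reps(50)              // same point estimate, bootstrap standard errors
scalar b_sqreg = _b[q50:x1]

display "qreg:  " b_qreg
display "sqreg: " b_sqreg

* qreg2 is user-written; run it only if it is already installed.
capture which qreg2
if _rc == 0 {
    qreg2 y x1, q(.5) cluster(countryid)        // same point estimate, clustered standard errors
}
```

Only the reported standard errors (and hence the t-tests) should differ across the commands; the coefficient on x1 should not.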

In a panel, it is likely that the observations for each individual are correlated over time, although observations from different individuals are independent. Therefore, to compute a valid covariance matrix you need to take this structure into account, and that is what "clustered standard errors" do. In short, if you simply use -qreg- or -sqreg-, the t-tests reported are generally invalid when you estimate the model with panel data; -qreg2- allows you to bypass that problem.
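
As a rough sketch of what "taking this structure into account" means, the cluster-robust covariance has the usual sandwich form. The notation below is an assumption made for illustration (g indexes clusters, i and j index observations within a cluster, θ is the quantile, c_G is a bandwidth); the exact estimator, the bandwidth choice, and the regularity conditions are in the Parente and Santos Silva paper and are not reproduced here.

```latex
% Sandwich form of the asymptotic covariance with G clusters
\[
\sqrt{G}\,\bigl(\hat\beta_\theta - \beta_\theta\bigr)
  \xrightarrow{d} N\!\bigl(0,\; B^{-1} A B^{-1}\bigr)
\]
% "Meat": cross-products within each cluster, including the i \neq j terms
\[
\hat A = \frac{1}{G}\sum_{g=1}^{G}\sum_{i=1}^{n_g}\sum_{j=1}^{n_g}
  x_{gi}\,x_{gj}'\,\psi_\theta(\hat u_{gi})\,\psi_\theta(\hat u_{gj}),
\qquad
\psi_\theta(u) = \theta - \mathbf{1}(u < 0)
\]
% "Bread": kernel-type estimate of the density-weighted second moment
\[
\hat B = \frac{1}{2\,c_G\,G}\sum_{g=1}^{G}\sum_{i=1}^{n_g}
  \mathbf{1}\bigl(|\hat u_{gi}| \le c_G\bigr)\,x_{gi}\,x_{gi}'
\]
```

If the observations within a cluster were independent, the i ≠ j terms in the middle matrix would have expectation zero and the expression would reduce to the standard heteroskedasticity-robust form; clustering keeps those within-cluster cross terms.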

Hope this helps but you should read about "clustered standard errors" in a good textbook.

Joao
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#5

24 Mar 2016, 17:58

Thank you very much again for your helpful answer,

What I have understood is that the difference between regular quantile regression and the command qreg2 is the way they calculate standard errors.

So, the thing is that I have data for a set of countries over a time span of 20 years. Therefore my data set has a panel structure and, as you have said, if I use ordinary quantile regression with qreg or sqreg, the estimated covariance matrix is not valid. Am I right? So the smartest thing to do is to use qreg2, which solves this problem. This is what I have understood from your answer, and I hope I have understood it correctly.

I am working on this issue in my paper and I have been asked: “Does the analysis involve a pooling of regressions that are time series in nature?” I do not know what the answer should be. Does the qreg2 command run pooled regressions and take into account that they are time series, given that the data set has been set up as panel data?

Could you recommend a paper or textbook that explains how clustered standard errors are calculated, so that I can understand them better?

Sorry for so many questions but I think that nobody can answer these questions better than one of the authors of this command and this methodology.

Kind regards
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#6

25 Mar 2016, 01:32

Yes, that is broadly right and indeed your data does that pooling. I suggest you have a look at any of the textbooks by Wooldridge or by Cameron and Trivedi.

Best of luck,

Joao
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#7

25 Mar 2016, 04:15

Thank you very much for your answer again.

I have understood more or less everything I need to go on with my paper. I may ask you some more questions about this issue in the future, if that does not bother you. Should I open a new post, or can I continue in this thread?

Thanks a lot

Regards
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#8

26 Mar 2016, 08:46

Sure. If it is on the same topic, it is fine to use this thread; otherwise it is better to open a new one.

Best wishes,

Joao
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2158
#9

26 Mar 2016, 10:54

In my 2010 MIT Press textbook, Econometric Analysis of Cross Section and Panel Data, 2e, Section 12.10.3, I discuss various approaches to quantile regression with panel data. As an approximation to what one might mean by "fixed effects," one can use the Mundlak-Chamberlain device. Or, for median estimation, difference or use the within deviations in a LAD estimation. Everything that we know how to do is an approximation. I tend to prefer quantile regression with the Mundlak device.

I might also (immodestly) point out that in the same Section 12.10.3, I suggested the use of the same clustered standard errors as Parente and Santos Silva. (It did not appear in the first edition, 2002.) The material actually dates back to my NBER lectures with Guido Imbens starting in 2007. Of course, I didn't do the hard work of verifying the regularity conditions. :-)

NBER 2007
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#10

28 Mar 2016, 07:48

Thank you for your answer Mr Wooldridge,

I have read the document you attached. Is it possible to apply any of these techniques (the Mundlak approach, for instance) in Stata? If it is not possible, I assume that the better choice is to use the Parente and Santos Silva command with clustered standard errors.

Regards
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2158
#11

28 Mar 2016, 10:01

It's pretty easy to use the Mundlak device along with the Parente/Santos Silva software. You need to compute the time averages by country for the time-varying explanatory variables.

Code:

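* Time averages by country for each time-varying explanatory variable
egen x1bar = mean(x1), by(countryid)
egen x2bar = mean(x2), by(countryid)
...
egen xKbar = mean(xK), by(countryid)
* Pooled quantile regression with the Mundlak terms and clustered standard errors
qreg2 y x1 ... xK x1bar ... xKbar z1 ... zJ d2 ... dT, q(.5) cluster(countryid)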

z1 ... zJ are time-constant variables and d2 ... dT are the time dummies. Of course you can use any quantile you want.

JW
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#12

29 Mar 2016, 04:16

Thank you very much for your answer Mr Wooldridge,

I have another question. When you use the Mundlak device with the code you have given me, which coefficients do you interpret from the results? The coefficients on the x1 ... xK variables, the coefficients on the x1bar ... xKbar variables, or both? For instance, suppose that the coefficient on x1 is not significant but the coefficient on x1bar is highly significant. Can I infer something from that?

Regards
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2158
#13

30 Mar 2016, 09:50

Marcos: That's not good news, in the sense that it's essentially the same result as the usual FE estimates being insignificant. If you were to use OLS rather than quantile regression, the coefficients on x1 ... xK would be identical to the FE estimates. Then, we would conclude that the heterogeneity is correlated with the covariates. You're finding that any effect you find of, say, x1 when not controlling for x1bar must be treated as spurious.

In the regression case, testing x1bar ... xKbar is the regression-based version of the Hausman test. So, you are rejecting pooled quantile regression in favor of the Mundlak approach. Sorry, but that's how it often works out: A variable is statistically significant using pooled OLS or RE, but not when you use FE. You are finding the analogous result for quantile regression using Mundlak.
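
The OLS statement above (the coefficient on x1 in a regression that includes x1bar equals the FE estimate) can be verified with a minimal sketch on simulated data. Everything here is an assumption made for illustration: a balanced panel of 10 countries over 20 years, a single regressor x1, and made-up parameter values. It uses only built-in commands, so it shows the linear case, not the quantile regression itself.

```stata
* Simulated balanced panel; names and parameter values are illustrative.
clear
set seed 12345
set obs 200
gen int countryid = ceil(_n/20)                 // 10 countries, 20 years each
bysort countryid: gen int year = _n
bysort countryid (year): gen double a = rnormal() if _n == 1
bysort countryid (year): replace a = a[1]       // time-constant heterogeneity
gen double x1 = a + rnormal()                   // x1 is correlated with the heterogeneity
gen double y  = 1 + 0.5*x1 + a + rnormal()

* Mundlak device with OLS: add the country mean of x1
egen double x1bar = mean(x1), by(countryid)
regress y x1 x1bar, vce(cluster countryid)
scalar b_mundlak = _b[x1]
test x1bar                                      // regression-based Hausman-type test

* Usual fixed effects (within) estimator
xtset countryid year
xtreg y x1, fe vce(cluster countryid)
scalar b_fe = _b[x1]

display "Mundlak OLS: " b_mundlak "   FE: " b_fe
assert reldif(b_mundlak, b_fe) < 1e-8
```

With quantile regression the Mundlak terms no longer reproduce an exact "within" estimator, which is why the approach is described in this thread as an approximation.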
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#14

01 Apr 2016, 05:42

Thank you very much for your answer, it was very helpful

I have another question that maybe someone can answer. Is there a way to choose the quantiles? How can I justify the selection of the quantiles? I would like to analyze how my dependent variable relates to the independent variables along the whole distribution, so I am using the quantiles 0.05, 0.25, 0.5, 0.75 and 0.95. Is that correct? Should I choose other quantiles?

Thanks in advance
Comment
Marcos Gonzalez

Join Date: Nov 2015

Posts: 36
#15

06 Apr 2016, 06:03

Good morning Mr Santos Silva and Mr Wooldridge,

I am sorry for bothering both of you again, but I have another question related to this topic. Can either of you tell me the mathematical expressions for the standard (non-clustered) standard errors and for the clustered standard errors in quantile regression? I would need a mathematical expression for each, if possible.

Thank you very much in advance
Comment
