  • Quantile Regression with Clustered Standard Errors and Factor Variables

    Hello everyone,

    For my thesis, I'm trying to do a quantile regression on an income variable called yearlyincome for 2 groups separately. I would like to do this quantile reg using both factor variables and clustered standard errors, as I'm using panel data.

    From what I know, there are multiple options for this:

    - Using qreg command: this allows for quantiles and factor variables, but not clustered standard errors.
    - Using qreg2 command: this allows for quantiles and clustered standard errors, but not factor variables.
    - Using reg: this allows for quantiles, clustered standard errors and factor variables. The normal reg command therefore seems preferable, but you would first need to split your income variable into quantiles with the xtile command.

    So basically, I would like to use the normal reg command to do this as it allows everything I want (factor vars and clustered standard errors). I used the xtile command to divide my data into 4 quantiles and that's all fine.

    The problem is, I wanted to check that I was doing things right, so I ran both a quantile regression using the qreg command and a regression using the reg command restricted to one of the created quantiles. For clarification, I first created a variable to indicate quantiles:

    Code:
    xtile Quantiles = yearlyincome, nq(4)
    Then, I ran both regressions:

    Code:
    reg yearlyincome age88 age88sq i.civstatus i.education potworkexp if Quantiles==1
    
    qreg yearlyincome age88 age88sq i.civstatus i.education potworkexp, quantile(.25)
    This should yield the same results, right? I figured that if I created 4 quantiles (I want the .25, .50 and .75 quantiles), quantile no. 1 created with the xtile command is the .25 quantile, right? Hence I thought the above commands should give exactly the same results, as I'm estimating the first quantile in both.

    However, for some reason they don't. The thing is, I have a valid sample size of 13,713 observations. The reg command only takes 2,611 observations into account, while the qreg command takes into account all 13,713 observations.

    So from my thinking, that's the reason they're not yielding the same results. But why not? If anyone could tell me, that would be greatly appreciated!

    Best,

    Last edited by Arne RW; 12 Jun 2015, 09:10.

  • #2
    Dear Arne,

    I am afraid you are making a very common mistake. The command -reg- runs an OLS regression, which estimates (an approximation to) the conditional mean. The commands -qreg- and -qreg2- run quantile regressions, which estimate (an approximation to) a conditional quantile. The important thing to keep in mind is that a conditional quantile is not the conditional mean of some sub-sample of your data. In fact, quantiles have nothing to do with means, except that they all provide information on the location of the distribution. Therefore, you really cannot use -reg- to estimate quantiles.
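
    To make the distinction concrete, the two estimands can be written side by side. This is the standard textbook formulation rather than anything specific to -qreg2-; the symbols (y for the outcome, x for the regressors, \tau for the quantile index, \rho_\tau for the check function) are notation introduced here for illustration.

    ```latex
    \[
    \begin{aligned}
    \text{OLS (-reg-):}\quad & E[y \mid x] = x'\beta,
      && \hat{\beta} = \arg\min_{b} \sum_{i=1}^{n} (y_i - x_i'b)^2 \\
    \text{Quantile regression (-qreg-, -qreg2-):}\quad & Q_\tau(y \mid x) = x'\beta(\tau),
      && \hat{\beta}(\tau) = \arg\min_{b} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'b) \\
    & \rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
    \end{aligned}
    \]
    ```

    Both sums run over all n observations. Quantile regression changes the loss function, not the sample, which is why -qreg- reports the full sample size while -reg- with an if condition does not.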

    So, if you want to estimate a quantile regression with clustered standard errors you will have to use -qreg2-. You are right in saying that it does not allow you to use factor variables, but that is not a problem because you can start by using -xi- to create all the variables you need, and then use -qreg2- with the variables you just created. Alternatively, you can simply use

    Code:
    xi: qreg2 yearlyincome age88 age88sq i.civstatus i.education potworkexp, quantile(.25)
    Hope this helps and thanks for your interest in -qreg2-.

    Joao

    Last edited by Joao Santos Silva; 12 Jun 2015, 14:59.


    • #3
      Dear Joao,

      Thanks very much for your quick and helpful reply.

      Now that you mention it, it indeed seems illogical to use the -reg- command for estimating quantiles, since it estimates means. I was thinking I could use it because I first separated the yearlyincome variable into quantiles. But if I understand it correctly now, using the -reg- command with an if condition to run the regression on the right quantile would still estimate the mean within that quantile, right?

      I've just tried using the -xi- command together with the -qreg2- command and it works perfectly fine.

      Thanks again for your help. I didn't know you were one of the creators of -qreg2-, but good job!

      Arne


      • #4
        Dear Arne,

        What you were doing was estimating the conditional mean for the observations in a certain unconditional quantile, which is unlikely to provide interesting or meaningful results. Surprisingly, a lot of people think that is how quantile regression is performed.
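
        The difference is easy to reproduce. The following is a minimal sketch, assuming the auto.dta example dataset that ships with Stata; price, mpg and weight merely stand in for the income model from post #1, and pricegrp, n_ols and n_qreg are names made up for this illustration.

        ```stata
        * Illustration only: auto.dta ships with Stata; these variables
        * stand in for the income model discussed in this thread.
        sysuse auto, clear

        * Split the outcome into unconditional quartile groups, as in post #1
        xtile pricegrp = price, nq(4)

        * OLS on the bottom group: a conditional MEAN within a slice of the outcome
        regress price mpg weight if pricegrp == 1
        scalar n_ols = e(N)

        * Quantile regression: the conditional .25 QUANTILE, using every observation
        qreg price mpg weight, quantile(.25)
        scalar n_qreg = e(N)

        display "OLS on the bottom group uses " n_ols " obs; qreg uses " n_qreg " obs"
        ```

        The two commands answer different questions, so neither the sample sizes nor the coefficients should be expected to agree.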

        Thanks for your feedback on -qreg2-; I am glad you found it useful. If you end up using the clustered standard errors, please cite:

        Parente, P.M.D.C. and Santos Silva, J.M.C. (2016), Quantile Regression with Clustered Data, Journal of Econometric Methods, forthcoming.

        Best of luck,

        Joao


        • #5
          Dear Joao,

          Like you said, many people misinterpret quantile regression. I myself have some trouble interpreting the results as well.

          Let's assume, using the formula you provided above, I get coefficients like this (I've left some out as interpretation should be the same for all) for the .25 quantile, assuming dollar amounts:
          Code:
          _cons 65,932
          age88 -4,681
          civstatus 19,488
          potworkexp 6,290
          How would I properly interpret these results (without considering significance, standard errors or CIs)? Does it mean people at the .25 quantile have a yearly income of $65,932 when the rest of the variables are equal to zero? And would a one-unit change in civstatus increase this yearly income by $19,488 for people at the .25 quantile? Or does that hold for people up to and including the .25 quantile, in other words, the entire range from 0 to the .25 quantile? I would guess these are the estimates for people at the .25 quantile, but I would like to be sure.

          Thanks in advance, and I will cite you properly in my final thesis.


          • #6
            Dear Arne,

            Think about how you would interpret the OLS results: what you get are estimates of (an approximation to) the conditional mean; the intercept tells you the location of the mean when the regressors are zero, and the slopes tell you how the mean changes when you change the regressors. So, all of this is about the mean of the distribution, not about people at the mean (for instance, for a binary variable the mean is generally between 0 and 1, and therefore the variable is never equal to the mean).

            For quantile regression the reasoning is the same: you get information about where the quantile is located and how it shifts with the regressors. So, it is not about people in a certain region; it is about the location of the quantile.

            One final note about terminology: for a continuous variable, the quantile is a point, not a range. For example, for a uniform (0,1) variable, the first quartile is 0.25, not the observations between 0 and 0.25, OK?
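
            The uniform example can be checked by simulation. This is a rough sketch with simulated data; the seed, the sample size and the names u and q25 are arbitrary choices made for this illustration.

            ```stata
            * Simulated data; the seed and sample size are arbitrary choices
            clear
            set seed 12345
            set obs 100000
            generate u = runiform()

            * The first quartile is a single number, which should be close to 0.25
            _pctile u, percentiles(25)
            scalar q25 = r(r1)
            display "first quartile of u: " q25

            * The observations at or below that point form a range,
            * which is a different object from the quantile itself
            count if u <= q25
            ```

            The scalar q25 is the quantile; the observations counted in the last line are what -xtile- would label as group 1.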

            All the best,

            Joao


            • #7
              Dear Joao,

              I would like to know how to interpret the Parente-Santos Silva test for intra-cluster correlation. What am I looking for in this test? Do I want to accept or reject the null hypothesis? It is not clear to me. I would appreciate your answer.

              Thanks in advance


              • #8
                Hi Marcos,

                In the future, please open a new thread for a new question, OK?

                Anyway, the null hypothesis of the test is that there is no intra-cluster correlation. Whether you want to accept it or not is up to you.

                Please do let me know if you have further questions,

                Joao


                • #9
                  Thanks a lot for your answer. I was not sure whether I should ask again in this thread or open a new one. I am analyzing the determinants of debt maturity for a set of European countries, so I am using quantile regressions clustered by country. However, I do not know whether, from an econometric point of view, the existence of intra-cluster correlation is good or bad in my case. In all cases I obtain p-values that reject the null hypothesis, so there is intra-cluster correlation, and I do not know how to interpret that in my case.

                  Thanks in advance


                  • #10
                    If you have intra-cluster correlation it is safer to use the cluster-robust standard errors. That's essentially what you have to do. You may also try to reduce or eliminate the correlation by changing the specification of the model, if that is feasible.
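
                    In practice, requesting cluster-robust standard errors might look like the sketch below. It assumes the user-written -qreg2- package has been installed from SSC and that the nlswork example dataset can be downloaded with -webuse-; ln_wage, age, tenure, race and idcode are variables from that dataset, used here only as stand-ins for the models discussed in this thread.

                    ```stata
                    * Assumption: -qreg2- is user-written and must be installed once, e.g.
                    *   ssc install qreg2
                    * Assumption: nlswork is a Stata example panel fetched over the internet
                    webuse nlswork, clear

                    * Quantile regression at the .25 quantile with standard errors
                    * clustered on the panel identifier; -xi:- expands i.race
                    xi: qreg2 ln_wage age tenure i.race, quantile(.25) cluster(idcode)
                    ```

                    With cluster() specified, the standard errors remain valid whether or not the observations within a cluster are correlated, which is why they are the safer choice when the test rejects.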

                    All the best,

                    Joao


                    • #11
                      Hello everyone

                      I also need help. I am running a panel quantile regression and I am only getting the coefficients of the independent variables. The p-values, standard errors and confidence intervals are not generated.

                      I used this command:

                      Code:
                      qregpd LogEmissions LogGDP LogGDP2 Indu manu Trade Apopn, id(Country) fix(Year) quantile(0.7)

                      Please help. Is this the right command?


                      • #12
                        Dear Fortune,

                        You should open a new thread for this because you are talking about something totally different.

                        Best wishes,

                        Joao


                        • #13
                          Dear Joao

                          In #2 you mentioned the xi: qreg2 code. Can you explain what the xi prefix means here?


                          • #14
                            It allows the use of factor notation.
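
                            As a rough illustration of what the prefix does, the sketch below uses the auto.dta example dataset that ships with Stata, which is not the poster's data. The built-in -qreg- is used so that it runs without installing anything; the same varlist could be passed to -qreg2-. The matrix names b_manual and b_prefix are made up for this example.

                            ```stata
                            * Illustration only: auto.dta ships with Stata
                            sysuse auto, clear

                            * -xi- expands i.rep78 into dummy variables named _Irep78_*,
                            * omitting one category as the base
                            xi i.rep78
                            describe _Irep78_*

                            * Using the generated dummies directly ...
                            qreg price mpg _Irep78_*, quantile(.25)
                            matrix b_manual = e(b)

                            * ... fits the same model as letting the prefix do the expansion
                            xi: qreg price mpg i.rep78, quantile(.25)
                            matrix b_prefix = e(b)
                            ```

                            Either route hands the command a plain list of variables, which is all a command without factor-variable support needs.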


                            • #15
                              Good morning Prof. Santos Silva. I have the following code to run a quantile regression, but when I try to run it for the 0.1 quantile it returns an error. The code and the error are given below. I would appreciate any help. Thanks in advance!

                              global prodvars logY3a logY4a logY3asq logY4asq logY34a logY3apk logY4apk logY3apl logY4apl logY3aequity_ logY4aequity_ logpk logpl logequity_ logpksq logplsq logequity_sq logpkpl logequity_pk logequity_pl

                              global envars logOffBalance Z1a Z2 Z12 Z13 Z14 Z15 NPLsh2001 NPLsh2002 NPLsh2003 NPLsh2004 NPLsh2005 NPLsh2006 NPLsh2007 NPLsh2008 NPLsh2009 NPLsh2010 NPLsh2011 NPLsh2012 NPLsh2013

                              global annvars fukushima2002 fukushima2003 fukushima2004 fukushima2005 fukushima2006 fukushima2007 fukushima2008 fukushima2009 fukushima2010 fukushima2011 fukushima2012 fukushima2013 d2002 d2003 d2004 d2005 d2006 d2007 d2008 d2009 d2010 d2011 d2012 d2013

                              global ylist logtc

                              global xlist $prodvars $envars $annvars

                              describe $ylist $xlist
                              summarize $ylist $xlist

                              qreg2 $ylist $xlist, quantile (.1)

                              convergence not achieved.
                              VCE computation failed; try increasing the maximum number of iterations or try bsqreg



