
  • A Question about Quintile Regression

    Hi,

    I want to run HLM within different quintiles of the dependent variable. The previous studies I have seen usually used the 10th, 25th, 50th, 75th, and 90th percentiles as the quintile points. What ranges do such quintiles cover? Are they 0-10%, 10-25%, 25-50%, 50-75%, and 75-90%?

    Also, is this a kind of quantile regression? How should I describe this method?

    Thank you!

  • #2
    Breaks for quintiles are 20, 40, 60, 80% points in a distribution. Quintiles are a special case of quantiles.
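
    A minimal Stata sketch of that definition. The auto dataset shipped with Stata and its variable price are stand-ins chosen for illustration, and price5 is a hypothetical variable name; none of them come from this thread. _pctile returns the break points (levels), while xtile assigns each observation to one of the five bins those breaks delimit.

    ```stata
    * Sketch only: auto and price are illustrative stand-ins
    sysuse auto, clear

    * Quintile break points: the 20, 40, 60 and 80% points of price
    _pctile price, percentiles(20 40 60 80)
    return list

    * Quintile-based bins: 1 = lowest fifth, ..., 5 = highest fifth
    xtile price5 = price, nquantiles(5)
    tabulate price5
    ```

    With ties or small samples the bins need not hold exactly 20% of the observations each.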



    • #3
      Originally posted by Nick Cox:
      "Breaks for quintiles are 20, 40, 60, 80% points in a distribution. Quintiles are a special case of quantiles."

      Thank you so much! I really want to include 50% as one of the breaks. Is it better to use quartiles instead?



      • #4
        Minda: You can do any quantiles you want. It’s natural to include the median no matter what other quantiles you use.
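
        If the advice above is read as being about quantile regression (an assumption), fitting several quantiles including the median could look like the sketch below. The auto dataset and the model price on weight are illustrative stand-ins, not the poster's data or model.

        ```stata
        * Sketch only: auto, price and weight are illustrative stand-ins
        sysuse auto, clear

        * Quantile regression at the 10th, 25th, 50th (median), 75th and 90th percentiles
        foreach q of numlist 0.1 0.25 0.5 0.75 0.9 {
            display as text _newline "Quantile: `q'"
            qreg price weight, quantile(`q')
        }
        ```

        Each fit uses all observations; only the conditional quantile being modelled changes.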



        • #5
          Originally posted by Jeff Wooldridge:
          "Minda: You can do any quantiles you want. It’s natural to include the median no matter what other quantiles you use."

          Thank you, Jeff! I have a further question on this method. I know that Stata has a specific command for quantile regression, qreg. However, I want to combine HLM and quintile regression. If I run HLM on each subgroup defined by the quintiles, will I get the right estimates?



          • #6
            Are you sure that that's legit? Fitting separate hierarchical models to subsets of data where the outcome variable lies between quantiles of its observed values would distort estimates of the variance components at the least, wouldn't it? Run the do-file below to see what I mean. (To simplify output, instead of fitting separate regression models, I've just formed interaction terms and used only quartiles.)
            Code:
            version 16.1
            
            clear *
            
            set seed `=strreverse("1542573")'
            quietly set obs 100
            
            generate byte cid = _n
            generate double cid_u = rnormal(0, sqrt(2)) // Variance component for cluster of about 2
            generate double bpr = runiform(-0.5, 0.5) // Between predictor
            
            quietly expand 50
            generate double wpr = runiform(-0.5, 0.5) // Within predictor
            
            generate double out = cid_u + rnormal() // Residual variance of about 1
            
            *
            * Begin here
            *
            quietly centile out, centile(25 50 75)
            generate byte cut = 0
            forvalues i = 1/3 {
                quietly replace cut = cut + 1 if out > r(c_`i')
            }
            
            mixed out i.cut##(c.?pr) || cid: , nolrtest nolog
            
            mixed out c.?pr || cid: , nofetable noheader nolrtest nolog
            
            exit
            If you're interested only in the fixed-effects estimates and you consider them truly orthogonal, then perhaps the point estimates wouldn't be so biased, but (correct me if I'm wrong) it would still seem to complicate interpretation.



            • #7
              I guess two quite different questions are being conflated here.

              1. HLM or anything else done separately for different subsets, here as determined by particular quantiles of some variable.

              2. Quantile regression.

              I don't see any link between the two at all except that quantiles appear in both.

              There is a common confusion of terminology underlying this. Quantiles are, historically, particular levels, but in many fields (e.g. various parts of applied economics) the term now is widely understood to mean the bins, classes, or intervals those levels delimit.

              To see which way you jump, answer this: Is the first quintile

              1. The value (a value, with a nod to @Jeff Wooldridge) such that 20% of values are lower and 80% higher? (Precisely what that means in terms of calculations with data is important detail.)

              2. The lowest 20% of a distribution on one variable, and by extension those observations in a dataset?

              More on terminology in the Appendix of https://journals.sagepub.com/doi/pdf...867X1601600413 (I've since encountered "trentiles".)

              None of this undermines or solves the key questions in #6.
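
              A hedged Stata sketch of the two questions side by side. The auto dataset, the model price on weight, and the variable name pricebin are illustrative assumptions, and regress stands in for a hierarchical model purely to keep the example short.

              ```stata
              * Sketch only: auto, price, weight and pricebin are illustrative stand-ins
              sysuse auto, clear

              * Question 1: a model fitted separately within bins of the OUTCOME
              * (quartile bins of price; regress stands in for a hierarchical model)
              xtile pricebin = price, nquantiles(4)
              bysort pricebin: regress price weight

              * Question 2: quantile regression uses ALL observations and models
              * a conditional quantile of price given weight (here the 25th percentile)
              qreg price weight, quantile(.25)
              ```

              The first approach selects observations on the outcome itself, which is the concern raised in #6; the second does no such selection.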



              • #8
                In reply to Nick Cox (#7):

                Thank you for the explanation. I think I get it now.

                I have learnt both quantile regression and "quintile regression" from a class. I did mix them up. When I tried to find a clear answer online, most articles talked about quantile regression. That made me really confused.

