  • Find out what centile of a distribution a given value represents

    I have an income distribution and would like to find out which percentile a given income (e.g. 500 USD) represents, and to include this in a .do file to share with collaborators. Is this possible with a built-in Stata command?

  • #2
    Castor:
    do you mean something along the following lines?
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . help centile
    
    . centile ln_wage , centile(10(10)90)
    
                                                              Binom. interp.   
        Variable |       Obs  Percentile    Centile        [95% conf. interval]
    -------------+-------------------------------------------------------------
         ln_wage |    28,534         10    1.166102        1.157089    1.168778
                 |                   20    1.301507        1.297063    1.308635
                 |                   30    1.420336        1.410987    1.428017
                 |                   40    1.530165         1.52287    1.536367
                 |                   50    1.640541        1.633343    1.647948
                 |                   60    1.758746        1.753038    1.764279
                 |                   70     1.88922         1.87977    1.896542
                 |                   80    2.048963         2.04157    2.058625
                 |                   90     2.27569        2.266884    2.284872
    
    . sum ln_wage
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         ln_wage |     28,534    1.674907    .4780935          0   5.263916
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Ciao Carlo, No. The above can be easily found online.
      My question is: what exact percentage of the sample has an ln_wage below 1.147 (in the above example).



      • #4
        Hi Castor
        There are perhaps two options for that.
        1) Using the empirical distribution:

        * ln_wage is the variable name in the nlswork example above
        gen w1 = ln_wage < 1.147 if !missing(ln_wage)
        gen w2 = ln_wage <= 1.147 if !missing(ln_wage)
        sum w1 w2
        The means of w1 and w2 give you two alternative versions of the percentile associated with 1.147 in your data (as proportions; multiply by 100 for percentages). I use < and <= because I have seen both definitions used for quantiles.
        2) Estimating the CDF, F(ln_wage), with a semiparametric approach (like what kdensity does, but for the CDF rather than the PDF), and then simply evaluating the estimated F at 1.147.
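        A minimal sketch of the empirical approach that can go straight into a .do file, assuming the nlswork data from #2 (where the variable is named ln_wage). The scalar names n_total and n_below and the variable name ecdf are only illustrative.

        ```stata
        * Assumption: nlswork data from #2, variable ln_wage
        use "https://www.stata-press.com/data/r17/nlswork.dta", clear

        * Denominator: number of nonmissing observations
        count if !missing(ln_wage)
        scalar n_total = r(N)

        * Numerator: observations strictly below the threshold
        * (missing values sort above every number, so they are not counted)
        count if ln_wage < 1.147
        scalar n_below = r(N)

        display "Percent below 1.147: " %6.2f 100*n_below/n_total

        * Optional: empirical CDF (share <= each value) for every observation
        cumul ln_wage, generate(ecdf) equal
        ```

        Using count rather than generating indicator variables leaves the dataset unchanged, which may be convenient in a shared .do file; swap < for <= if you prefer the other definition.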

        If you want to use integrals, this is also possible with kdensity and integ (shown here on a different dataset, with a threshold of 2):

        use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
        kdensity lnwage, gen(fd) at(lnwage)
        integ fd lnwage if lnwage<=2
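
        A self-contained sketch of the same idea that also prints the result and, for comparison, the empirical share. It assumes the oaxaca data and the threshold of 2 used above; n_total is an illustrative name. Note that the integral starts at the smallest observed lnwage, so any density mass the kernel places below the sample minimum is left out.

        ```stata
        * Same data and threshold as the snippet above
        use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear

        * Kernel density evaluated at each observed lnwage, without drawing the graph
        kdensity lnwage, generate(fd) at(lnwage) nograph

        * Area under the estimated density up to 2; integ leaves it in r(integral)
        integ fd lnwage if lnwage <= 2
        display "Smoothed estimate of F(2): " %6.4f r(integral)

        * Empirical share at or below 2 among nonmissing values, for comparison
        count if !missing(lnwage)
        scalar n_total = r(N)
        count if lnwage <= 2
        display "Empirical share <= 2:      " %6.4f r(N)/n_total
        ```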

        HTH



        • #5
          https://www.stata.com/support/faqs/s...ing-positions/ gives further discussion.
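
          As a rough sketch of one common percentile-rank convention, 100 * (rank - 0.5) / n, again assuming the nlswork data from #2. Whether this is the definition you want is a judgment call, and the variable names n, rnk and pcrank are only illustrative.

          ```stata
          * Assumption: nlswork data from #2, variable ln_wage
          use "https://www.stata-press.com/data/r17/nlswork.dta", clear

          * Number of nonmissing values, repeated on every observation
          egen n = count(ln_wage)

          * Ranks; egen's default gives tied values their average rank
          egen rnk = rank(ln_wage)

          * Percentile rank on a 0-100 scale
          gen pcrank = 100 * (rnk - 0.5) / n

          * Percentile ranks of observed values close to 1.147
          summarize pcrank if inrange(ln_wage, 1.14, 1.15)
          ```

          This attaches a percentile rank to every observed value; for a value that does not occur in the data, the count-based share of observations below it is the more direct answer.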

