  • How to divide the samples into deciles?

    Hi! I am new to Stata.

    I need to divide my sample into deciles within each year and industry.
    I have 20 years and 48 industries, and I need to rank the firms' cash variable into deciles within each year and industry.

    Could anyone help me?

    Thank you very much

  • #2
    Christika:
    see -help pctile-
    Kind regards,
    Carlo
    (Stata 19.0)
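
For readers who prefer official commands only, here is a minimal sketch of one way to get within-group deciles with -xtile-, which does not itself take a by() option. The toy data and the variable names year, industry, cash, group, and decile are assumptions for illustration; the loop simply runs -xtile- once per year-industry cell.

```stata
* Toy data (hypothetical): 3 years x 4 industries x 50 firms
clear
set seed 12345                              // arbitrary seed for the toy data
set obs 600
gen year = 2000 + mod(_n - 1, 3)
gen industry = 1 + mod(ceil(_n / 3), 4)
gen cash = 100000 * runiform()

* One group id per year-industry cell, skipping observations with missing cash
egen long group = group(year industry) if !missing(cash)
gen byte decile = .

* Run official xtile separately within each cell
quietly levelsof group, local(groups)
foreach g of local groups {
    tempvar d
    quietly xtile `d' = cash if group == `g', nquantiles(10)
    quietly replace decile = `d' if group == `g'
    drop `d'
}
tabulate decile
```

With many cells the loop can be slow, which is one reason a by()-aware egen function is convenient.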


    • #3
      The -xtile()- function from the -egenmore- package (SSC) will do the trick.

      To install the -egenmore- package type
      Code:
      ssc install egenmore
      from within a web-aware Stata.


      • #4
        I have just installed egenmore. How do I use it? Sorry, I am quite new and do not really understand it. Could you elaborate?


        • #5
          I assume that your year, industry, and cash variables are named year, industry, and cash, respectively.
          This code will give you deciles of cash within each combination of year and industry:

          Code:
          egen deciles = xtile(cash), by(year industry) nq(10)



          • #6
            Thank you very much!


            • #7
              Christika:
              as you did not provide any excerpt/example of your data (in future, please do so using -dataex-), I hope that the following toy example (which relies on the user-written package -egenmore-) will be helpful:
              Code:
              . set obs 48
              
              . g id=_n
              
              . expand 20
              
              . bysort id: g year=_n
              
              . g cash=runiform()*100000
              
              . egen deciles_firm=xtile( cash ), by( id ) nq(10)
              
              . egen deciles_year=xtile( cash ), by( year ) nq(10)
              
              . tab deciles_firm
              
              deciles_fir |
                        m |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                        1 |         96       10.00       10.00
                        2 |         96       10.00       20.00
                        3 |         96       10.00       30.00
                        4 |         96       10.00       40.00
                        5 |         96       10.00       50.00
                        6 |         96       10.00       60.00
                        7 |         96       10.00       70.00
                        8 |         96       10.00       80.00
                        9 |         96       10.00       90.00
                       10 |         96       10.00      100.00
              ------------+-----------------------------------
                    Total |        960      100.00
              
              . tab deciles_year
              
              deciles_yea |
                        r |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                        1 |        100       10.42       10.42
                        2 |        100       10.42       20.83
                        3 |        100       10.42       31.25
                        4 |        100       10.42       41.67
                        5 |         80        8.33       50.00
                        6 |        100       10.42       60.42
                        7 |        100       10.42       70.83
                        8 |        100       10.42       81.25
                        9 |        100       10.42       91.67
                       10 |         80        8.33      100.00
              ------------+-----------------------------------
                    Total |        960      100.00
              PS: my post crossed with Andrea's helpful reply, which takes a different approach to Christika's query.
              Kind regards,
              Carlo
              (Stata 19.0)


              • #8
                Carlo Lazzaro didn't set the seed, but I can reproduce his results with this script in Stata 15.1.

                Code:
                clear
                set obs 48
                g id=_n
                expand 20
                bysort id: g year=_n
                g cash=runiform()*100000
                egen deciles_firm=xtile( cash ), by( id ) nq(10)
                egen deciles_year=xtile( cash ), by( year ) nq(10)
                tab deciles_firm
                tab deciles_year


                I was puzzled a little by the last table, but a closer look makes all clear and makes a small but important point about this kind of binning.

                Carlo has 20 years with 48 observations in each. His random uniform deviates all turn out to be distinct (checked using distinct (Stata Journal)), so when binning by year, why aren't there equal numbers of each bin label 1 ... 10?

                Code:
                . distinct cash
                
                -----------------------------
                      |     total   distinct
                ------+----------------------
                 cash |       960        960
                -----------------------------


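                The answer is that 48 is not a multiple of 10. Each binning of 48 distinct values into 10 bins, ideally of equal size, can at best yield 8 bins with 5 values and 2 bins with 4 (8 x 5 + 2 x 4 = 48). What is true of each year is true of all: over 20 years, that gives 8 bins with 100 observations and 2 bins with 80, as in the table.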
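
A minimal sketch of that arithmetic for a single year of 48 distinct values. It uses official -xtile- rather than the egen function, on the assumption that the two bin identically here; the seed and variable names are arbitrary.

```stata
* One toy "year": 48 random uniform values (distinct in practice)
clear
set seed 2718                  // arbitrary seed
set obs 48
gen cash = runiform()

* Ten quantile bins from 48 values: sizes cannot all be equal
xtile bin = cash, nquantiles(10)
tabulate bin
```

The tabulation should show bins of 5 except for two bins of 4, matching the 100/80 pattern in Carlo's table once it is repeated over 20 years.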

                FWIW, such binning seems oversold to me. If you want to record relative position as well as absolute value, why not use some variant on rank / sample size (percentile rank, in one terminology)?
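
As a rough sketch of that alternative, here is one common variant, (rank - 0.5) / sample size, computed within year on toy data like Carlo's. The variable names rank_cash, n_cash, and pcrank are made up for illustration; for the original question the by() option would be by(year industry).

```stata
* Toy data as in the script above: 48 firms, 20 years
clear
set seed 2718                  // arbitrary seed
set obs 48
gen id = _n
expand 20
bysort id: gen year = _n
gen cash = 100000 * runiform()

* Rank within year, then scale by the number of non-missing values
egen double rank_cash = rank(cash), by(year)
egen long n_cash = count(cash), by(year)
gen double pcrank = (rank_cash - 0.5) / n_cash
summarize pcrank
```

The result lies strictly between 0 and 1 and keeps the full ordering within each group, rather than collapsing it to ten labels.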

                More at

                https://www.stata-journal.com/articl...article=dm0095

                https://www.stata-journal.com/sjpdf....iclenum=pr0054

                https://www.stata.com/support/faqs/s...ing-positions/
