  • Median, Tercile and Quartile

    Hi, I want to create dummy variables based on the median. Sometimes I do the same based on terciles and quartiles. I do the following:

    1. This will create a yearly median variable and median dummy

    egen ana= xtile(num), n(2) by(fyear)
    gen num_dummy = cond(missing(ana), ., (ana>1))

    2. This will create a yearly tercile variable and a dummy based on top tercile
    egen ana= xtile(num), n(3) by(fyear)
    gen num_dummy = cond(missing(ana), ., (ana>2))

    3. This will create a yearly quartile variable and a dummy based on top quartile
    egen ana= xtile(num), n(4) by(fyear)
    gen num_dummy = cond(missing(ana), ., (ana>3))

    I have checked that this code gives the same answers as a slightly different calculation, but I would like an expert to confirm that I am doing this correctly.
    So, experts, could you please tell me whether this is right?

  • #2
    The code you show appears to be correct for your stated purpose, assuming that you are attempting to identify observations in the top half (resp. top third, top quarter) of the values of num, separately within each fyear. Note also that -xtile()- is not an official Stata -egen- function, so this code will only run if you have installed the -egenmore- package (available from SSC). Subject to those disclaimers, the code is correct.
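
    For readers who cannot install -egenmore- (it is obtained with -ssc install egenmore-), here is a minimal sketch of the quartile case using only the official -xtile- command, looped over years. It is an illustration under assumptions, not the original poster's method: it assumes num and fyear are numeric and that every fyear has at least one nonmissing num, and the names ana_q4 and top_q4 are hypothetical, chosen so they do not clash with the ana and num_dummy variables above. Ties at the cut points may be handled slightly differently from -egen, xtile()-, so results are worth comparing on your own data.

    ```stata
    * Sketch only: assumes numeric num and fyear; ana_q4 and top_q4 are hypothetical names
    gen byte ana_q4 = .
    levelsof fyear, local(years)
    foreach y of local years {
        tempvar q
        * quartile group (1-4) of num within the current fiscal year
        xtile `q' = num if fyear == `y', nquantiles(4)
        replace ana_q4 = `q' if fyear == `y'
        drop `q'
    }
    * 1 for the top quartile, 0 otherwise, missing where the quartile is missing
    gen byte top_q4 = (ana_q4 == 4) if !missing(ana_q4)
    ```

    Changing nquantiles(4) to nquantiles(3) or nquantiles(2), and the comparison value to match, would give the tercile and median versions. The three blocks in the question reuse the names ana and num_dummy, so running them one after another in the same dataset would stop with a "variable already defined" error unless the earlier variables are dropped or given different names.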


    • #3
      Clyde, thank you very much for your help.


      • #4
        See https://stats.stackexchange.com/ques...half-a-percent for a collection of such terms. Tertile seems to be more common than tercile.
