  • Bucketing observations based on a variable value in a particular year in panel data

    Hello

    I have a panel dataset of firms observed in the years 2005 - 2013. I would like to place firms into ten buckets based on their incomegrowth in 2009, and then tabulate (or, better, plot) the mean incomegrowth of each bucket for 2006 - 2013. I've used the code below to create the buckets in 2009, but how do I assign the same bucket value to all other years of the same id?
    gen growthcat2009 = .

    _pctile incomegrowth_p if year==2009, p(10(10)90)
    replace growthcat2009=10 if incomegrowth_p>r(r1) & year==2009
    replace growthcat2009=20 if incomegrowth_p>r(r2) & year==2009
    replace growthcat2009=30 if incomegrowth_p>r(r3) & year==2009
    replace growthcat2009=40 if incomegrowth_p>r(r4) & year==2009
    replace growthcat2009=50 if incomegrowth_p>r(r5) & year==2009
    replace growthcat2009=60 if incomegrowth_p>r(r6) & year==2009
    replace growthcat2009=70 if incomegrowth_p>r(r7) & year==2009
    replace growthcat2009=80 if incomegrowth_p>r(r8) & year==2009
    replace growthcat2009=90 if incomegrowth_p>r(r9) & year==2009


    Thank you
    Tamara

  • #2
    This nonsensical example may help:

    Code:
    clear all
    set more off
    
    *----- example data -----
    
    sysuse bplong
    list in 1/20
    
    *----- what you want -----
    
    * create variable containing decile categories (ten groups) for -when == 1-
    xtile bpcat = bp if when == 1, nquantiles(10)
    
    * check that there's only one non-missing value per patient
    bysort patient (bpcat): assert bpcat[1] != . & bpcat[2] == .
    
    * assign same group category for -when != 1-
    by patient: gen bpcat2 = bpcat[1]
    
    list in 1/20, sepby(patient)
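
    Applied to the firm panel in the question, the same idea might look like the sketch below. It assumes the variables are named id, year and incomegrowth, and it runs on simulated data that is made up purely so the code is self-contained. The collapse and xtline steps are one possible way to get the plot of bucket means, not something taken from the posts in this thread.

    ```stata
    clear all
    set more off
    set seed 12345

    *----- simulated panel (hypothetical data): 200 firms, 2005-2013 -----
    set obs 200
    gen id = _n
    expand 9
    bysort id: gen year = 2004 + _n
    gen incomegrowth = rnormal()

    *----- buckets from 2009, copied to all years -----

    * deciles of 2009 growth: 1 = lowest, 10 = highest; missing in other years
    xtile growthcat2009 = incomegrowth if year == 2009, nquantiles(10)

    * missing values sort last, so [1] is the firm's 2009 decile
    bysort id (growthcat2009): replace growthcat2009 = growthcat2009[1]

    *----- mean growth per bucket and year, plotted -----
    preserve
    keep if inrange(year, 2006, 2013)
    collapse (mean) incomegrowth, by(growthcat2009 year)
    xtset growthcat2009 year
    xtline incomegrowth, overlay
    restore
    ```

    Using xtile also replaces the chain of replace statements in the question, which leaves firms at or below the first cutoff with a missing bucket.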
    
    You should:

    1. Read the FAQ carefully.

    2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

    3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

    4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.



    • #3
      Thank you, Roberto!!! This is super-helpful!
      One question - my panel is unbalanced. How do I keep only the ids that had a non-missing growth observation in 2009?



      • #4
        Tamara,

        You're welcome.

        I think you want to drop all observations of any id whose 2009 value is missing.

        One way (and example) is:

        Code:
        clear all
        set more off
        
        *----- example data -----
        
        webuse nlswork
        keep idcode year age
        
        replace age = . in 4
        replace age = . in 30
        
        list if inrange(idcode,1,3), sepby(idcode)
        
        *----- what you want -----
        
        bysort idcode: gen flag = (year == 73 & missing(age))
        bysort idcode (flag): drop if flag[_N]
        
        list if inrange(idcode,1,3), sepby(idcode)
        
        The condition (year == 73 & missing(age)) is either true (1) or false (0),
        so it generates a 0/1 indicator variable. The second line then sorts that
        indicator within each idcode group. If a group contains a 1, it is sorted to
        the last place, so flag[_N] is 1 for every observation of that group and the
        whole group is dropped.

        To keep or drop observations, see help keep and help drop. See also help subscripting,
        help by and

        Cox, Nicholas J. “Speaking Stata: How to Move Step by: Step.”
        Stata Journal 2, no. 1 (2002): 86–102.

        (The bysort idcode: prefix on the first line can probably be omitted, since the flag expression does not depend on the group.)
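
        One caveat for an unbalanced panel: the flag above is 1 only when a year-73 row exists and age is missing in it, so an idcode with no year-73 row at all is kept. A minimal sketch of a variant that keeps only ids with a non-missing value in the reference year is below. The toy data and the names id, growth and has2009 are made up for illustration.

        ```stata
        clear all

        *----- toy data (hypothetical) -----
        input id year growth
        1 2008 .10
        1 2009 .20
        1 2010 .30
        2 2008 .10
        2 2009 .
        2 2010 .30
        3 2008 .10
        3 2010 .30
        end

        *----- keep ids with a non-missing 2009 value -----

        * 1 on every row of an id that has a non-missing 2009 value, else 0
        bysort id: egen byte has2009 = max(year == 2009 & !missing(growth))
        keep if has2009

        list, sepby(id)
        ```

        Here only id 1 survives: id 2 has a missing 2009 value and id 3 has no 2009 row.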
        Last edited by Roberto Ferrer; 04 Jul 2014, 13:45.


        • #5
          Thank you so much, Roberto! All works now!! And all the references look v useful too. Will have a read of "Speaking stata..." now :-)
