  • Grouping individual years into year groups in panel data

    Hello
    I have data for more than 1000 municipalities in Colombia, and for each, 20 years of observations. I was wondering how I could group the years. I mean, instead of having 1990, 1991, 1992... 2018 to have 1990-1992, 1993-1995, 1996-1998 and so on. Then I want to add the values of the variables in each group of years. The code I am using is

    gen group=.
    replace group=19901992 if year>=1990 & year<=1992
    replace group=19931995 if year>=1993 & year<=1995
    replace group=19951998 if year>=1996 & year<=1998
    replace group=19982001 if year>=1990 & year<=2001
    bysort group: egen sum_var=sum(var)

    for each of the variables, but in the end I get the same result for every observation rather than a separate sum for each year group.

    Thank you

  • #2
    So, a little more generally, how to bin integers using width 3, or indeed any integer width?

    I wouldn't use such identifiers as in your example, as being awkward to type for one, but just the start or end year (or if preferred the middle year if bin length is odd).

    School arithmetic identifies 1989 as divisible by 3 because the sum of its digits is so divisible. Therefore with a translation to the left, division, flooring, multiplication, and a translation back to the right, we can get this:

    Code:
    clear
    set obs 15
    gen year = 1989 + _n
    
    gen wanted = 3 * floor((year - 1)/3) + 1
    
    tab year wanted
    
               |                         wanted
          year |      1990       1993       1996       1999       2002 |     Total
    -----------+-------------------------------------------------------+----------
          1990 |         1          0          0          0          0 |         1
          1991 |         1          0          0          0          0 |         1
          1992 |         1          0          0          0          0 |         1
          1993 |         0          1          0          0          0 |         1
          1994 |         0          1          0          0          0 |         1
          1995 |         0          1          0          0          0 |         1
          1996 |         0          0          1          0          0 |         1
          1997 |         0          0          1          0          0 |         1
          1998 |         0          0          1          0          0 |         1
          1999 |         0          0          0          1          0 |         1
          2000 |         0          0          0          1          0 |         1
          2001 |         0          0          0          1          0 |         1
          2002 |         0          0          0          0          1 |         1
          2003 |         0          0          0          0          1 |         1
          2004 |         0          0          0          0          1 |         1
    -----------+-------------------------------------------------------+----------
         Total |         3          3          3          3          3 |        15
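
    The same recipe should carry over to any integer width and any starting year. What follows is only a sketch under assumptions: the locals width and start and the variable names bin_start and bin_end are made up for illustration and are not part of the example above. With width 3 and start 1990 it gives the same bins as the variable wanted.

    ```stata
    clear
    set obs 15
    gen year = 1989 + _n

    * hypothetical choices for illustration: bins of width 5 starting in 1990
    local width = 5
    local start = 1990

    * shift so that the first bin starts at 0, floor to a multiple of the width, shift back
    gen bin_start = `width' * floor((year - `start') / `width') + `start'

    * last year of each bin, in case the end year is preferred as identifier
    gen bin_end = bin_start + `width' - 1

    tabulate year bin_start
    ```

    Run this from a do-file (or in one go) so that the local macros are still defined when the gen line uses them.
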

    Although it often seems implied that binning is something you just do, I couldn't find much by way of a systematic treatment and ended up trying to write one myself (in fact two):

    Code:
    . search binning, sj
    
    Search of official help files, FAQs, Examples, SJs, and STBs
    
    SJ-18-3 dm0095  . . . . . . . . . . . Speaking Stata: From rounding to binning
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q3/18   SJ 18(3):741--754                                (no commands)
            basic review of how to bin variables in Stata, meaning how to
            divide their range or support into disjoint intervals
    
    SJ-18-1 gr0072  . . . . . . . Speaking Stata: Logarithmic binning and labeling
            (help niceloglabels)  . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q1/18   SJ 18(1):262--286
            introduces the niceloglabels command for helping (even automating)
            label choice

    • #3
      Welcome to Statalist.

      I think that after creating the group variable, you want to use the collapse command to create one observation for each combination of municipality and group. Something like
      Code:
      collapse (sum) varlist , by(municipality group)
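
      To make that concrete, here is a minimal sketch on made-up data. The variable names municipality, var1, period and sum_var1 are assumptions for illustration, and the period formula is the one from #2. Two things in the code in #1 look like possible reasons for identical results: the last replace tests year>=1990, so it overwrites every earlier group, and bysort group: sums over all municipalities together rather than within each one.

      ```stata
      clear
      set obs 12
      * two hypothetical municipalities, each observed 1990-1995, with made-up values
      gen municipality = ceil(_n/6)
      gen year = 1990 + mod(_n - 1, 6)
      gen var1 = _n

      * three-year periods identified by their first year (formula from #2)
      gen period = 3 * floor((year - 1)/3) + 1

      * Option A: keep every observation and attach the period total to each one
      bysort municipality period: egen sum_var1 = total(var1)

      * Option B: reduce to one observation per municipality and period
      collapse (sum) var1, by(municipality period)
      list, sepby(municipality)
      ```

      Option A keeps the yearly panel intact, while Option B replaces the data in memory, so save your dataset first or wrap the collapse in preserve and restore.
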
