  • Collapse and generate average values

    Dear,

    I want to collapse the years into periods of, say, two years each. For example, if I have a panel from 1990 to 1995, I want period I = 1990-1991, period II = 1992-1993, and period III = 1993 & 1994. Within each period, I will then calculate the average (mean) or identify the mode for each variable.

    How can I do that in Stata?

    Let me give an example with the following made-up dataset:

    input str10 country year x y
    "Indonesia" 1990 24 36
    "Indonesia" 1991 28 22
    "Korea" 1990 38 27
    "Korea" 1991 42 73
    "China" 1990 124 458
    "China" 1991 12 24
    end

    I want to collapse 1990 and 1991 into just one period, so the x-value for Indonesia would be (24 + 28)/2 and the corresponding y-value (36 + 22)/2.

    Now, imagine I also have rows for the years 1992-2021. That means I can't just run "collapse varlist, by(country)". I first need to create a variable called "period" that equals 1 for 1990 & 1991, 2 for 1992 & 1993, and so on.

    I could do that manually one by one, but there must be a smart way to do this in Stata.

    Is it possible to get help?

    Best,

  • #2
    To create the variable "period":
    Code:
    * first year of each two-year period (1990, 1992, ...)
    gen period_start = year - mod(year, 2)
    * number the periods 1, 2, 3, ...
    egen period = group(period_start)


    • #3
      I take it you want 1990-91, 1992-93, 1994-95, not as posted.

      If the year is odd (leaves remainder 1 on division by 2), you can bump it down (or, if the year is even, you can bump it up).

      Code:
      clear 
      
      input str10 country year x y
      "Indonesia" 1990 24 36
      "Indonesia" 1991 28 22
      "Korea" 1990 38 27
      "Korea" 1991 42 73
      "China" 1990 124 458
      "China" 1991 12 24
      end
      
      replace year = year - 1 if mod(year, 2)
      
      collapse x y, by(country year)
      
      list 
      
           +-----------------------------+
           |   country   year    x     y |
           |-----------------------------|
        1. |     China   1990   68   241 |
        2. | Indonesia   1990   26    29 |
        3. |     Korea   1990   40    50 |
           +-----------------------------+
      I can't see that the mode of two values is well defined unless they are identical. So, collapse by default calculates means, and that default is used here.
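
      If you would rather leave year untouched and have a period index running 1, 2, 3, ... as asked in #1, one possible sketch follows. It assumes the panel starts in 1990 and that each period is 2 years wide; the variable name period and the generated years 1990 to 1995 are only for illustration.

      ```stata
      * Illustration only: six years, 1990 to 1995, as in the panel described in #1
      clear
      set obs 6
      generate year = 1989 + _n

      * 1990 is the assumed first year and 2 the assumed period width
      * period is 1 for 1990-91, 2 for 1992-93, 3 for 1994-95
      generate period = floor((year - 1990) / 2) + 1

      list year period, sepby(period)

      * With the real data the period means would then come from
      * collapse (mean) x y, by(country period)
      ```

      Changing the divisor to 3 or 5 should give three-year or five-year periods in the same way.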


      • #4
        Otherwise put, in general the mode of two values can only be estimated as the mean (in the absence of other considerations).


        • #5
          I appreciate both of you. I am still confused about the first command, shown below, but I will figure it out!
          replace year = year - 1 if mod(year, 2)

          Comment


          • #6
            The command,
            Code:
            replace year = year - 1 if mod(year,2)
            tells Stata to subtract 1 from odd years, i.e., years that when divided by 2 leave a remainder of 1.
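
            A quick way to see this, as a small sketch: mod(year, 2) returns 0 for an even year and 1 for an odd year, and Stata treats a nonzero value in an if qualifier as true, so the condition selects odd years only.

            ```stata
            * remainder after dividing by 2: 0 for an even year, 1 for an odd year
            display mod(1990, 2)
            display mod(1991, 2)
            ```

            So 1991 becomes 1990, while 1990 is left alone.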
