  • How do I expand period data into data by year?

    Hello, relatively new user here. I am using Stata for my university class and have a base level of proficiency with it, and I have taught myself a good amount of what I have needed so far from this forum.

    I am trying to merge two datasets for a regression. The problem is that one dataset is a panel by country/year while the other is by country/period, where each period covers 5 years. What I am trying to do is duplicate each period observation 5 times, and then either rename the period variable or add a year variable so that each duplicate represents one year from the period. For example, for period one, which is 1960-1964, I want to duplicate the period one observation 5 times so that there is one copy each for 1960, 1961, etc. This way I can merge it smoothly with my other dataset. I have done some research and found the 'expand' command, but I could find no way to give each duplicate a different value, which makes it difficult to assign each one a unique year efficiently.

    Any help would be appreciated, and apologies if this question has already been answered. I did my best to search but may not have used the right keywords.

  • #2
    Instead of duplicating the country/period data into 5 yearly copies, in your country/year dataset calculate a new "period" variable that is - following your description - 1 for years 1960-1964, 2 for 1965-1969, etc.

    So something like
    Code:
    use country_year_data
    generate period = 1+floor((year-1960)/5)
    merge m:1 country period using country_period_data
    where the floor() function rounds down a fraction to an integer.
    Last edited by William Lisowski; 13 May 2022, 15:53.
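
    For anyone who does want the expand route asked about in #1, here is a minimal sketch, not tested against the real data. It assumes period 1 starts in 1960, that country and period uniquely identify observations in the period file, and it reuses the placeholder file names from the code above.

    Code:
    use country_period_data, clear
    * replace each country/period observation with 5 identical copies
    expand 5
    * number the copies 1-5 within each country/period and turn that into a year
    bysort country period: generate year = 1960 + 5*(period - 1) + (_n - 1)
    merge 1:1 country year using country_year_data

    Both routes should pair the same observations; computing period in the yearly file simply avoids duplicating rows.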


    • #3
      That makes total sense. I definitely got too focused on solving the problem in a much more complicated way. This was incredibly helpful, and I am really grateful for your response and the time you took!


      • #4
        @William Lisowski's theme is expanded upon in https://www.stata-journal.com/articl...article=dm0095


        Here's another way to do it:

        Code:
        . clear
        
        . set obs 10
        Number of observations (_N) was 0, now 10.
        
        . gen year = 1959 + _n
        
        . gen period = ceil((year - 1959) / 5)
        
        . l
        
             +---------------+
             | year   period |
             |---------------|
          1. | 1960        1 |
          2. | 1961        1 |
          3. | 1962        1 |
          4. | 1963        1 |
          5. | 1964        1 |
             |---------------|
          6. | 1965        2 |
          7. | 1966        2 |
          8. | 1967        2 |
          9. | 1968        2 |
         10. | 1969        2 |
             +---------------+
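
        As a quick sanity check, the ceil() rule above and the floor() rule in #2 should give the same period for every integer year. A minimal sketch on made-up data (60 years starting in 1960; the variable names period_floor and period_ceil are only for illustration):

        Code:
        clear
        set obs 60
        generate year = 1959 + _n
        * rule from #2
        generate period_floor = 1 + floor((year - 1960)/5)
        * rule from this post
        generate period_ceil = ceil((year - 1959)/5)
        * stops with an error if the two rules ever disagree
        assert period_floor == period_ceil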
