  • Efficient command to define Status by id and year

    Hi Statalists,

    I have a quick question about writing an efficient command to define the status of each individual every year. The hypothetical data I created looks like the following:

    Code:
    id    year    status
    1      19       1
    1      20       1
    1      21       1
    2      19       0
    2      20       1
    2      21       1
    3      19       0
    3      20       0
    3      21       1
    4      19       0
    4      20       0
    4      21       0
    5      19       1
    5      20       1
    5      21       1
    6      20       1
    6      21       1
    7      19       0
    7      21       1
    Basically, I want to convert the data from long form to wide form by defining three new columns: stat19, stat20, and stat21. I am aware of the "reshape" function, but my colleague prefers to keep it this way. I can successfully accomplish what I want by using helper columns. The code I use is as follows:

    Code:
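    * Flag the year-19 row: 1 if status is 1, 0 if status is 0
    gen helper1 = 1 if (year == 19 & status == 1)
    replace helper1 = 0 if (year == 19 & status == 0)
    * Spread the year-19 value to every row of the same id
    bysort id: egen stat19 = max(helper1)
    * ids with no year-19 row (e.g. id 6) get 0
    replace stat19 = 0 if stat19 == .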
    
    id    year    status    helper1    stat19
    1      19       1          1         1
    1      20       1          .         1
    1      21       1          .         1
    2      19       0          0         0
    2      20       1          .         0
    2      21       1          .         0
    3      19       0          0         0
    3      20       0          .         0
    3      21       1          .         0
    4      19       0          0         0
    4      20       0          .         0
    4      21       0          .         0
    5      19       1          1         1
    5      20       1          .         1
    5      21       1          .         1
    6      20       1          .         0
    6      21       1          .         0
    7      19       0          0         0
    7      21       1          .         0
    As you can see, my code takes several lines for a single column, and I need to repeat it for each of the other columns (stat20 and stat21).
    QUESTION: I am wondering if there is a better and more effective way to achieve the same goal. I tried using "egen", but I couldn't think of any condition that would satisfy my case. On the other hand, the "gen" function doesn't allow me to use the "by" argument. Another condition I have is that I cannot install any packages, as the server at my university doesn't allow me to do so.

    Would it be possible to complete this task by using just a few lines of code? I would appreciate any suggestions the community may have.

    Thank you so much for your help!

  • #2
    Code:
    levelsof year, local(years)
    foreach year of local years{
        bysort id: egen stat`year'= max(year==`year' & status)
    }
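The loop above relies on two things: a logical expression in Stata evaluates to 1 when true and 0 when false, and max() in -egen- returns the largest value within each id. The sketch below re-enters the data from #1 so it can be run on its own in a do-file. It is a variant for illustration, not a replacement for the code above. Two details are illustrative choices rather than part of the original answer: the loop variable is named y instead of year, and the test is written as status == 1, because a bare status would also count a missing status as true (Stata treats any nonzero value, including missing, as true).

```stata
clear
input id year status
1 19 1
1 20 1
1 21 1
2 19 0
2 20 1
2 21 1
3 19 0
3 20 0
3 21 1
4 19 0
4 20 0
4 21 0
5 19 1
5 20 1
5 21 1
6 20 1
6 21 1
7 19 0
7 21 1
end

* Collect the distinct values of year (19 20 21) in the local macro years
levelsof year, local(years)

* One indicator per year: 1 if the id has status 1 in that year, 0 otherwise
foreach y of local years {
    bysort id: egen byte stat`y' = max(year == `y' & status == 1)
}

sort id year
list, sepby(id) noobs
```

An id with no row for a given year, such as id 6 in year 19, gets 0 for that year, which matches the stat19 column shown in #1.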


    • #3
      This is tangential, but:
      On the other hand, the "gen" function doesn't allow me to use the "by" argument.
      I'm not sure what you mean by this since -gen- is a command, not a function, and -by- will either be a prefix or an option in a command, but not an argument. But if you are trying to say that the -gen- command does not support the -by:- prefix that is simply not true. If you have encountered an instance where that is not allowed it is almost certainly a bug and you should post back showing exactly the code you ran and the output that Stata produced (including any error messages).
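To illustrate the point that -gen- accepts the -by:- prefix, here is a minimal sketch that builds stat19 without -egen-. It uses a subset of the data from #1 (ids 1, 2, and 6) so it runs on its own. The helper variable name hit19 is hypothetical, and this is only one possible way to do it, assuming status is coded 0/1.

```stata
clear
input id year status
1 19 1
1 20 1
1 21 1
2 19 0
2 20 1
2 21 1
6 20 1
6 21 1
end

* 1 on a row where year is 19 and status is 1, 0 on every other row
generate byte hit19 = (year == 19 & status == 1)

* -by:- prefix with -generate-: within each id, sort on hit19 so the
* largest value comes last, then copy that last value to every row
bysort id (hit19): generate byte stat19 = hit19[_N]

* Drop the helper and restore the original row order
drop hit19
sort id year
list, sepby(id) noobs
```

Under -by:-, the subscript [_N] refers to the last observation of each group, so sorting on hit19 within id makes hit19[_N] the group maximum.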


      • #4
        Andrew Musau Thank you so much! The code works perfectly. It turned out to be the simplest solution that I had not considered.


        • #5
          Clyde Schechter Thank you for catching that mistake. For some reason, when I tried the -gen- command with the -by:- prefix, it did not work as expected. Let me try this again and see if I still get the error.
