
  • Adding content from two variables to one variable but to specific time points within that variable

    Hello All,

    My data are in long format. Each participant has 22 survey cycles. I have two variables that I want to combine into one variable. The first variable has data from survey cycle 8 to 11 and the 2nd one has data from survey cycle 15 to 18. In the new variable, I want the information from survey cycles 8-11 to be written for survey cycle 12, and the information from survey cycles 15-18 to be written for survey cycle 19. See the data example below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte sc float(meanpriormpa12 meanpriormpa19 meanpriormpa)
     1     .  . .
     2     .  . .
     3     .  . .
     4     .  . .
     5     .  . .
     6     .  . .
     7     .  . .
     8 15.75  . .
     9 15.75  . .
    10 15.75  . .
    11 15.75  . .
    12     .  . .
    13     .  . .
    14     .  . .
    15     . 10 .
    16     . 10 .
    17     . 10 .
    18     . 10 .
    19     .  . .
    end
    For this first participant, I want 15.75 to be written under meanpriormpa for sc 12, and 10 to be written for sc 19. I have succeeded at doing this by reshaping my file to wide, calculating the mean variables I want, recoding sc to 12 or 19, and then merging the result back into the original file. To me this seems like a very long road to get what I want. Does anyone have a quicker approach?

    Thank you
    Patrick

  • #2
    First, this entire enterprise is doomed unless you are certain that the values of meanpriormpa12 are always the same for sc = 8, 9, 10, and 11, and similarly that the values of meanpriormpa19 are always the same for sc = 15, 16, 17, and 18. I will not assume that meanpriormpa12 is always missing when sc is not between 8 and 11 (nor the analogous condition for meanpriormpa19), although I could simplify the code if that assumption holds.

    You also refer to a "first participant" but there is nothing in your data that distinguishes one participant from another. I'll assume that your example is just the data from one participant and that you forgot to include the id variable when you ran -dataex-.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(id sc) float meanpriormpa12 byte(meanpriormpa19 meanpriormpa)
    1  1     .  . .
    1  2     .  . .
    1  3     .  . .
    1  4     .  . .
    1  5     .  . .
    1  6     .  . .
    1  7     .  . .
    1  8 15.75  . .
    1  9 15.75  . .
    1 10 15.75  . .
    1 11 15.75  . .
    1 12     .  . .
    1 13     .  . .
    1 14     .  . .
    1 15     . 10 .
    1 16     . 10 .
    1 17     . 10 .
    1 18     . 10 .
    1 19     .  . .
    end
    
    //    DEAL WITH 8 THROUGH 11 FIRST
    by id, sort: egen value = max(cond(sc == 8, meanpriormpa12, .))
    forvalues i = 9/11 {
        assert meanpriormpa12 == value if sc == `i'
    }
    replace meanpriormpa = value if sc == 12
    drop value
    
    //    DO 15 THROUGH 18 NEXT
    by id, sort: egen value = max(cond(sc == 15, meanpriormpa19, .))
    forvalues i = 16/18 {
        assert meanpriormpa19 == value if sc == `i'
    }
    replace meanpriormpa = value if sc == 19
    drop value



    • #3
      Hello Clyde,

      I do have an ID variable.
      To create meanpriormpa12 and meanpriormpa19, I used the following code:
      Code:
      bysort new_pin: egen meanpriormpa12= mean(mpa) if sc<=11 & sc>=8
      bysort new_pin: egen meanpriormpa19= mean(mpa) if sc<=18 & sc>=15
      mpa is measured at every cycle, but obviously participants may be absent. As far as I remember, egen computes the average based on available data. meanpriormpa12 and meanpriormpa19 will have data only on the scs to which they were restricted. Seeing your code, I am wondering if I could have computed the meanpriormpa variable directly in fewer steps.
      Can I directly generate a variable that includes averages of two sets of 4 different survey cycles? Can I directly add the results of the average computation to sc 12 and sc 19?

      Thank you for your help!



      • #4
        Sure.

        Code:
        by new_pin, sort: egen value = mean(cond(inrange(sc, 8, 11), mpa, .))
        replace meanpriormpa = value if sc == 12
        drop value
        
        by new_pin: egen value = mean(cond(inrange(sc, 15, 18), mpa, .))
        replace meanpriormpa = value if sc == 19
        drop value
        Note: Not tested. Beware of typos, mismatched parentheses, etc.
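
        The two blocks above differ only in the source window and the target cycle, so they could probably be folded into one loop. What follows is a minimal sketch on toy data, not a tested solution for the real file: the values of mpa are made up for illustration, the local macro names (target, first, last) are hypothetical, and it assumes each target cycle takes the mean of the four cycles immediately before it.

        ```stata
        * Sketch only: one toy participant with 22 cycles and hypothetical mpa values
        clear
        set obs 22
        generate byte new_pin = 1
        generate byte sc = _n
        generate float mpa = sc              // illustrative values, not real data
        generate float meanpriormpa = .

        foreach target of numlist 12 19 {
            local first = `target' - 4       // 8 for target 12, 15 for target 19
            local last  = `target' - 1       // 11 for target 12, 18 for target 19
            * Within-participant mean of mpa over the window, ignoring missing values
            by new_pin, sort: egen value = mean(cond(inrange(sc, `first', `last'), mpa, .))
            replace meanpriormpa = value if sc == `target'
            drop value
        }

        * Checks on the toy data: mean of 8..11 is 9.5, mean of 15..18 is 16.5
        assert meanpriormpa == 9.5 if sc == 12
        assert meanpriormpa == 16.5 if sc == 19
        assert missing(meanpriormpa) if !inlist(sc, 12, 19)
        ```

        On the real data, only the foreach loop would be needed, with meanpriormpa already present as in the -dataex- example.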



        • #5
          Thank you Clyde!

