Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to calculate the average of X in between two consecutive Y with missing Y observations


    Dear All, In my GMM analysis I explain depression_score with it's lag and average income occured in between two depression measurements. As I cannot include income in average income calculation if it is occured after depression measurement, I am trying to calculate the average income in between two consecutive depression measurement rounds by individuals in panel data. However, the window in between two consecutive round for depression is not fixed and there are missing observations for depression. If depression score is missing average should treat the week of first occurrence of round as the previous depression measurement week.

    My data looks like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float ID byte week float round double income float depression_score
    1  1  1   .  .
    1  2  1   .  .
    1  3  1   .  .
    1  4  1   .  .
    1  5  1   .  .
    1  6  1   .  .
    1  7  1   .  .
    1  8  1   .  .
    1  9  1   .  .
    1 10  2   .  .
    1 11  2 1.5  .
    1 12  2   .  .
    1 13  3   .  4
    1 14  3   .  .
    1 15  3   .  .
    1 16  4   1  .
    1 17  4 1.5  0
    1 18  4   .  .
    1 19  4 1.5  .
    1 20  5   .  6
    1 21  5   .  .
    1 22  5 1.2  .
    1 23  5   0  .
    1 24  6   5  .
    end
    The code should take an average income reported in the week of current depression measurement up to the week of previous depression measurement (but not including that week).

    For example, average income is 0 for the first two waves as there is no income reported until the first depression measurement. In wave 3, average income will be calculated among week 11, 12, 13. As depression_score is missing in the previous wave 2, I restrict the window and exclude week 10 in average income of wave 3, which the first week of wave 2. For wave 4, only week 14,15 , 16 and 17 will be included in the calculation of average income. For wave 5 only week 18, 19, 20 will be included in the calculation. For wave 6, week 21,22,23,24 will be included in the calculation. The new variable will look like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float ID byte week float round double income float depression_score double avg_income
    1  1  1   .  .  .
    1  2  1   .  .  .
    1  3  1   .  .  .
    1  4  1   .  .  .
    1  5  1   .  .  .
    1  6  1   .  .  .
    1  7  1   .  .  .
    1  8  1   .  .  .
    1  9  1   .  .  .
    1 10  2   .  .  .
    1 11  2 1.5  .  .
    1 12  2   .  .  .
    1 13  3   .  4  1.5 
    1 14  3   .  .
    1 15  3   .  .
    1 16  4   1  .
    1 17  4 1.5  0  1.25
    1 18  4   .  .
    1 19  4 1.5  .
    1 20  5   .  6  1.5
    1 21  5   .  .
    1 22  5   4  .
    1 23  5   0  .
    1 24  6   5  .  3
    end
    Would rangestat command be helpful? Do you have any advice on writing conditions while taking the average of income?

    Nick Cox

    Best regards,
    Nursena

  • #2
    I used tsspell from SSC after reversing time (temporarily).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float ID byte week float round double income float depression_score
    1  1  1   .  .
    1  2  1   .  .
    1  3  1   .  .
    1  4  1   .  .
    1  5  1   .  .
    1  6  1   .  .
    1  7  1   .  .
    1  8  1   .  .
    1  9  1   .  .
    1 10  2   .  .
    1 11  2 1.5  .
    1 12  2   .  .
    1 13  3   .  4
    1 14  3   .  .
    1 15  3   .  .
    1 16  4   1  .
    1 17  4 1.5  0
    1 18  4   .  .
    1 19  4 1.5  .
    1 20  5   .  6
    1 21  5   .  .
    1 22  5 1.2  .
    1 23  5   0  .
    1 24  6   5  .
    end
    
    gen negweek = -week 
    tsset ID negweek 
    tsspell, fcond(depression_score < .)
    
    egen wanted1 = mean(income), by(ID _spell)
    gen wanted2 = wanted1 if _seq == 1 
    
    sort ID week 
    
    list ID week income depression_score wanted?, sepby(ID _spell)
    
         +----------------------------------------------------+
         | ID   week   income   depres~e    wanted1   wanted2 |
         |----------------------------------------------------|
      1. |  1      1        .          .        1.5         . |
      2. |  1      2        .          .        1.5         . |
      3. |  1      3        .          .        1.5         . |
      4. |  1      4        .          .        1.5         . |
      5. |  1      5        .          .        1.5         . |
      6. |  1      6        .          .        1.5         . |
      7. |  1      7        .          .        1.5         . |
      8. |  1      8        .          .        1.5         . |
      9. |  1      9        .          .        1.5         . |
     10. |  1     10        .          .        1.5         . |
     11. |  1     11      1.5          .        1.5         . |
     12. |  1     12        .          .        1.5         . |
     13. |  1     13        .          4        1.5       1.5 |
         |----------------------------------------------------|
     14. |  1     14        .          .       1.25         . |
     15. |  1     15        .          .       1.25         . |
     16. |  1     16        1          .       1.25         . |
     17. |  1     17      1.5          0       1.25      1.25 |
         |----------------------------------------------------|
     18. |  1     18        .          .        1.5         . |
     19. |  1     19      1.5          .        1.5         . |
     20. |  1     20        .          6        1.5       1.5 |
         |----------------------------------------------------|
     21. |  1     21        .          .   2.066667         . |
     22. |  1     22      1.2          .   2.066667         . |
     23. |  1     23        0          .   2.066667         . |
     24. |  1     24        5          .   2.066667         . |
         +----------------------------------------------------+

    Comment


    • #3
      Dear Nick,

      Thank you for your reply. I have a panel data. How can I transform your code for panel data rather than for time-series?

      I did not put the full data table in the previous post to avoid long question. Please see the full dataex below.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float ID byte week float round double income float depression_score
      1  1  1   .  .
      1  2  1   .  .
      1  3  1   .  .
      1  4  1   .  .
      1  5  1   .  .
      1  6  1   .  .
      1  7  1   .  .
      1  8  1   .  .
      1  9  1   .  .
      1 10  2   .  .
      1 11  2 1.5  .
      1 12  2   .  .
      1 13  3   .  4
      1 14  3   .  .
      1 15  3   .  .
      1 16  4   1  .
      1 17  4 1.5  0
      1 18  4   .  .
      1 19  4 1.5  .
      1 20  5   .  0
      1 21  5   .  .
      1 22  5 1.2  .
      1 23  5   0  .
      1 24  6   5  .
      1 25  6   .  .
      1 26  6   0  .
      1 27  6   2  .
      1 28  7   2  1
      1 29  7   6  .
      1 30  7   5  .
      1 31  7 4.6  .
      1 32  8   7  5
      1 33  8 2.5  .
      1 34  8   4  .
      1 35  8   4  .
      1 36  9   0  7
      1 37  9   1  .
      1 38  9   0  .
      1 39  9   1  .
      1 40 10 2.4  4
      1 41 10   0  .
      1 42 10   0  .
      1 43 10   0  .
      1 44 11   0 16
      1 45 11  .5  .
      1 46 11   0  .
      1 47 11   1  .
      1 48 11   1  .
      1 49 11   2  .
      1 50 11   3  .
      1 51 12   2  2
      1 59 12   .  .
      1 60 12   .  .
      1 61 12   .  .
      1 62 13   .  .
      1 63 13   .  .
      1 64 13   .  .
      1 65 13   .  .
      1 66 14   .  .
      1 67 14   .  .
      1 68 14   .  .
      1 69 14   .  .
      1 70 15   .  .
      1 71 15   .  .
      1 72 15   .  .
      1 73 15   .  .
      1 74 16   .  .
      1 75 16   .  .
      1 76  .   .  .
      1 77  .   .  .
      1 78 17   .  .
      2  1  1   .  .
      2  2  1   .  .
      2  3  1   .  .
      2  4  1   .  .
      2  5  1   .  .
      2  6  1   .  .
      2  7  1   .  .
      2  8  1   .  .
      2  9  1   .  .
      2 10  2   .  .
      2 11  2   .  .
      2 12  2   .  .
      2 13  3   .  .
      2 14  3   .  .
      2 15  3   .  .
      2 16  4   .  .
      2 17  4   .  .
      2 18  4   .  .
      2 19  4 6.5  .
      2 20  5 9.5 10
      2 21  5   .  .
      2 22  5  10  .
      2 23  5   .  .
      2 24  6   0  3
      2 25  6   0  .
      2 26  6   .  .
      2 27  6   2  .
      2 28  7   1 19
      2 29  7   2  .
      end


      Comment


      • #4
        I deliberately included ID in the code to accommodate panel data. Please examine it closely.

        Comment


        • #5
          Thank you, Nick! However, my output is different than yours for the rows written in red as I have more data. Please see it below. I think when I run the code on my computer, it only calculates the average in between two non-missing depression scores. But if depression score is missing in the previous wave I would like to take the first week of the missing wave as a spell limit for my calculation. So for wave 6 (where depression score is missing), average income should be calculated with week 21,22,23,24. For wave 7, average income should be calculated with week 25,26,27,28. How can I add this condition?

          Originally posted by Nick Cox View Post
          I used tsspell from SSC after reversing time (temporarily).

          Code:
          +----------------------------------------------------+
          | ID week income depres~e wanted1 wanted2 |
          |----------------------------------------------------|
          1. | 1 1 . . 1.5 . |
          2. | 1 2 . . 1.5 . |
          3. | 1 3 . . 1.5 . |
          4. | 1 4 . . 1.5 . |
          5. | 1 5 . . 1.5 . |
          6. | 1 6 . . 1.5 . |
          7. | 1 7 . . 1.5 . |
          8. | 1 8 . . 1.5 . |
          9. | 1 9 . . 1.5 . |
          10. | 1 10 . . 1.5 . |
          11. | 1 11 1.5 . 1.5 . |
          12. | 1 12 . . 1.5 . |
          13. | 1 13 . 4 1.5 1.5 |
          |----------------------------------------------------|
          14. | 1 14 . . 1.25 . |
          15. | 1 15 . . 1.25 . |
          16. | 1 16 1 . 1.25 . |
          17. | 1 17 1.5 0 1.25 1.25 |
          |----------------------------------------------------|
          18. | 1 18 . . 1.5 . |
          19. | 1 19 1.5 . 1.5 . |
          20. | 1 20 . 6 1.5 1.5 |
          |----------------------------------------------------|
          21. | 1 21 . . 2.066667 . |
          22. | 1 22 1.2 . 2.066667 . |
          23. | 1 23 0 . 2.066667 . |
          24. | 1 24 5 . 2.066667 . |
          +----------------------------------------------------+
          My output:


          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float ID byte week float wave double income float(cesd_score wanted1 wanted2)
          1  1 1   . .  1.5    .
          1  2 1   . .  1.5    .
          1  3 1   . .  1.5    .
          1  4 1   . .  1.5    .
          1  5 1   . .  1.5    .
          1  6 1   . .  1.5    .
          1  7 1   . .  1.5    .
          1  8 1   . .  1.5    .
          1  9 1   . .  1.5    .
          1 10 2   . .  1.5    .
          1 11 2 1.5 .  1.5    .
          1 12 2   . .  1.5    .
          1 13 3   . 4  1.5  1.5
          1 14 3   . . 1.25    .
          1 15 3   . . 1.25    .
          1 16 4   1 . 1.25    .
          1 17 4 1.5 0 1.25 1.25
          1 18 4   . .  1.5    .
          1 19 4 1.5 .  1.5    .
          1 20 5   . 0  1.5  1.5
          1 21 5   . .  1.7    .
          1 22 5 1.2 .  1.7    .
          1 23 5   0 .  1.7    .
          1 24 6   5 .  1.7    .
          1 25 6   . .  1.7    .
          1 26 6   0 .  1.7    .
          1 27 6   2 .  1.7    .
          1 28 7   2 1  1.7  1.7
          1 29 7   6 . 5.65    .
          1 30 7   5 . 5.65    .
          end
          Last edited by Nursena Sagir; 19 Dec 2023, 06:10.

          Comment


          • #6

            Then you need to calculate something like

            Code:
            egen wanted3 = mean(income), by(ID wave)
            and use that to fill in missings.

            Comment

            Working...
            X