Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • calculate time-average of time-varying variables for period t=2,...,T in panel dataset

    Dear all,

    I have unbalanced panel, and I need to calculate time-average of time-varying variables for period t=2,...,T. i.e. I need to exclude initial period observations when calculating the average. Also, I don't want moving average, the average should be constant for all years for the same person. Please find union dataset as example. I want to have for the first individual for all years average (25+28+31+...)/6, for the second individual for all years average (25+26+28+...)/8, etc. How can I calculate this?

    Thank you in advanced.
    Best regards,
    Aleksandra

    Code:
      
    webuse union          
    idcode    year    age    
                
    1    72    20    
    1    77    25    
    1    80    28    
    1    83    31    
    1    85    33    
    1    87    35    
    1    88    37    
                
    2    71    19    
    2    77    25    
    2    78    26    
    2    80    28    
    2    82    30    
    2    83    31    
    2    85    33    
    2    87    35    
    2    88    37

  • #2
    Code:
    by id (year), sort: gen first = _n == 1
    by id (year): egen wanted = mean(cond(!first, age, .))

    Comment


    • #3
      Aleksandra:
      do you mean something along the following lines?
      Code:
      . webuse union          
      (NLS Women 14-24 in 1968)
      . bysort idcode: egen wanted=mean(age) if _n>1
      (4,434 missing values generated)
      
      . list if idcode==1
      
             +-------------------------------------------------------------------------+
             | idcode   year   age   grade   not_smsa   south   union   black   wanted |
             |-------------------------------------------------------------------------|
          1. |      1     72    20      12          0       0       1       1        . |
          2. |      1     77    25      12          0       0       0       1     31.5 |
          3. |      1     80    28      12          0       0       1       1     31.5 |
          4. |      1     83    31      12          0       0       1       1     31.5 |
          5. |      1     85    33      12          0       0       1       1     31.5 |
             |-------------------------------------------------------------------------|
          6. |      1     87    35      12          0       0       1       1     31.5 |
          7. |      1     88    37      12          0       0       1       1     31.5 |
             +-------------------------------------------------------------------------+
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        Dear Clyde and Carlo,

        Thank you very much for immediate reponse.

        Best,
        Aleksandra

        Comment


        • #5
          Originally posted by Carlo Lazzaro View Post
          Aleksandra:
          do you mean something along the following lines?
          Code:
          . webuse union
          (NLS Women 14-24 in 1968)
          . bysort idcode: egen wanted=mean(age) if _n>1
          (4,434 missing values generated)
          
          . list if idcode==1
          
          +-------------------------------------------------------------------------+
          | idcode year age grade not_smsa south union black wanted |
          |-------------------------------------------------------------------------|
          1. | 1 72 20 12 0 0 1 1 . |
          2. | 1 77 25 12 0 0 0 1 31.5 |
          3. | 1 80 28 12 0 0 1 1 31.5 |
          4. | 1 83 31 12 0 0 1 1 31.5 |
          5. | 1 85 33 12 0 0 1 1 31.5 |
          |-------------------------------------------------------------------------|
          6. | 1 87 35 12 0 0 1 1 31.5 |
          7. | 1 88 37 12 0 0 1 1 31.5 |
          +-------------------------------------------------------------------------+
          
          .
          Dear Carlo,

          Your code appear to work only for the first idcode, for the others it gives wrong averages, i.e. it includes first period. I corrected your code. Have a look please. I don't understand how wanted1 and wanted2 are not the same, since in wanted2 the conditions are the same, the only difference is that in wanted2 there are two steps.

          Thanks.
          Best,
          Aleksandra

          Code:
                                  
          
          webuse union, clear
          bysort idcode: egen wanted1=mean(age) if _n>1
          bysort idcode: gen counter=_n
          bysort idcode: egen wanted2=mean(age) if counter>1
          
          format wanted1 wanted2 %9.1f
          list idcode year wanted1 wanted2 counter age in 8/37, sepby(idcode) noobs
                                  
          idcode    year    wanted1    wanted2    counter    age    
                                  
          2    71    29.3       .    1    19    
          2    77    29.3    30.6    2    25    
          2    78    29.3    30.6    3    26    
          2    80    29.3    30.6    4    28    
          2    82    29.3    30.6    5    30    
          2    83    29.3    30.6    6    31    
          2    85    29.3    30.6    7    33    
          2    87    29.3    30.6    8    35    
          2    88    29.3    30.6    9    37    
                                  
          3    70    32.8      .    1    24    
          3    71    32.8    33.6    2    25    
          3    72    32.8    33.6    3    26    
          3    73    32.8    33.6    4    27    
          3    77    32.8    33.6    5    31    
          3    78    32.8    33.6    6    32    
          3    80    32.8    33.6    7    34    
          3    82    32.8    33.6    8    36    
          3    83    32.8    33.6    9    37    
          3    85    32.8    33.6    10    39    
          3    87    32.8    33.6    11    41    
          3    88    32.8    33.6    12    42    
                                  
          4    70    33.9      .    1    24    
          4    71    33.9    35.1    2    25    
          4    72    33.9    35.1    3    26    
          4    80    33.9    35.1    4    34    
          4    82    33.9    35.1    5    36    
          4    83    33.9    35.1    6    37    
          4    85    33.9    35.1    7    39    
          4    87    33.9    35.1    8    41    
          4    88    33.9    35.1    9    43

          Comment


          • #6
            The use of _n and _N to denote observations within -by- groups, one of the most powerful features of Stata, is hazardous when used with -egen-. The problem is that -egen- sometimes re-sorts the data, so that _n and _N don't always correspond to what they were before -egen- runs. So, if you want to use -egen- to do something that depends on order within a by-group, you have to encode that order in a variable first. The code for wanted2 shown in #5 does that. My code in #2 does that partially, encoding the first vs other distinction, which is the only part of the ordering that matters in this particular situation.

            Comment


            • #7
              Aleksandra:
              both codes seem to do the very same job at this end:

              Code:
              . bysort idcode: egen wanted=mean(age) if _n>1
              
              . bysort idcode: gen counter=_n
              
              . bysort idcode: egen wanted2=mean(age) if counter>1
              
              . list if idcode<=2
              
                     +---------------------------------------------------------------------------------------------+
                     | idcode   year   age   grade   not_smsa   south   union   black   wanted   counter   wanted2 |
                     |---------------------------------------------------------------------------------------------|
                  1. |      1     72    20      12          0       0       1       1        .         1         . |
                  2. |      1     77    25      12          0       0       0       1     31.5         2      31.5 |
                  3. |      1     80    28      12          0       0       1       1     31.5         3      31.5 |
                  4. |      1     83    31      12          0       0       1       1     31.5         4      31.5 |
                  5. |      1     85    33      12          0       0       1       1     31.5         5      31.5 |
                     |---------------------------------------------------------------------------------------------|
                  6. |      1     87    35      12          0       0       1       1     31.5         6      31.5 |
                  7. |      1     88    37      12          0       0       1       1     31.5         7      31.5 |
                  8. |      2     71    19      12          0       0       0       1        .         1         . |
                  9. |      2     77    25      12          0       0       1       1   30.625         2    30.625 |
                 10. |      2     78    26      12          0       0       1       1   30.625         3    30.625 |
                     |---------------------------------------------------------------------------------------------|
                 11. |      2     80    28      12          0       0       1       1   30.625         4    30.625 |
                 12. |      2     82    30      12          0       0       1       1   30.625         5    30.625 |
                 13. |      2     83    31      12          0       0       1       1   30.625         6    30.625 |
                 14. |      2     85    33      12          0       0       1       1   30.625         7    30.625 |
                 15. |      2     87    35      12          0       0       1       1   30.625         8    30.625 |
                     |---------------------------------------------------------------------------------------------|
                 16. |      2     88    37      12          0       0       1       1   30.625         9    30.625 |
                     +---------------------------------------------------------------------------------------------+
              
              *Panel-specific mean for -age-, 1st observation not incuded (idcode==1)
              . di (25+28+31+33+35+37)/6
              31.5
              *Panel-specific mean for -age-, 1st observation not incuded (idcode==2)
              . di (25+26+28+30+31+33+35+37)/8
              30.625
              
              .
              Therefore, the main question is: is that the panel-specific mean you're looking for?
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment

              Working...
              X