Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sum(), total(), loop?

    Hi all,

    Click image for larger version

Name:	stata.jpg
Views:	1
Size:	34.5 KB
ID:	1668044


    the attached picture shows data for one individual (the personal identifier is the same for all rows). The first column denotes the survey year and the second column is a dummy which takes the value 1 if someone lost his or her job in the respective year.

    I would like to create a variable which sums over the past columns for each individual. So for the individual from the picture, the new variable should be . in year 1984, 1 from 1985-88, 2 from 1989-94 and 3 from 1985 onwards.

    Does anyone has an idea how to do that? ;-)

    Best,
    Kathrin

  • #2
    Code:
    bys id (year): gen wanted= sum(dummy)

    where you replace "id", "year" and "dummy" with their corresponding names in your dataset. You already have 20 posts on Statalist, so you are not new here. Review FAQ Advice #12 on how to provide data examples using the dataex command for your future posts.

    Comment


    • #3
      Hi Andrew, thank you for your answer (and sorry, I haven't checked the Advices beforehand).

      I actually have a follow-up question. I am trying to create an "event" variable, reflecting the relative time since job loss.

      Using this code

      Code:
      set         more off
      xtset         pid syear 
      
      gen         treat_year = . 
      replace     treat_year = syear if jobloss_invol == 1 
      by pid:     egen treatyear_min = min(treat_year)
      
      * Consider only the first job loss 
      cap drop     event0 
      gen         event0 = . 
      replace     event0 = syear - treatyear_min if sample == 1 
      
      cap drop     event1 
      gen         event1 = . 
      replace     event1 = event0
      
      sort         pid syear 
      forvalues i=1/20{
      by pid:        replace event1 = event1[_n-1] +1 if missing(event1) & event1[_n-1] >= 0 
      by pid:        replace event1 = event1[_n+1] -1 if missing(event1) & event1[_n+1] <= 0 
      }
      
      cap drop     event2
      gen         event2 = . 
      replace     event2 = event1 
      replace     event2 = -3 if event2 <= -4
      replace     event2 = 5 if event2 >= 6 & event2 < .
      I managed to create event0, event1 and event2 of the following table:

      Click image for larger version

Name:	stata.jpg
Views:	1
Size:	124.0 KB
ID:	1668068



      Now, I would like to create event3.

      event3 should replace event2 by a missing value, if in the respective year the individual loses the job for the second time and if the last job loss is less than 5 years ago. It should also be missing in all consecutive periods, unless there exists another job loss, which occurs at least 5 years after the last job loss has occured (e.g. event3 should take the value 0 for the job loss in 1995, as the last job loss was in 1990). Event3 should then be replaced by -1 in the year before the job loss, and should start counting again for the periods afterwards.

      In case there is a third job loss which is more than 5 years ago, event3 should again be zero in the respective year, start counting from then onwards, and replaces the year before job loss by -1.

      Still all event3 values larger than 5 should be replaced by 5 and all values smaller than -3 should be replaced by -3. The table shows an example of I want it to look like...

      It would be a great help, if you could have a look at it!

      Best,
      Katrin

      Comment


      • #4
        Copying and pasting the result of

        Code:
        dataex id year jobloss wanted event* if id==1

        will allow myself or others to attempt a solution. Also, what do you wish to accomplish? It is better to work from the final result that you want rather than an intermediate step. If you can fill this in manually for id==1, it will help to resolve your problem efficiently.
        Last edited by Andrew Musau; 06 Jun 2022, 10:01.

        Comment


        • #5
          I had to take a new example, but it follows the same principle:

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input double(pid syear) float(jobloss_invol treat_year treatyear_min event0 event1 event2 event3)
          62803 1984 .    . 1997  . -13 -3 -3
          62803 1985 .    . 1997  . -12 -3 -3
          62803 1986 .    . 1997  . -11 -3 -3
          62803 1987 .    . 1997  . -10 -3 -3
          62803 1988 .    . 1997  .  -9 -3 -3
          62803 1989 .    . 1997  .  -8 -3 -3
          62803 1990 .    . 1997  .  -7 -3 -3
          62803 1991 0    . 1997 -6  -6 -3 -3
          62803 1992 0    . 1997 -5  -5 -3 -3
          62803 1993 0    . 1997 -4  -4 -3 -3
          62803 1994 0    . 1997 -3  -3 -3 -3
          62803 1995 0    . 1997 -2  -2 -2 -2
          62803 1996 0    . 1997 -1  -1 -1 -1
          62803 1997 1 1997 1997  0   0  0 0
          62803 1998 0    . 1997  1   1  1 1
          62803 1999 1 1999 1997  2   2  2 .
          62803 2000 0    . 1997  3   3  3 .
          62803 2001 0    . 1997  4   4  4 .
          62803 2002 .    . 1997  .   5  5 .
          62803 2003 0    . 1997  6   6  5 .
          62803 2004 0    . 1997  7   7  5 .
          62803 2005 0    . 1997  8   8  5 .
          62803 2006 0    . 1997  9   9  5 .
          62803 2007 0    . 1997 10  10  5 -1
          62803 2008 1    . 1997 11  11  5 0
          62803 2009 0    . 1997 12  12  5 1
          62803 2010 0    . 1997 13  13  5 2
          62803 2011 0    . 1997 14  14  5 3
          62803 2012 0    . 1997 15  15  5 -1
          62803 2013 1 2012 1997 16  16  5 0
          62803 2014 0    . 1997 17  17  5 1
          end
          label values pid pid
          label values syear syear
          Also, I noticed that your suggestion for the "wanted" variable does not seem to work properly. I just found this
          Click image for larger version

Name:	stata.jpg
Views:	1
Size:	124.0 KB
ID:	1668078







          in the dataset.

          Thanks already now for your help!
          Last edited by Kathrin Me; 06 Jun 2022, 10:36.

          Comment


          • #6
            I had to correct it once more, now the example should be right.

            Comment


            • #7
              In case there is a third job loss which is more than 5 years ago, event3 should again be zero in the respective year, start counting from then onwards, and replaces the year before job loss by -1.
              Why does the counter reset when the last job loss from the perspective of 2013 was exactly 5 years ago in 2008?


              62803 2007 0 . 1997 10 10 5 -1
              62803 2008 1 . 1997 11 11 5 0
              62803 2009 0 . 1997 12 12 5 1
              62803 2010 0 . 1997 13 13 5 2
              62803 2011 0 . 1997 14 14 5 3
              62803 2012 0 . 1997 15 15 5 -1
              62803 2013 1 2012 1997 16 16 5 0
              62803 2014 0 . 1997 17 17 5 1
              end
              Also, I noticed that your suggestion for the "wanted" variable does not seem to work properly. I just found this
              This is not reproduced with your data example above. You can provide an alternative example that exhibits the problem.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input double(pid syear) float(jobloss_invol treat_year treatyear_min event0 event1 event2 event3)
              62803 1984 .    . 1997  . -13 -3 -3
              62803 1985 .    . 1997  . -12 -3 -3
              62803 1986 .    . 1997  . -11 -3 -3
              62803 1987 .    . 1997  . -10 -3 -3
              62803 1988 .    . 1997  .  -9 -3 -3
              62803 1989 .    . 1997  .  -8 -3 -3
              62803 1990 .    . 1997  .  -7 -3 -3
              62803 1991 0    . 1997 -6  -6 -3 -3
              62803 1992 0    . 1997 -5  -5 -3 -3
              62803 1993 0    . 1997 -4  -4 -3 -3
              62803 1994 0    . 1997 -3  -3 -3 -3
              62803 1995 0    . 1997 -2  -2 -2 -2
              62803 1996 0    . 1997 -1  -1 -1 -1
              62803 1997 1 1997 1997  0   0  0  0
              62803 1998 0    . 1997  1   1  1  1
              62803 1999 1 1999 1997  2   2  2  .
              62803 2000 0    . 1997  3   3  3  .
              62803 2001 0    . 1997  4   4  4  .
              62803 2002 .    . 1997  .   5  5  .
              62803 2003 0    . 1997  6   6  5  .
              62803 2004 0    . 1997  7   7  5  .
              62803 2005 0    . 1997  8   8  5  .
              62803 2006 0    . 1997  9   9  5  .
              62803 2007 0    . 1997 10  10  5 -1
              62803 2008 1    . 1997 11  11  5  0
              62803 2009 0    . 1997 12  12  5  1
              62803 2010 0    . 1997 13  13  5  2
              62803 2011 0    . 1997 14  14  5  3
              62803 2013 1 2012 1997 16  16  5  0
              62803 2014 0    . 1997 17  17  5  1
              end
              label values pid pid
              label values syear syear
              
              bys pid (syear): gen wanted= sum(job)
              list pid syear job wanted, sepby(pid)
              Res.:

              Code:
              . list pid syear job wanted, sepby(pid)
              
                   +-----------------------------------+
                   |   pid   syear   joblos~l   wanted |
                   |-----------------------------------|
                1. | 62803    1984          .        0 |
                2. | 62803    1985          .        0 |
                3. | 62803    1986          .        0 |
                4. | 62803    1987          .        0 |
                5. | 62803    1988          .        0 |
                6. | 62803    1989          .        0 |
                7. | 62803    1990          .        0 |
                8. | 62803    1991          0        0 |
                9. | 62803    1992          0        0 |
               10. | 62803    1993          0        0 |
               11. | 62803    1994          0        0 |
               12. | 62803    1995          0        0 |
               13. | 62803    1996          0        0 |
               14. | 62803    1997          1        1 |
               15. | 62803    1998          0        1 |
               16. | 62803    1999          1        2 |
               17. | 62803    2000          0        2 |
               18. | 62803    2001          0        2 |
               19. | 62803    2002          .        2 |
               20. | 62803    2003          0        2 |
               21. | 62803    2004          0        2 |
               22. | 62803    2005          0        2 |
               23. | 62803    2006          0        2 |
               24. | 62803    2007          0        2 |
               25. | 62803    2008          1        3 |
               26. | 62803    2009          0        3 |
               27. | 62803    2010          0        3 |
               28. | 62803    2011          0        3 |
               29. | 62803    2013          1        4 |
               30. | 62803    2014          0        4 |
                   +-----------------------------------+
              
              .

              Comment


              • #8
                The counter should reset when it is 5 or more years ago. Maybe I will also change this number later to 6 (or more) or 7 (or more) years. I chose 5 just for the purpose of the example. The reason is that I will look at dynamic treatment effects in the (5) years after job loss and I want to make sure that the observation is not used for the pre-treatment period effect and for the post-treatment period effect at the same time.

                Also, you might be right with the wanted variable. Today, I cannot find anything wrong with the wanted variable. I am not sure why I found this example yesterday...

                Comment


                • #9
                  I still do not follow the logic, but this gives you the variable in your data example.

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input double(pid syear) float(jobloss_invol treat_year treatyear_min event0 event1 event2 event3)
                  62803 1984 .    . 1997  . -13 -3 -3
                  62803 1985 .    . 1997  . -12 -3 -3
                  62803 1986 .    . 1997  . -11 -3 -3
                  62803 1987 .    . 1997  . -10 -3 -3
                  62803 1988 .    . 1997  .  -9 -3 -3
                  62803 1989 .    . 1997  .  -8 -3 -3
                  62803 1990 .    . 1997  .  -7 -3 -3
                  62803 1991 0    . 1997 -6  -6 -3 -3
                  62803 1992 0    . 1997 -5  -5 -3 -3
                  62803 1993 0    . 1997 -4  -4 -3 -3
                  62803 1994 0    . 1997 -3  -3 -3 -3
                  62803 1995 0    . 1997 -2  -2 -2 -2
                  62803 1996 0    . 1997 -1  -1 -1 -1
                  62803 1997 1 1997 1997  0   0  0  0
                  62803 1998 0    . 1997  1   1  1  1
                  62803 1999 1 1999 1997  2   2  2  .
                  62803 2000 0    . 1997  3   3  3  .
                  62803 2001 0    . 1997  4   4  4  .
                  62803 2002 .    . 1997  .   5  5  .
                  62803 2003 0    . 1997  6   6  5  .
                  62803 2004 0    . 1997  7   7  5  .
                  62803 2005 0    . 1997  8   8  5  .
                  62803 2006 0    . 1997  9   9  5  .
                  62803 2007 0    . 1997 10  10  5 -1
                  62803 2008 1    . 1997 11  11  5  0
                  62803 2009 0    . 1997 12  12  5  1
                  62803 2010 0    . 1997 13  13  5  2
                  62803 2011 0    . 1997 14  14  5  3
                  62803 2013 1 2012 1997 16  16  5  0
                  62803 2014 0    . 1997 17  17  5  1
                  end
                  label values pid pid
                  label values syear syear
                  
                  
                  gen wanted= 0 if job==1
                  xtset pid syear
                  replace wanted=. if !l1.wanted|!l2.wanted|!l3.wanted|!l4.wanted
                  replace wanted= l.wanted+1 if !missing(l.wanted) & missing(wanted) 
                  replace wanted=-1 if f.job==1 & !f.wanted
                  bys pid (syear): g seq= sum(wanted==0)
                  bys pid seq: replace wanted=. if _N>5 & wanted>=2
                  gsort pid -syear
                  by pid: replace wanted= cond(wanted[_n-1]<=-3, -3, wanted[_n-1]-1) if missing(wanted) & !missing(wanted[_n-1]) &!seq
                  sort pid syear
                  Res.:

                  Code:
                  . sort pid syear
                  
                  . l pid syear jobloss_invol treat_year event3 wanted, sepby(pid)
                  
                       +-------------------------------------------------------+
                       |   pid   syear   joblos~l   treat_~r   event3   wanted |
                       |-------------------------------------------------------|
                    1. | 62803    1984          .          .       -3       -3 |
                    2. | 62803    1985          .          .       -3       -3 |
                    3. | 62803    1986          .          .       -3       -3 |
                    4. | 62803    1987          .          .       -3       -3 |
                    5. | 62803    1988          .          .       -3       -3 |
                    6. | 62803    1989          .          .       -3       -3 |
                    7. | 62803    1990          .          .       -3       -3 |
                    8. | 62803    1991          0          .       -3       -3 |
                    9. | 62803    1992          0          .       -3       -3 |
                   10. | 62803    1993          0          .       -3       -3 |
                   11. | 62803    1994          0          .       -3       -3 |
                   12. | 62803    1995          0          .       -2       -2 |
                   13. | 62803    1996          0          .       -1       -1 |
                   14. | 62803    1997          1       1997        0        0 |
                   15. | 62803    1998          0          .        1        1 |
                   16. | 62803    1999          1       1999        .        . |
                   17. | 62803    2000          0          .        .        . |
                   18. | 62803    2001          0          .        .        . |
                   19. | 62803    2002          .          .        .        . |
                   20. | 62803    2003          0          .        .        . |
                   21. | 62803    2004          0          .        .        . |
                   22. | 62803    2005          0          .        .        . |
                   23. | 62803    2006          0          .        .        . |
                   24. | 62803    2007          0          .       -1       -1 |
                   25. | 62803    2008          1          .        0        0 |
                   26. | 62803    2009          0          .        1        1 |
                   27. | 62803    2010          0          .        2        2 |
                   28. | 62803    2011          0          .        3        3 |
                   29. | 62803    2013          1       2012        0        0 |
                   30. | 62803    2014          0          .        1        1 |
                       +-------------------------------------------------------+
                  
                  .

                  Comment


                  • #10
                    Thanks, I managed now

                    Comment

                    Working...
                    X