Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify first non zero value, create new variable

    Hi All,

    My data is individual level data. There are a series of variables, with "date" and then six numbers representing the month and year. So for example, date011990 means January of 1990, date 021990 mean February 1990, etc.

    For an individual, these usually start with zeros, and at some point the value is not zero, which is what I am interested in (month and year payment first started being received). I want to locate the first time the value is not zero, and pull out that information from the variable name.

    ID date011990 date021990 ..... date041995 date051995 date061995 ..... date121997 date011998 date021998
    70 0 0 0 940 940 1010 1010 1010
    71 0 750 830 830 830 890 890 890
    72 0 0 0 0 0 0 1230 1230

    For each ID, I want two to create new variables "Month" and "Year"

    Person 70, month = 05, year =1995
    Person 71 month = 02, year = 1990
    Person 72 month = 01, year =1998

    This is the end product I want:


    ID date011990 date021990 ..... date041995 date051995 date061995 ..... date121997 date011998 date021998 month year
    70 0 0 0 940 940 1010 1010 1010 05 1995
    71 0 750 830 830 830 890 890 890 02 1990
    72 0 0 0 0 0 0 1230 1230 01 1998

    Thanks!

  • #2
    The formatting of my dates and values got messed up above

    What I have:
    ID date011990 date021990 ... date041995 date051995 date061995 ... date121997 date011998 date021998
    70 0 0 0 940 940 1010 1010 1010
    71 0 750 830 830 830 890 890 890
    72 0 0 0 0 0 0 1230 1230
    What I want:
    ID date011990 date021990 ... date041995 date051995 date061995 ... date121997 date011998 date021998 month year
    70 0 0 0 940 940 1010 1010 1010 05 1995
    71 0 750 830 830 830 890 890 890 02 1990
    72 0 0 0 0 0 0 1230 1230 01 1998

    Comment


    • #3
      Please use dataex to present your data.

      Comment


      • #4
        Jared Greathouse my data is restricted, so the above example was just quickly created by me to demonstrate what I want to do.

        Comment


        • #5
          Note that the issue of confidential data is already addressed in FAQ Advice #12, and does not absolve you from the requirement to present reproducible data examples using the dataex command. reshape long the data, create a time variable recognized by Stata and proceed as below. You could also loop in wide layout assuming the variable order reflects the dates, but I choose not to illustrate such a solution as data management is for the most part efficient in long layout.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input byte company int year byte value
          1 2000  0
          1 2001  0
          1 2002 10
          1 2003 11
          1 2004 15
          2 2000  0
          2 2001  0
          2 2002  0
          2 2003  0
          2 2004 12
          3 2000 0
          3 2001 18
          3 2002 17
          3 2003 15
          3 2004 19
          end
          bys company (year): gen wanted= cond(value!=0, year, .)
          bys company (wanted): replace wanted= wanted[1]
          Res.:

          Code:
          . sort c y
          
          . l, sepby(c)
          
               +---------------------------------+
               | company   year   value   wanted |
               |---------------------------------|
            1. |       1   2000       0     2002 |
            2. |       1   2001       0     2002 |
            3. |       1   2002      10     2002 |
            4. |       1   2003      11     2002 |
            5. |       1   2004      15     2002 |
               |---------------------------------|
            6. |       2   2000       0     2004 |
            7. |       2   2001       0     2004 |
            8. |       2   2002       0     2004 |
            9. |       2   2003       0     2004 |
           10. |       2   2004      12     2004 |
               |---------------------------------|
           11. |       3   2000       0     2001 |
           12. |       3   2001      18     2001 |
           13. |       3   2002      17     2001 |
           14. |       3   2003      15     2001 |
           15. |       3   2004      19     2001 |
               +---------------------------------+

          Comment


          • #6
            Another way to do it:

            Code:
            bysort company: egen wanted = min(cond(value != 0, year, .))
            For discussion,

            Code:
            search dm0055, entry
            and see Section 9 of that paper.
            Last edited by Nick Cox; 06 Jul 2022, 00:01.

            Comment

            Working...
            X