Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create a variables based on consecutive years of observation

    Dear Stata Users,
    Please, help me to resolve the following issue. Below is the sample of data that I have, where “gvkey” is firm identifier, year is “fyear” and “tag5” is the number of consecutive years during which a firm was exporting its products. I need to create the following variables – firm_1, firm_2, firm_3, firm_4 and firm_5. Where “firm_1” is equal to “1” if a firm has one consecutive year of “tag5”. For example, for gvkey= 001013 “firm_1”==1 for year 2010 and zero otherwise. For gvkey = 001076 “firm_2” is equal to 1 for years=2004 and 2005. Firm_3 is for three consecutive years of “tag5” (1,2,3), Firm_4 – for 4 years of tag, Firm_5 for 5 and more consecutive years of tag5 (1,2,3,4,5,6,7,..etc). Please, help me resolve this issue.

    The final answer is expected to look like in the last sample of data:



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey double fyear float tag5
    "001013" 1990  .
    "001013" 1991  .
    "001013" 1992  1
    "001013" 1993  2
    "001013" 1994  3
    "001013" 1995  4
    "001013" 1996  .
    "001013" 1997  .
    "001013" 1998  .
    "001013" 1999  .
    "001013" 2000  .
    "001013" 2001  .
    "001013" 2002  .
    "001013" 2003  .
    "001013" 2004  .
    "001013" 2005  .
    "001013" 2006  .
    "001013" 2007  .
    "001013" 2008  .
    "001013" 2009  .
    "001013" 2010  1
    "001076" 1994  .
    "001076" 1998  .
    "001076" 2004  1
    "001076" 2005  2
    "001076" 2006  .
    "001076" 2007  .
    "001076" 2008  .
    "001078" 1990  1
    "001078" 1991  2
    "001078" 1992  3
    "001078" 1993  4
    "001078" 1994  5
    "001078" 1995  6
    "001078" 1996  7
    "001078" 1997  8
    "001078" 1998  9
    "001078" 1999 10
    "001078" 2000  .
    "001078" 2001  .
    "001078" 2002  .
    "001078" 2003  1
    "001078" 2004  2
    "001078" 2005  3
    end


    ANSWER:



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey double fyear float(tag5 firm_1 firm_2 firm_3 firm_4 firm_5)
    "001013" 1990  . 0 0 0 0 0
    "001013" 1991  . 0 0 0 0 0
    "001013" 1992  1 0 0 0 1 0
    "001013" 1993  2 0 0 0 1 0
    "001013" 1994  3 0 0 0 1 0
    "001013" 1995  4 0 0 0 1 0
    "001013" 1996  . 0 0 0 0 0
    "001013" 1997  . 0 0 0 0 0
    "001013" 1998  . 0 0 0 0 0
    "001013" 1999  . 0 0 0 0 0
    "001013" 2000  . 0 0 0 0 0
    "001013" 2001  . 0 0 0 0 0
    "001013" 2002  . 0 0 0 0 0
    "001013" 2003  . 0 0 0 0 0
    "001013" 2004  . 0 0 0 0 0
    "001013" 2005  . 0 0 0 0 0
    "001013" 2006  . 0 0 0 0 0
    "001013" 2007  . 0 0 0 0 0
    "001013" 2008  . 0 0 0 0 0
    "001013" 2009  . 0 0 0 0 0
    "001013" 2010  1 1 0 0 0 0
    "001076" 1994  . 0 0 0 0 0
    "001076" 1998  . 0 0 0 0 0
    "001076" 2004  1 0 1 0 0 0
    "001076" 2005  2 0 1 0 0 0
    "001076" 2006  . 0 0 0 0 0
    "001076" 2007  . 0 0 0 0 0
    "001076" 2008  . 0 0 0 0 0
    "001078" 1990  1 0 0 0 0 1
    "001078" 1991  2 0 0 0 0 1
    "001078" 1992  3 0 0 0 0 1
    "001078" 1993  4 0 0 0 0 1
    "001078" 1994  5 0 0 0 0 1
    "001078" 1995  6 0 0 0 0 1
    "001078" 1996  7 0 0 0 0 1
    "001078" 1997  8 0 0 0 0 1
    "001078" 1998  9 0 0 0 0 1
    "001078" 1999 10 0 0 0 0 1
    "001078" 2000  . 0 0 0 0 0
    "001078" 2001  . 0 0 0 0 0
    "001078" 2002  . 0 0 0 0 0
    "001078" 2003  1 0 0 1 0 0
    "001078" 2004  2 0 0 1 0 0
    "001078" 2005  3 0 0 1 0 0
    end
    Last edited by Alberto Alvarez; 21 May 2019, 07:54.

  • #2
    Code:
    levelsof gvkey, local(firms)
    local i=1
    foreach f in `firms'{
    gen firm_`i'=!missing(tag5)
    local ++i
    }

    Comment


    • #3
      Thank you for the answer! I tried your code, and it gives me a bit different results from what I expected to get. For example, for “firm_1” it should equal to 1 when “tag5” is non missing for only one consecutive observation. I updated my post with the expected solution that I aim to get. Please, can you elaborate a bit your solution.

      Comment


      • #4
        I think this may do what you want.
        Code:
        bysort gvkey (fyear): generate spell = sum( missing(tag5[_n-1]) & !missing(tag5) )
        replace spell = . if tag5==.
        bysort gvkey spell: generate firm_m = _N
        drop spell
        sort gvkey fyear
        forvalues i=1/5 {
            generate firm_`i' = firm_m==`i'
            }
        replace firm_5 = 1 if !missing(firm_m) & firm_m>5
        drop firm_m
        list, noobs sepby(gvkey)
        Code:
        . list, noobs sepby(gvkey)
        
          +--------------------------------------------------------------------+
          |  gvkey   fyear   tag5   firm_1   firm_2   firm_3   firm_4   firm_5 |
          |--------------------------------------------------------------------|
          | 001013    1990      .        0        0        0        0        1 |
          | 001013    1991      .        0        0        0        0        1 |
          | 001013    1992      1        0        0        0        1        0 |
          | 001013    1993      2        0        0        0        1        0 |
          | 001013    1994      3        0        0        0        1        0 |
          | 001013    1995      4        0        0        0        1        0 |
          | 001013    1996      .        0        0        0        0        1 |
          | 001013    1997      .        0        0        0        0        1 |
          | 001013    1998      .        0        0        0        0        1 |
          | 001013    1999      .        0        0        0        0        1 |
          | 001013    2000      .        0        0        0        0        1 |
          | 001013    2001      .        0        0        0        0        1 |
          | 001013    2002      .        0        0        0        0        1 |
          | 001013    2003      .        0        0        0        0        1 |
          | 001013    2004      .        0        0        0        0        1 |
          | 001013    2005      .        0        0        0        0        1 |
          | 001013    2006      .        0        0        0        0        1 |
          | 001013    2007      .        0        0        0        0        1 |
          | 001013    2008      .        0        0        0        0        1 |
          | 001013    2009      .        0        0        0        0        1 |
          | 001013    2010      1        1        0        0        0        0 |
          |--------------------------------------------------------------------|
          | 001076    1994      .        0        0        0        0        1 |
          | 001076    1998      .        0        0        0        0        1 |
          | 001076    2004      1        0        1        0        0        0 |
          | 001076    2005      2        0        1        0        0        0 |
          | 001076    2006      .        0        0        0        0        1 |
          | 001076    2007      .        0        0        0        0        1 |
          | 001076    2008      .        0        0        0        0        1 |
          |--------------------------------------------------------------------|
          | 001078    1990      1        0        0        0        0        1 |
          | 001078    1991      2        0        0        0        0        1 |
          | 001078    1992      3        0        0        0        0        1 |
          | 001078    1993      4        0        0        0        0        1 |
          | 001078    1994      5        0        0        0        0        1 |
          | 001078    1995      6        0        0        0        0        1 |
          | 001078    1996      7        0        0        0        0        1 |
          | 001078    1997      8        0        0        0        0        1 |
          | 001078    1998      9        0        0        0        0        1 |
          | 001078    1999     10        0        0        0        0        1 |
          | 001078    2000      .        0        0        1        0        0 |
          | 001078    2001      .        0        0        1        0        0 |
          | 001078    2002      .        0        0        1        0        0 |
          | 001078    2003      1        0        0        1        0        0 |
          | 001078    2004      2        0        0        1        0        0 |
          | 001078    2005      3        0        0        1        0        0 |
          +--------------------------------------------------------------------+

        Comment


        • #5
          Your earlier description was not clear. Here is another way:

          Code:
          gen order=_n
          gen tag= !missing(tag5)
          replace tag = sum(tag) if tag5==1 & tag==1
          replace tag=tag[_n-1] if tag!=0 & tag[_n-1]!=0
          bysort tag: egen tag2=max(tag5)
          sort order
          forval i=1/4{
          gen firm_`i'=tag2==`i'
          }
          gen firm5=tag2>=5 & !missing(tag2)
          Result:

          Code:
          
          . l, sepby(gvkey)
          
               +----------------------------------------------------------------------------------------+
               |  gvkey   fyear   tag5   order   tag   tag2   firm_1   firm_2   firm_3   firm_4   firm5 |
               |----------------------------------------------------------------------------------------|
            1. | 001013    1990      .       1     0      .        0        0        0        0       0 |
            2. | 001013    1991      .       2     0      .        0        0        0        0       0 |
            3. | 001013    1992      1       3     1      4        0        0        0        1       0 |
            4. | 001013    1993      2       4     1      4        0        0        0        1       0 |
            5. | 001013    1994      3       5     1      4        0        0        0        1       0 |
            6. | 001013    1995      4       6     1      4        0        0        0        1       0 |
            7. | 001013    1996      .       7     0      .        0        0        0        0       0 |
            8. | 001013    1997      .       8     0      .        0        0        0        0       0 |
            9. | 001013    1998      .       9     0      .        0        0        0        0       0 |
           10. | 001013    1999      .      10     0      .        0        0        0        0       0 |
           11. | 001013    2000      .      11     0      .        0        0        0        0       0 |
           12. | 001013    2001      .      12     0      .        0        0        0        0       0 |
           13. | 001013    2002      .      13     0      .        0        0        0        0       0 |
           14. | 001013    2003      .      14     0      .        0        0        0        0       0 |
           15. | 001013    2004      .      15     0      .        0        0        0        0       0 |
           16. | 001013    2005      .      16     0      .        0        0        0        0       0 |
           17. | 001013    2006      .      17     0      .        0        0        0        0       0 |
           18. | 001013    2007      .      18     0      .        0        0        0        0       0 |
           19. | 001013    2008      .      19     0      .        0        0        0        0       0 |
           20. | 001013    2009      .      20     0      .        0        0        0        0       0 |
           21. | 001013    2010      1      21     2      1        1        0        0        0       0 |
               |----------------------------------------------------------------------------------------|
           22. | 001076    1994      .      22     0      .        0        0        0        0       0 |
           23. | 001076    1998      .      23     0      .        0        0        0        0       0 |
           24. | 001076    2004      1      24     3      2        0        1        0        0       0 |
           25. | 001076    2005      2      25     3      2        0        1        0        0       0 |
           26. | 001076    2006      .      26     0      .        0        0        0        0       0 |
           27. | 001076    2007      .      27     0      .        0        0        0        0       0 |
           28. | 001076    2008      .      28     0      .        0        0        0        0       0 |
               |----------------------------------------------------------------------------------------|
           29. | 001078    1990      1      29     4     10        0        0        0        0       1 |
           30. | 001078    1991      2      30     4     10        0        0        0        0       1 |
           31. | 001078    1992      3      31     4     10        0        0        0        0       1 |
           32. | 001078    1993      4      32     4     10        0        0        0        0       1 |
           33. | 001078    1994      5      33     4     10        0        0        0        0       1 |
           34. | 001078    1995      6      34     4     10        0        0        0        0       1 |
           35. | 001078    1996      7      35     4     10        0        0        0        0       1 |
           36. | 001078    1997      8      36     4     10        0        0        0        0       1 |
           37. | 001078    1998      9      37     4     10        0        0        0        0       1 |
           38. | 001078    1999     10      38     4     10        0        0        0        0       1 |
           39. | 001078    2000      .      39     0      .        0        0        0        0       0 |
           40. | 001078    2001      .      40     0      .        0        0        0        0       0 |
           41. | 001078    2002      .      41     0      .        0        0        0        0       0 |
           42. | 001078    2003      1      42     5      3        0        0        1        0       0 |
           43. | 001078    2004      2      43     5      3        0        0        1        0       0 |
           44. | 001078    2005      3      44     5      3        0        0        1        0       0 |
               +----------------------------------------------------------------------------------------+
          Last edited by Andrew Musau; 21 May 2019, 10:13.

          Comment


          • #6
            Thank you! It works perfectly!

            Comment

            Working...
            X