Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Question on generating a new variable

    Hello all,

    I have a simple question about generating a new variable in the sample data below.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 firm int year byte(var1 var2)
    "A" 2000 . 1
    "A" 2001 5 1
    "A" 2002 5 1
    "A" 2003 5 0
    "A" 2004 5 0
    "A" 2005 5 0
    "A" 2006 5 0
    "A" 2007 5 0
    "A" 2008 5 0
    "A" 2009 5 0
    "A" 2010 5 0
    "A" 2011 5 0
    "A" 2012 5 0
    "A" 2013 5 0
    "A" 2014 5 0
    "B" 2000 . 1
    "B" 2001 . 1
    "B" 2002 . 1
    "B" 2003 3 0
    "B" 2004 4 0
    "B" 2005 4 0
    "B" 2006 4 0
    "B" 2007 4 0
    "B" 2008 4 0
    "B" 2009 4 0
    "B" 2010 4 0
    "B" 2011 4 0
    "B" 2012 4 0
    "B" 2013 4 0
    "B" 2014 4 0
    "C" 2000 . 1
    "C" 2001 . 1
    "C" 2002 5 1
    "C" 2003 5 0
    "C" 2004 6 0
    "C" 2005 6 0
    "C" 2006 6 0
    "C" 2007 6 0
    "C" 2008 6 0
    "C" 2009 6 0
    "C" 2010 6 0
    "C" 2011 6 0
    "C" 2012 6 0
    "C" 2013 6 0
    "C" 2014 6 0
    end

    How to create a new variable that contains the first non-missing value of var1 when var2==1. For firm A above, the value of the new variable should be 5 for all years. The new variable takes a missing value if all var1 is missing when var2==1. I appreciate your help, I spent too long figuring this out.

    Best regards,

    Abdan

  • #2
    Two ways to do it:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 firm int year byte(var1 var2)
    "A" 2000 . 1
    "A" 2001 5 1
    "A" 2002 5 1
    "A" 2003 5 0
    "A" 2004 5 0
    "A" 2005 5 0
    "A" 2006 5 0
    "A" 2007 5 0
    "A" 2008 5 0
    "A" 2009 5 0
    "A" 2010 5 0
    "A" 2011 5 0
    "A" 2012 5 0
    "A" 2013 5 0
    "A" 2014 5 0
    "B" 2000 . 1
    "B" 2001 . 1
    "B" 2002 . 1
    "B" 2003 3 0
    "B" 2004 4 0
    "B" 2005 4 0
    "B" 2006 4 0
    "B" 2007 4 0
    "B" 2008 4 0
    "B" 2009 4 0
    "B" 2010 4 0
    "B" 2011 4 0
    "B" 2012 4 0
    "B" 2013 4 0
    "B" 2014 4 0
    "C" 2000 . 1
    "C" 2001 . 1
    "C" 2002 5 1
    "C" 2003 5 0
    "C" 2004 6 0
    "C" 2005 6 0
    "C" 2006 6 0
    "C" 2007 6 0
    "C" 2008 6 0
    "C" 2009 6 0
    "C" 2010 6 0
    "C" 2011 6 0
    "C" 2012 6 0
    "C" 2013 6 0
    "C" 2014 6 0
    end
    
    bysort firm: egen when = min(cond(var2 == 1 &  var1 < . , year, .))
    
    by firm: egen what = mean(cond(year == when, var1, .)) 
    
    gen WHAT = var1 if var2 == 1 
    
    bysort firm (year) : replace WHAT = WHAT[_n-1] if WHAT[_n-1] < . 
    
    bysort firm (WHAT) : replace WHAT = WHAT[1]
    
    sort firm year
    
    list, sepby(firm)
         +------------------------------------------------+
         | firm   year   var1   var2   when   what   WHAT |
         |------------------------------------------------|
      1. |    A   2000      .      1   2001      5      5 |
      2. |    A   2001      5      1   2001      5      5 |
      3. |    A   2002      5      1   2001      5      5 |
      4. |    A   2003      5      0   2001      5      5 |
      5. |    A   2004      5      0   2001      5      5 |
      6. |    A   2005      5      0   2001      5      5 |
      7. |    A   2006      5      0   2001      5      5 |
      8. |    A   2007      5      0   2001      5      5 |
      9. |    A   2008      5      0   2001      5      5 |
     10. |    A   2009      5      0   2001      5      5 |
     11. |    A   2010      5      0   2001      5      5 |
     12. |    A   2011      5      0   2001      5      5 |
     13. |    A   2012      5      0   2001      5      5 |
     14. |    A   2013      5      0   2001      5      5 |
     15. |    A   2014      5      0   2001      5      5 |
         |------------------------------------------------|
     16. |    B   2000      .      1      .      .      . |
     17. |    B   2001      .      1      .      .      . |
     18. |    B   2002      .      1      .      .      . |
     19. |    B   2003      3      0      .      .      . |
     20. |    B   2004      4      0      .      .      . |
     21. |    B   2005      4      0      .      .      . |
     22. |    B   2006      4      0      .      .      . |
     23. |    B   2007      4      0      .      .      . |
     24. |    B   2008      4      0      .      .      . |
     25. |    B   2009      4      0      .      .      . |
     26. |    B   2010      4      0      .      .      . |
     27. |    B   2011      4      0      .      .      . |
     28. |    B   2012      4      0      .      .      . |
     29. |    B   2013      4      0      .      .      . |
     30. |    B   2014      4      0      .      .      . |
         |------------------------------------------------|
     31. |    C   2000      .      1   2002      5      5 |
     32. |    C   2001      .      1   2002      5      5 |
     33. |    C   2002      5      1   2002      5      5 |
     34. |    C   2003      5      0   2002      5      5 |
     35. |    C   2004      6      0   2002      5      5 |
     36. |    C   2005      6      0   2002      5      5 |
     37. |    C   2006      6      0   2002      5      5 |
     38. |    C   2007      6      0   2002      5      5 |
     39. |    C   2008      6      0   2002      5      5 |
     40. |    C   2009      6      0   2002      5      5 |
     41. |    C   2010      6      0   2002      5      5 |
     42. |    C   2011      6      0   2002      5      5 |
     43. |    C   2012      6      0   2002      5      5 |
     44. |    C   2013      6      0   2002      5      5 |
     45. |    C   2014      6      0   2002      5      5 |
         +------------------------------------------------+
    The first solution follows Section 9 in https://journals.sagepub.com/doi/pdf...867X1101100210

    Comment


    • #3
      Thank you very much, Nick!

      Comment

      Working...
      X