  • switch v1 and v2 conditioning on ...

    Dear All, I found this question here (http://bbs.pinggu.org/forum.php?mod=...=1#pid52040796). Suppose that the data set is
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 fdschool2 str1 fdschool3 int(fdsch2begin fdsch3begin)
    "A" "B" 15769 17776
    ""  ""      .     .
    ""  ""      .     .
    "C" "D" 17410 19418
    "E" ""  17045     .
    "F" ""  15223     .
    "G" "H" 11932 13027
    ""  "I"     . 15027
    end
    format %tdnn/dd/CCYY fdsch2begin
    format %tdnn/dd/CCYY fdsch3begin
    I would like to apply the following rules:
    1. If `fdsch2begin' > `fdsch3begin', switch `fdschool2' and `fdschool3'.
    2. If only `fdsch2begin' is missing, replace `fdsch2begin' and `fdschool2' with `fdsch3begin' and `fdschool3', respectively.
    3. If only `fdsch3begin' is missing, do nothing.
    4. If both `fdsch2begin' and `fdsch3begin' are missing, do nothing.
    Any suggestions are highly appreciated.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    There are various solutions here. One is to see that you'd be better off with a long layout (data structure):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 fdschool2 str1 fdschool3 int(fdsch2begin fdsch3begin)
    "A" "B" 15769 17776
    ""  ""      .     .
    ""  ""      .     .
    "C" "D" 17410 19418
    "E" ""  17045     .
    "F" ""  15223     .
    "G" "H" 11932 13027
    ""  "I"     . 15027
    end
    format %tdnn/dd/CCYY fdsch2begin
    format %tdnn/dd/CCYY fdsch3begin
    
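    * row identifier, needed as i() for reshape
    gen long id = _n
    
    * move the numeric suffix to the end so that both stubs end in 2 and 3
    rename (fdsch*begin) (fdschbegin*)
    
    reshape long fdschool fdschbegin, i(id) j(seq)
    
    * renumber within id by start date; missing dates sort last
    bysort id (fdschbegin) : replace seq = _n - 1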
    
    list, sepby(id)
    
         +---------------------------------+
         | id   seq   fdschool   fdschbe~n |
         |---------------------------------|
      1. |  1     0          A    3/5/2003 |
      2. |  1     1          B    9/1/2008 |
         |---------------------------------|
      3. |  2     0                      . |
      4. |  2     1                      . |
         |---------------------------------|
      5. |  3     0                      . |
      6. |  3     1                      . |
         |---------------------------------|
      7. |  4     0          C    9/1/2007 |
      8. |  4     1          D    3/1/2013 |
         |---------------------------------|
      9. |  5     0          E    9/1/2006 |
     10. |  5     1                      . |
         |---------------------------------|
     11. |  6     0          F    9/5/2001 |
     12. |  6     1                      . |
         |---------------------------------|
     13. |  7     0          G    9/1/1992 |
     14. |  7     1          H    9/1/1995 |
         |---------------------------------|
     15. |  8     0          I   2/21/2001 |
     16. |  8     1                      . |
         +---------------------------------+
    I am going to recommend that solution.
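
    The sort step above relies on how Stata orders missing values: numeric missing (.) counts as larger than any nonmissing number, so missing dates go last within each id. A minimal check of that behaviour, added here as an illustration rather than taken from the original answer:

    ```stata
    * numeric missing is treated as larger than any nonmissing value
    * shows 1: a missing date sorts after 15027 (2/21/2001)
    display (. > 15027)
    * shows 0: a nonmissing date sorts before a missing one
    display (17045 > .)
    * shows 0: two missing values tie
    display (. > .)
    ```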

    Otherwise, you have pretty much written the recipe, so the coding follows, say

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 fdschool2 str1 fdschool3 int(fdsch2begin fdsch3begin)
    "A" "B" 15769 17776
    ""  ""      .     .
    ""  ""      .     .
    "C" "D" 17410 19418
    "E" ""  17045     .
    "F" ""  15223     .
    "G" "H" 11932 13027
    ""  "I"     . 15027
    end
    format %tdnn/dd/CCYY fdsch2begin
    format %tdnn/dd/CCYY fdsch3begin
    
    * keep the order if date 2 precedes date 3, otherwise swap;
    * a missing fdsch2begin counts as larger than any date, so it is swapped too
    gen FDSCHOOL2 = cond(fdsch2begin < fdsch3begin, fdschool2, fdschool3)
    gen FDSCHOOL3 = cond(fdsch2begin < fdsch3begin, fdschool3, fdschool2)
    gen FDSC2BEGIN = cond(fdsch2begin < fdsch3begin, fdsch2begin, fdsch3begin)
    gen FDSC3BEGIN = cond(fdsch2begin < fdsch3begin, fdsch3begin, fdsch2begin)
    
    drop fd*
    rename FD*, lower
    format *begin %tdnn/dd/CCYY
    
    list
    
         +--------------------------------------------+
         | fdscho~2   fdscho~3   fdsc2be~n   fdsc3b~n |
         |--------------------------------------------|
      1. |        A          B    3/5/2003   9/1/2008 |
      2. |                               .          . |
      3. |                               .          . |
      4. |        C          D    9/1/2007   3/1/2013 |
      5. |        E               9/1/2006          . |
         |--------------------------------------------|
      6. |        F               9/5/2001          . |
      7. |        G          H    9/1/1992   9/1/1995 |
      8. |        I              2/21/2001          . |
         +--------------------------------------------+
    See also swapval (SSC).
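
    The four rules in #1 can also be applied in place, without creating new variables. What follows is only a sketch, assuming the example data from #1; toswap, tmpschool and tmpbegin are hypothetical helper names that do not appear in the original posts. Because missing counts as larger than any number, the single condition fdsch2begin > fdsch3begin is true exactly for rules 1 and 2, and false for rules 3 and 4.

    ```stata
    clear
    input str4 fdschool2 str1 fdschool3 int(fdsch2begin fdsch3begin)
    "A" "B" 15769 17776
    ""  ""      .     .
    ""  ""      .     .
    "C" "D" 17410 19418
    "E" ""  17045     .
    "F" ""  15223     .
    "G" "H" 11932 13027
    ""  "I"     . 15027
    end
    format %tdnn/dd/CCYY fdsch2begin fdsch3begin

    * flag rows to swap (hypothetical helper variable):
    * true if date 2 is later than date 3, or if only date 2 is missing
    gen byte toswap = fdsch2begin > fdsch3begin

    * hold the original second-school values (hypothetical helper variables)
    clonevar tmpschool = fdschool2
    clonevar tmpbegin  = fdsch2begin

    replace fdschool2   = fdschool3   if toswap
    replace fdsch2begin = fdsch3begin if toswap
    replace fdschool3   = tmpschool   if toswap
    replace fdsch3begin = tmpbegin    if toswap

    drop toswap tmpschool tmpbegin
    list
    ```

    Unlike the cond() version, this leaves rows with equal dates or with both dates missing untouched, and it keeps the original variable names.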


    EDIT: Fixed typo as notified in #3.
    Last edited by Nick Cox; 04 Jul 2018, 03:33.

    • #3
      Dear Nick, Many thanks for the suggestions.
      1. For the first solution, I guess I need to reshape back to wide afterwards.
      2. For the second solution, there is a typo in this line:
      Code:
      gen FDSCHOOL3 = cond(fdsch2begin < fdsch3begin, fdschool3, fdschool"2")
      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)

      • #4
        You're quite right about the typo. Thanks, and fixed.

        But no: you are usually better off staying in the long layout rather than reshaping back to wide, as long is the Stata standard for longitudinal data.
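
        If a later step really does require the wide layout, the reshape can be reversed. This is a sketch under assumptions, not part of the advice above: seq is renumbered as _n + 1 rather than _n - 1 (a change from the code in #2) so that the wide variables come back with the suffixes 2 and 3.

        ```stata
        clear
        input str4 fdschool2 str1 fdschool3 int(fdsch2begin fdsch3begin)
        "A" "B" 15769 17776
        ""  ""      .     .
        ""  ""      .     .
        "C" "D" 17410 19418
        "E" ""  17045     .
        "F" ""  15223     .
        "G" "H" 11932 13027
        ""  "I"     . 15027
        end
        format %tdnn/dd/CCYY fdsch2begin fdsch3begin

        gen long id = _n
        rename (fdsch*begin) (fdschbegin*)
        reshape long fdschool fdschbegin, i(id) j(seq)

        * assumption: number the sorted rows 2 and 3 so the wide names match #1
        bysort id (fdschbegin) : replace seq = _n + 1

        * back to one row per id, then restore the original date names
        reshape wide fdschool fdschbegin, i(id) j(seq)
        rename (fdschbegin*) (fdsch*begin)

        list
        ```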

        • #5
          Dear Nick, Thanks for the confirmation and suggestion.
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)
