
  • How to Create new ordervar in Sequence Analysis

    Dear Experts,

    I want to perform sequence analysis on individual migration histories. However, I have difficulty creating an "order variable" that contains the sequence of ages at which migration happened. Every id in my dataset has a different starting point in terms of migration age, so I want to standardize the ages as an order variable within each id. Here is an example:
    id age destination
    1 12 village
    1 13 city
    1 15 city
    2 14 village
    2 35 village
    2 40 city
    3 15 city
    3 20 city
    4 35 village

    So I want to create a new variable that represents age in a standard form (running from 12 to 40 years old within each id), as follows:
    id age destination
    1 12 village
    1 13 city
    1 14 ..
    1 15 city
    1 16 ...
    1 ...(17 to 39) ...
    1 40 ...
    2 12 ...
    2 13 ...
    2 14 village
    2 ...(15 to 34) ...
    2 35 village
    2 ...(36 to 39) ...
    2 40 city
    and so on...

    Can anyone help me create this dataset?

    Thank you in advance, and I'm sorry for the trivial question.

    Best regards
    nang

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id age) str7 destination
    1 12 "village"
    1 13 "city"   
    1 15 "city"   
    2 14 "village"
    2 35 "village"
    2 40 "city"   
    3 15 "city"   
    3 20 "city"   
    4 35 "village"
    end
    
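    * gap in years until the next recorded move (missing for the last record of each id)
    bysort id (age) : gen toexpand = age[_n+1] - age
    * one copy per year in the gap; expand treats missing or values below 1 as 1
    expand toexpand
    * number the copies consecutively, starting from the original age
    bysort id age : replace age = age[_n-1] + 1 if _n > 1
    drop toexpand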
    
    list, sepby(id)
    
         +---------------------+
         | id   age   destin~n |
         |---------------------|
      1. |  1    12    village |
      2. |  1    13       city |
      3. |  1    14       city |
      4. |  1    15       city |
         |---------------------|
      5. |  2    14    village |
      6. |  2    15    village |
      7. |  2    16    village |
      8. |  2    17    village |
      9. |  2    18    village |
     10. |  2    19    village |
     11. |  2    20    village |
     12. |  2    21    village |
     13. |  2    22    village |
     14. |  2    23    village |
     15. |  2    24    village |
     16. |  2    25    village |
     17. |  2    26    village |
     18. |  2    27    village |
     19. |  2    28    village |
     20. |  2    29    village |
     21. |  2    30    village |
     22. |  2    31    village |
     23. |  2    32    village |
     24. |  2    33    village |
     25. |  2    34    village |
     26. |  2    35    village |
     27. |  2    36    village |
     28. |  2    37    village |
     29. |  2    38    village |
     30. |  2    39    village |
     31. |  2    40       city |
         |---------------------|
     32. |  3    15       city |
     33. |  3    16       city |
     34. |  3    17       city |
     35. |  3    18       city |
     36. |  3    19       city |
     37. |  3    20       city |
         |---------------------|
     38. |  4    35    village |
         +---------------------+

    • #3
      Thank you for your response, Mr. Nick.

      But I want the starting age for each id to be 12 years old and the last one to be 40 years old. So there will be missing values at the ages when they did not migrate.
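
      One possible way to get such a layout, offered only as a minimal sketch: build a skeleton with one row per id and per age from 12 to 40, then merge the observed records onto it. Ages without a recorded move keep an empty destination, so nothing is extrapolated. The tempfile name skeleton is arbitrary, and the sketch assumes that every observed age lies between 12 and 40 and that id and age together identify a row.

```stata
clear
input byte(id age) str7 destination
1 12 "village"
1 13 "city"
1 15 "city"
2 14 "village"
2 35 "village"
2 40 "city"
3 15 "city"
3 20 "city"
4 35 "village"
end

* skeleton: one row per id and per age 12..40 (29 ages)
preserve
keep id
duplicates drop
expand 29
bysort id : gen byte age = 11 + _n
tempfile skeleton
save `skeleton'
restore

* attach the skeleton; ages without a record get an empty destination
merge 1:1 id age using `skeleton', nogenerate
sort id age

list if id == 1, sepby(id)
```

      Whether the empty ages should then be filled, and how, is a separate decision.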

      • #4
        That entails extrapolation whenever the data don't stretch that far. It's one thing to interpolate; extrapolation is harder to defend.

        Sorry, but I won't suggest code for what doesn't seem to me to be a good idea in this case.

        • #5
          Thank you for your explanation, Mr. Nick. Well, here is the case:

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str9 pidlink byte(d agemg1 typedest1) float(yrmg1 motif1 TTdest1) byte typeasal1 float(asal1 tipemig1)
          "001220001" 12 33 2 1988 0 1200200 0 1200150 2
          "001220001" 13 35 0 1990 2 1200150 0 1200150 1
          "001220001" 14  . .    . .       . .       . .
          "001220001" 15  . .    . .       . .       . .
          "001220001" 16  . .    . .       . .       . .
          "001220001" 17  . .    . .       . .       . .
          "001220001" 18  . .    . .       . .       . .
          "001220001" 19  . .    . .       . .       . .
          "001220001" 20  . .    . .       . .       . .
          "001220001" 21  . .    . .       . .       . .
          "001220001" 22  . .    . .       . .       . .
          "001220001" 23  . .    . .       . .       . .
          "001220001" 24  . .    . .       . .       . .
          "001220001" 25  . .    . .       . .       . .
          "001220001" 26  . .    . .       . .       . .
          "001220001" 27  . .    . .       . .       . .
          "001220001" 28  . .    . .       . .       . .
          "001220001" 29  . .    . .       . .       . .
          "001220001" 30  . .    . .       . .       . .
          "001220001" 31  . .    . .       . .       . .
          "001220001" 32  . .    . .       . .       . .
          "001220001" 33  . .    . .       . .       . .
          "001220001" 34  . .    . .       . .       . .
          "001220001" 35  . .    . .       . .       . .
          "001220001" 36  . .    . .       . .       . .
          "001220001" 37  . .    . .       . .       . .
          "001220001" 38  . .    . .       . .       . .
          "001220001" 39  . .    . .       . .       . .
          "001220001" 40  . .    . .       . .       . .
          "001220001" 41  . .    . .       . .       . .
          "001220001" 42  . .    . .       . .       . .
          "001220001" 43  . .    . .       . .       . .
          "001220001" 44  . .    . .       . .       . .
          "001220001" 45  . .    . .       . .       . .
          "002200001" 12  . .    . .       . .       . .
          "002200001" 13  . .    . .       . .       . .
          "002200001" 14  . .    . .       . .       . .
          "002200001" 15  . .    . .       . .       . .
          "002200001" 16  . .    . .       . .       . .
          "002200001" 17  . .    . .       . .       . .
          "002200001" 18  . .    . .       . .       . .
          "002200001" 19  . .    . .       . .       . .
          "002200001" 20  . .    . .       . .       . .
          "002200001" 21  . .    . .       . .       . .
          "002200001" 22  . .    . .       . .       . .
          "002200001" 23  . .    . .       . .       . .
          "002200001" 24  . .    . .       . .       . .
          "002200001" 25  . .    . .       . .       . .
          "002200001" 26  . .    . .       . .       . .
          "002200001" 27  . .    . .       . .       . .
          "002200001" 28 19 0 1979 1 1212320 1 1200120 2
          "002200001" 29 22 0 1982 0 1200120 1 1200120 1
          "002200001" 30  . .    . .       . .       . .
          "002200001" 31  . .    . .       . .       . .
          "002200001" 32  . .    . .       . .       . .
          "002200001" 33  . .    . .       . .       . .
          "002200001" 34  . .    . .       . .       . .
          "002200001" 35  . .    . .       . .       . .
          "002200001" 36  . .    . .       . .       . .
          "002200001" 37  . .    . .       . .       . .
          "002200001" 38  . .    . .       . .       . .
          "002200001" 39  . .    . .       . .       . .
          "002200001" 40  . .    . .       . .       . .
          "002200001" 41  . .    . .       . .       . .
          "002200001" 42  . .    . .       . .       . .
          "002200001" 43  . .    . .       . .       . .
          "002200001" 44  . .    . .       . .       . .
          "002200001" 45  . .    . .       . .       . .
          end
          I want to fill in the missing values in the rows where the value of variable "d" equals agemg1. For example, the values in the first row (variables typedest1 through tipemig1) should move to the row where d is 33, and the first row (d = 12) should become missing, because d and agemg1 do not match there. In short, I want to relocate all values to the row where "d" equals "agemg1". What should I do in this situation?

          Thank you
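
          The relocation described above could be sketched as a merge: treat the rows that carry a migration record as an events file keyed by agemg1, then attach them to a skeleton of one row per pidlink and per d. This is only a sketch under assumptions: it keeps just typedest1 and yrmg1 for brevity (the other variables would travel the same way), it assumes agemg1 is unique within pidlink and lies between 12 and 45, and the tempfile name events is arbitrary.

```stata
clear
input str9 pidlink byte(d agemg1 typedest1) float yrmg1
"001220001" 12 33 2 1988
"001220001" 13 35 0 1990
"002200001" 28 19 0 1979
"002200001" 29 22 0 1982
end

* 1. event records, re-keyed by the age at migration
drop d
gen byte d = agemg1
tempfile events
save `events'

* 2. skeleton: one row per person and per age 12..45 (34 ages)
keep pidlink
duplicates drop
expand 34
bysort pidlink : gen byte d = 11 + _n

* 3. attach each event to the row where d equals agemg1
merge 1:1 pidlink d using `events', nogenerate
sort pidlink d

list if !missing(agemg1), sepby(pidlink)
```

          If some agemg1 fell outside 12..45, the merge would add extra rows rather than drop the record, so checking the range first seems advisable.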

          • #6
            See now https://www.statalist.org/forums/for...-columns-match

            Last edited by Nick Cox; 09 Nov 2022, 01:54.

            • #7
              If the migration status did not change, you copy the status from year to year. Normally I do this using -carryforward- from SSC, but there might be better or more native solutions. -carryforward- takes care that it stops when a nonmissing value occurs.
              When data are clearly missing because you have no information about the migration status, I would insert a new category "migration status missing" for the respective year. But of course, in sequence analysis you will then likely get persons grouped together because they share the same patterns of missing status information.
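
              A native alternative to -carryforward- might look like the sketch below. It uses a small made-up dataset (not the poster's data), copies the last known status forward within each id, and then gives the ages before the first recorded move their own state. The state name "missing" is an assumption chosen for illustration.

```stata
clear
input byte(id age) str7 destination
1 12 "village"
1 13 ""
1 14 ""
1 15 "city"
2 12 ""
2 13 ""
2 14 "village"
2 15 ""
end

* carry the last known status forward within each id;
* replace works row by row, so filled values cascade down
bysort id (age) : replace destination = destination[_n-1] if destination == "" & _n > 1

* ages before the first recorded move remain unknown: use a separate state
replace destination = "missing" if destination == ""

list, sepby(id)
```

              Rows that already hold a status are never overwritten, so the copying stops at the next observed value.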

              • #8
                Thank you for your kind responses, Mr. Nick Cox and Mr. Marc Kaulisch.
