  • problems with the command replace

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id str2 stratum float(exp sex) byte(ISCEDlevel EDUCATTAIN) float(HF10year t HF10age chISCED chptISCED ISCEDchange)
     6 "19" 0 2 1 1 1997  1 17.67 . . 6
     6 "19" 0 2 1 . 1998  2 18.67 0 0 6
     6 "19" 0 2 1 . 1999  3 19.67 0 0 6
     6 "19" 0 2 2 . 2000  4 20.67 1 1 6
     6 "19" 0 2 2 . 2001  5 21.67 0 0 6
     6 "19" 0 2 2 . 2002  6 22.67 0 0 6
     6 "19" 0 2 5 . 2003  7 23.67 3 1 6
     6 "19" 0 2 5 . 2004  8 24.67 0 0 6
     6 "19" 0 2 5 . 2005  9 25.67 0 0 6
     6 "19" 0 2 7 7 2006 10 26.67 2 1 6
     6 "19" 0 2 7 . 2007 11 27.67 0 0 6
     6 "19" 0 2 7 . 2008 12 28.67 0 0 6
     6 "19" 0 2 7 . 2009 13 29.67 0 0 6
     6 "19" 0 2 7 . 2010 14 30.67 0 0 6
     6 "19" 0 2 7 . 2011 15 31.67 0 0 6
     6 "19" 0 2 7 . 2012 16 32.67 0 0 6
     6 "19" 0 2 7 . 2013 17 33.67 0 0 6
     6 "19" 0 2 7 . 2014 18 34.67 0 0 6
     7 "99" 1 2 1 1 1995  1  16.5 . . 4
     7 "99" 1 2 1 . 1996  2  17.5 0 0 4
     7 "99" 1 2 1 . 1997  3  18.5 0 0 4
     7 "99" 1 2 1 . 1998  4  19.5 0 0 4
     7 "99" 1 2 2 . 1999  5  20.5 1 1 4
     7 "99" 1 2 2 . 2000  6  21.5 0 0 4
     7 "99" 1 2 2 . 2001  7  22.5 0 0 4
     7 "99" 1 2 2 . 2002  8  23.5 0 0 4
     7 "99" 1 2 2 . 2003  9  24.5 0 0 4
     7 "99" 1 2 2 . 2004 10  25.5 0 0 4
     7 "99" 1 2 2 . 2005 11  26.5 0 0 4
     7 "99" 1 2 5 5 2007 12  28.5 3 1 4
     7 "99" 1 2 5 . 2008 13  29.5 0 0 4
     7 "99" 1 2 5 . 2009 14  30.5 0 0 4
     7 "99" 1 2 5 . 2010 15  31.5 0 0 4
     7 "99" 1 2 5 . 2011 16  32.5 0 0 4
     7 "99" 1 2 5 . 2012 17  33.5 0 0 4
     7 "99" 1 2 5 . 2013 18  34.5 0 0 4
     7 "99" 1 2 5 . 2014 19  35.5 0 0 4
    17 "45" 0 2 1 1 2009  1 14.21 . . 0
    17 "45" 0 2 1 . 2010  2 15.21 0 0 0
    17 "45" 0 2 1 . 2011  3 16.21 0 0 0
    17 "45" 0 2 1 . 2012  4 17.21 0 0 0
    17 "45" 0 2 1 . 2013  5 18.21 0 0 0
    17 "45" 0 2 1 . 2014  6 19.21 0 0 0
    end
    Dear Statalisters,

    The dataex above is a fictitious sample from a large (N = 100 000+) observational epidemiological study using record linkage. My primary interest is educational attainment, i.e. the highest completed level of education. In the dataex, the variable of interest is named EDUCATTAIN. I have got it almost right, but the persons who have no education after O-levels (9th/10th grade) give me problems.
    The command
    replace EDUCATTAIN = 1 if ISCEDlevel[_N]==1 & t ==1
    yielded the data shown in the dataex above.
    All three persons have EDUCATTAIN == 1 at t == 1, not only those with ISCEDlevel == 1 in their last observation ([_N]).
    Where is my thinking wrong?
    Happy New Year
    Søren Nielsen

  • #2
    In the command

    Code:
    replace EDUCATTAIN = 1 if ISCEDlevel[_N]==1 & t ==1
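    Without a by: prefix, _N is the number of observations in the entire dataset, so ISCEDlevel[_N] is the value of ISCEDlevel in the very last observation of the dataset, not in each person's last observation. I imagine that you need something more like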

    Code:
    bysort id (HF10year) : replace EDUCATTAIN = 1 if ISCEDlevel[_N]==1 & t ==1
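
The difference can be seen in a minimal sketch on toy data. The variable names level, last_overall and last_in_group and all values are hypothetical and only illustrate how [_N] behaves with and without by:.

```stata
* Toy data: two persons, two observations each (hypothetical values)
clear
input byte(id t level)
1 1 1
1 2 3
2 1 1
2 2 1
end

* Without by:, level[_N] is the value in the last observation of the whole dataset
generate byte last_overall = level[_N]

* Under by:, level[_N] is the value in the last observation of each id group
bysort id (t): generate byte last_in_group = level[_N]

* last_overall is 1 everywhere; last_in_group differs between the two persons
assert last_overall == 1
assert last_in_group == 3 if id == 1
assert last_in_group == 1 if id == 2

list, sepby(id)
```

Putting the time variable in parentheses after the group variable sorts within each group without making time part of the group definition.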


    • #3
      I think the dataset is sorted, but I will try your suggestion right away.


      • #4
        When using the suggested command, Stata responds with (0 real changes)
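
A possible reading of that message, sketched on toy data with hypothetical values that mimic the dataex in #1: replace only counts observations whose value actually changes. In the dataex, the only person whose last ISCEDlevel is 1 is id 17, and that person's EDUCATTAIN is already 1 at t == 1. The helper variable lastlevel is an assumption introduced for illustration.

```stata
* Toy data mimicking the dataex: id 6 ends at level 7, id 17 ends at level 1
clear
input byte(id t ISCEDlevel EDUCATTAIN)
 6 1 1 1
 6 2 7 .
17 1 1 1
17 2 1 .
end

* lastlevel holds each person's final ISCEDlevel (hypothetical helper variable)
bysort id (t): generate byte lastlevel = ISCEDlevel[_N]

* Only id 17 qualifies, and its EDUCATTAIN at t == 1 is already 1,
* so this replace has nothing left to change
bysort id (t): replace EDUCATTAIN = 1 if lastlevel == 1 & t == 1

list, sepby(id)
```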


        • #5
          The issue is not so much whether or how the data are sorted, as what you think a reference to [_N] does in your code. I can't see that referring to the last observation in the dataset makes any sense for this kind of data.


          • #6
            I thought that [_N] in the context of bysort always referred to the last observation in each group (person), i.e. the last value of the variable t.


            • #7
              Correct, but the command you gave in #1 makes no mention of bysort.


              • #8
                Sorry, I forgot to mention that.


                • #9
                  I tried another way around the problem, which seems to solve it:

                  bysort id (HF10year): replace EDUCATTAIN = . if ISCEDlevel[_N]>1 & t==1
                  (2 real changes made, 2 to missing)
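
A result like this can be checked with assert. The sketch below uses toy data with hypothetical values, not the full dataex, and the helper variable lastlevel is an assumption introduced for illustration.

```stata
* Toy data mimicking the dataex: id 6 ends at level 7, id 17 ends at level 1
clear
input byte(id t ISCEDlevel EDUCATTAIN)
 6 1 1 1
 6 2 7 .
17 1 1 1
17 2 1 .
end

* The workaround: blank out EDUCATTAIN at t == 1 for persons whose last level exceeds 1
bysort id (t): replace EDUCATTAIN = . if ISCEDlevel[_N] > 1 & t == 1

* Check the outcome against each person's final level
bysort id (t): generate byte lastlevel = ISCEDlevel[_N]
assert missing(EDUCATTAIN) if t == 1 & lastlevel > 1
assert EDUCATTAIN == 1 if t == 1 & lastlevel == 1
```

Stata treats numeric missing as larger than any number, so the condition ISCEDlevel[_N] > 1 would also be true for a person whose last ISCEDlevel is missing.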

                  Thanks a lot for your effort


                  • #10
                    OK, but I didn't get further than noticing that your command in #1 made no obvious sense. If we focus on the data relevant to the question:


                    Code:
                    . list id t ISCEDlevel EDUCATTAIN , sepby(id)
                    
                         +-------------------------------+
                         | id    t   ISCEDl~l   EDUCAT~N |
                         |-------------------------------|
                      1. |  6    1          1          1 |
                      2. |  6    2          1          . |
                      3. |  6    3          1          . |
                      4. |  6    4          2          . |
                      5. |  6    5          2          . |
                      6. |  6    6          2          . |
                      7. |  6    7          5          . |
                      8. |  6    8          5          . |
                      9. |  6    9          5          . |
                     10. |  6   10          7          7 |
                     11. |  6   11          7          . |
                     12. |  6   12          7          . |
                     13. |  6   13          7          . |
                     14. |  6   14          7          . |
                     15. |  6   15          7          . |
                     16. |  6   16          7          . |
                     17. |  6   17          7          . |
                     18. |  6   18          7          . |
                         |-------------------------------|
                     19. |  7    1          1          1 |
                     20. |  7    2          1          . |
                     21. |  7    3          1          . |
                     22. |  7    4          1          . |
                     23. |  7    5          2          . |
                     24. |  7    6          2          . |
                     25. |  7    7          2          . |
                     26. |  7    8          2          . |
                     27. |  7    9          2          . |
                     28. |  7   10          2          . |
                     29. |  7   11          2          . |
                     30. |  7   12          5          5 |
                     31. |  7   13          5          . |
                     32. |  7   14          5          . |
                     33. |  7   15          5          . |
                     34. |  7   16          5          . |
                     35. |  7   17          5          . |
                     36. |  7   18          5          . |
                     37. |  7   19          5          . |
                         |-------------------------------|
                     38. | 17    1          1          1 |
                     39. | 17    2          1          . |
                     40. | 17    3          1          . |
                     41. | 17    4          1          . |
                     42. | 17    5          1          . |
                     43. | 17    6          1          . |
                         +-------------------------------+
                    Your puzzlement raises, to me, the question of how EDUCATTAIN was generated in the first place.


                    • #11
                      From the variable ISCEDlevel I first located changepoints, informed by your paper pr0004 (top of p. 92), and then ran
                      bysort id (HF10year): generate byte EDUCATTAIN = ISCEDlevel[_N] & chptISCED==1
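
For comparison, a minimal sketch on toy data (hypothetical values) of how Stata evaluates an expression of this form. Because == is evaluated before &, the right-hand side is a true/false expression that yields 0 or 1, whereas an if qualifier assigns the last level itself. The variable names as_written and with_if are assumptions introduced for illustration, and the sketch does not claim to reproduce how EDUCATTAIN was actually built.

```stata
* Toy data: one person, final level 7 (hypothetical values)
clear
input byte(id t ISCEDlevel chptISCED)
6 1 1 .
6 2 2 1
6 3 7 1
6 4 7 0
end

* Form used above: evaluated as ISCEDlevel[_N] & (chptISCED==1), so the result is 0 or 1
bysort id (t): generate byte as_written = ISCEDlevel[_N] & chptISCED==1

* With an if qualifier: the last level where chptISCED == 1, missing elsewhere
bysort id (t): generate byte with_if = ISCEDlevel[_N] if chptISCED==1

assert inlist(as_written, 0, 1)
assert with_if == 7 if chptISCED == 1
assert missing(with_if) if chptISCED != 1

list, sepby(id)
```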


                      • #12
                        Not sure I follow, but good that you appear to have resolved the problem.
