  • gen new variable

    hey,

    I'm using panel data (waves 2-8) and a dummy variable, ehc28p1, which indicates whether the anchor person still lives in the same flat as in the previous wave (1) or has moved (0). The question was first asked in wave 3.

    Code:
    ehc28p1 -- Jetzt leben in Wohnung Vorwelle [Lebensmittelpunkt] (EHC)
    --------------------------------------------------------------
                     |      Freq.    Percent      Valid       Cum.
    -----------------+--------------------------------------------
    Valid   0 0 Nein |        100       1.89       2.11       2.11
            1 1 Ja   |       4645      87.97      97.89     100.00
            Total    |       4745      89.87     100.00
    Missing .        |        535      10.13
    Total            |       5280     100.00
    --------------------------------------------------------------

    I'd like to include the anchor's move as an independent variable in the fixed-effects model.

    My question: how do I generate this new variable?


    Thank you!

  • #2
    Guest:
    you may want to try something along the following lines:
    Code:
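    use http://www.stata-press.com/data/r14/nlswork.dta
    * flag = 1 if msp equals its value in the person's previous observation
    bysort idcode (year): g flag=1 if msp[_n]==msp[_n-1] & msp!=.
    * flag = 0 otherwise; this includes each person's first observation, where msp[_n-1] is missing
    bysort idcode: replace flag=0 if msp[_n]!=msp[_n-1] & msp!=.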
    Last edited by sladmin; 28 Jan 2019, 09:21. Reason: anonymize original poster
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Originally posted by Carlo Lazzaro
      Guest:
      you may want to try something along the following lines:
      Code:
      use http://www.stata-press.com/data/r14/nlswork.dta
      bysort idcode (year): g flag=1 if msp[_n]==msp[_n-1] & msp!=.
      bysort idcode: replace flag=0 if msp[_n]!=msp[_n-1] & msp!=.
      thank you so much!

      I tried it:

      Code:
      -----------------------------------------------------------
                    |      Freq.    Percent      Valid       Cum.
      --------------+--------------------------------------------
      Valid   0     |       7721      27.06      27.07      27.07
              1     |      20797      72.88      72.93     100.00
              Total |      28518      99.94     100.00
      Missing .     |         16       0.06
      Total         |      28534     100.00
      -----------------------------------------------------------

      So 1 means that it is still the same spouse as before and 0 that it is not, right?

      For my data it's:

      Code:
      bysort id (wave): g move=1 if ehc28p1[_n]==ehc28p1[_n-1] & ehc28p1!=.
      bysort id: replace move=0 if ehc28p1[_n]!=ehc28p1[_n-1] & ehc28p1!=.
      Code:
      move
      -----------------------------------------------------------
                    |      Freq.    Percent      Valid       Cum.
      --------------+--------------------------------------------
      Valid   0     |       1550      29.36      32.67      32.67
              1     |       3195      60.51      67.33     100.00
              Total |       4745      89.87     100.00
      Missing .     |        535      10.13
      Total         |       5280     100.00
      -----------------------------------------------------------

      I think this is exactly what I wanted!
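
      A possible shortcut, sketched under the assumption stated in #1 that ehc28p1 == 1 means the anchor still lives in the same flat as in the previous wave: in that case a move indicator is simply the reverse of ehc28p1, whereas comparing ehc28p1 with its previous value flags a change in the answer between waves. The toy data and the names move_direct and move_lbl are made up for illustration.

      ```stata
      * Toy data (hypothetical), same layout as the panel described in #1
      clear
      input long id byte(wave ehc28p1)
      1 2 .
      1 3 1
      1 4 0
      1 5 1
      2 2 .
      2 3 0
      end

      * Assumption: ehc28p1 == 1 "same flat as previous wave", 0 "moved"
      * move_direct is 1 in the wave a move is reported, 0 otherwise,
      * and stays missing wherever ehc28p1 is missing (e.g. wave 2)
      generate byte move_direct = 1 - ehc28p1 if !missing(ehc28p1)

      label define move_lbl 0 "no move" 1 "moved"
      label values move_direct move_lbl

      * Cross-check the recode against the original variable
      tabulate ehc28p1 move_direct, missing
      ```

      Whether this or the lag comparison is the right definition depends on what "moved" should mean in the model, so the cross-tabulation is worth checking against the questionnaire.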


      Last edited by sladmin; 28 Jan 2019, 09:21. Reason: anonymize original poster

      • #4
        Guest:
        your interpretation is correct:
        "...1 means that it is still the same spouse as before and 0 that it is not..."
        Your code tweaking looks correct as well.
        Missing data are always annoying: it is up to you (according to the research protocol) to decide whether to leave them as they are or to impute them via -mi-, if feasible.
        Last edited by sladmin; 28 Jan 2019, 09:22. Reason: anonymize original poster
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Originally posted by Carlo Lazzaro
          Missing data are always annoying: it is up to you (according to the research protocol) to decide whether to leave them as they are or to impute them via -mi-, if feasible.
          Before I decided that I needed the 'move' variable, I had already recoded the missing values, before generating the other new variables.
          But because the question behind "ehc28p1" (the move) was first asked in wave 3, Stata deleted all my wave 2 cases.
          That's the next problem I have to solve. Do you have an idea?

          • #6
            Guest:
            I don't quite follow what you did and how Stata reacted.
            Would you mind providing the list with further details? Thanks.
            Last edited by sladmin; 28 Jan 2019, 09:22. Reason: anonymize original poster
            Kind regards,
            Carlo
            (Stata 19.0)

            • #7
              Originally posted by Carlo Lazzaro
              Guest:
              I don't quite follow what you did and how Stata reacted.
              Would you mind providing the list with further details? Thanks.
              Yes, sure. Maybe it's clearer like this:

              Code:
              ID   WAVE   SEX   ehc28p1
              1    2      w     -
              1    3      w     1
              1    4      w     1
              1    5      w     1
              1    6      w     1
              1    7      w     0
              1    8      w     1
              2    2      m     -
              2    3      m     0
              3    2      w     -
              3    3      w     1
              ehc28p1 was asked for the first time in wave 3. It gives me information about the previous year (still in the same flat as last year or not). Because there is no information for wave 2, it is (of course) set to missing there.

              Before I run the FE models I recode the missing values like this:

              Code:
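              * recode the negative missing codes (-99 to -1) to the extended missing value .a
              mvdecode _all, mv(-99/-1=.a)
              * count missing values across ALL variables in each row
              egen missings = rowmiss(_all)
              * keep complete rows only; every wave 2 row is dropped because ehc28p1 is missing there
              keep if missings == 0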
              Now wave 2 is deleted completely, because I told Stata to do so. I don't know how to handle this.
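
              One way to see why wave 2 disappears, and a possible alternative, sketched on made-up toy data: count missing values only across the variables that must be complete, and flag the sample rather than deleting rows. The outcome y and the names nmiss_all, nmiss_core and insample are hypothetical.

              ```stata
              * Toy data (hypothetical): y observed in all waves, ehc28p1 missing in wave 2
              clear
              input long id byte(wave ehc28p1) float y
              1 2 . 2.1
              1 3 1 2.4
              1 4 0 2.2
              2 2 . 1.9
              2 3 1 2.0
              end

              * rowmiss(_all) counts the missing ehc28p1, so every wave 2 row has nmiss_all > 0
              egen byte nmiss_all = rowmiss(_all)

              * Restricting the count to the variables that must be complete keeps wave 2
              egen byte nmiss_core = rowmiss(y)

              * Flag the usable rows instead of dropping the others
              generate byte insample = (nmiss_core == 0)
              tabulate wave insample
              ```

              Wave 2 then stays in the dataset for any model that does not use ehc28p1; a model that does include it (or a variable derived from it) will still leave out the wave 2 rows at estimation time.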
              Last edited by sladmin; 28 Jan 2019, 09:22. Reason: anonymize original poster

              • #8
                Guest:
                two remarks about your last post:
                - retrieve the original dataset and apply the code you tweaked to your needs in #3;
                - otherwise, you have to deal with a case of left-censoring: that is, you do not know what happened to the move status before wave 3.
                If your research protocol states that the question was asked from the 3rd wave onwards, no problem.
                Otherwise, you can present your results based on your current version of the dataset (that is, no information on the move status before wave 3) and perform a sort of sensitivity analysis on them, filling in the missing data via -mi- (if feasible).
                Last edited by sladmin; 28 Jan 2019, 09:22. Reason: anonymize original poster
                Kind regards,
                Carlo
                (Stata 19.0)

                • #9
                  Just an additional comment, after Carlo's very helpful replies. To avoid such a situation (having "completely deleted" the data from a given wave), Guest may wish to start with -preserve-, then run the estimations, and finally -restore- the pristine dataset. An alternative is to use the "if" qualifier.
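
                  Both suggestions could look roughly like this. It is only a sketch on the nlswork data used in #2; the model (ln_wage on age and msp) and the year cutoff are arbitrary stand-ins, not Guest's actual specification.

                  ```stata
                  use http://www.stata-press.com/data/r14/nlswork.dta, clear
                  xtset idcode year

                  * Option 1: preserve, subset, estimate, then restore the full data
                  preserve
                  keep if year >= 70
                  xtreg ln_wage age msp, fe
                  restore

                  * Option 2: leave the data untouched and restrict the sample with -if-
                  xtreg ln_wage age msp if year >= 70, fe

                  * Either way, all observations are still in memory afterwards
                  count
                  ```

                  Note that -xtreg- already leaves out observations with missing values in the variables of the model, so deleting incomplete rows beforehand is usually not needed just to get the estimation to run.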
                  Last edited by sladmin; 28 Jan 2019, 09:22. Reason: anonymize original poster
                  Best regards,

                  Marcos

                  • #10
                    Thanks for your advice. I'll try again to explain it better:

                    The variable ehc28p1 was asked for the first time in wave 3. It holds information about the wave before, so it is clear why it is missing in wave 2. What can I do so that I don't have to drop wave 2? Is this a case of left-censoring? Dropping the wave would reduce my number of cases, so I'd like to keep it for my xtreg, fe.

                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input long(id cid) str9 str_cid byte(wave ehc28p1)
                      111000   111203 "000111203" 2 .
                      111000   111203 "000111203" 3 0
                      111000   111203 "000111203" 4 1
                      111000   111203 "000111203" 5 1
                      111000   111203 "000111203" 6 1
                      111000   111203 "000111203" 7 1
                      111000   111203 "000111203" 8 1
                      907000   907201 "000907201" 8 1
                     1300000  1300202 "001300202" 2 .
                     1300000  1300202 "001300202" 3 1
                     1300000  1300202 "001300202" 4 1
                     1624000  1624201 "001624201" 8 1
                     2767000  2767201 "002767201" 2 .
                     2767000  2767201 "002767201" 3 1
                     2767000  2767201 "002767201" 4 1
                     3491000  3491201 "003491201" 8 1
                     3902000  3902201 "003902201" 5 1
                     3902000  3902201 "003902201" 6 1
                     3902000  3902201 "003902201" 7 1
                     3902000  3902201 "003902201" 8 1
                     4814000  4814203 "004814203" 5 1
                     4835000  4835201 "004835201" 2 .
                     4835000  4835201 "004835201" 3 1
                     4835000  4835201 "004835201" 4 1
                     4835000  4835201 "004835201" 5 1
                     4858000  4858201 "004858201" 4 1
                     4858000  4858201 "004858201" 5 1
                     4858000  4858201 "004858201" 6 1
                     4858000  4858201 "004858201" 7 1
                     4858000  4858201 "004858201" 8 1
                     5780000  5780204 "005780204" 2 .
                     6151000  6151201 "006151201" 5 1
                     6151000  6151201 "006151201" 6 1
                     6151000  6151201 "006151201" 7 1
                     6151000  6151201 "006151201" 8 1
                     6519000  6519201 "006519201" 4 1
                     6519000  6519201 "006519201" 5 1
                     6519000  6519201 "006519201" 6 1
                     6519000  6519201 "006519201" 7 1
                     6519000  6519201 "006519201" 8 1
                    end
                    label values wave WAVE_prt2
                    label def WAVE_prt2 2 "2 2009/10", modify
                    label values ehc28p1 liste160a_ac3
                    label def liste160a_ac3 0 "0 Nein", modify
                    label def liste160a_ac3 1 "1 Ja", modify
                    label var id "Personennummer Anker"
                    label var cid "Personennummer Kind"
                    label var str_cid "Personennummer Kind"
                    label var wave "Erhebungsjahr"
                    label var ehc28p1 "Jetzt leben in Wohnung Vorwelle [Lebensmittelpunkt] (EHC)"

                    Furthermore, I need this variable to generate a variable which says whether the person moved or not. This is how I generated the variable:
                    Code:
                    bysort id (wave): g move=0 if ehc28p1[_n]==ehc28p1[_n-1] & ehc28p1!=.
                    bysort id: replace move=1 if ehc28p1[_n]!=ehc28p1[_n-1] & ehc28p1!=.
                    
                    label variable move "moved house"
                    label define move 0 "no move"
                    label values move move
                    Thanks in advance

                    Guest

                    I'm using Stata 14.2
                    Last edited by sladmin; 28 Jan 2019, 09:23. Reason: anonymize original poster

                    • #11
                      Guest:
                      - I see your case as an example of left-censored data;
                      - it would reduce the number of observations in your panel data regression model;
                      - perhaps you can consider -mi- or -ipolate- to deal with missing values.
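
                      For the -ipolate- route, a minimal sketch on made-up toy data follows; ehc28p1_ip is a hypothetical name. Because the gap sits before the first observed wave, plain interpolation leaves it empty and the epolate option is needed to extrapolate.

                      ```stata
                      * Toy data (hypothetical): ehc28p1 missing in wave 2 for every person
                      clear
                      input long id byte(wave ehc28p1)
                      1 2 .
                      1 3 0
                      1 4 1
                      1 5 1
                      2 2 .
                      2 3 1
                      2 4 1
                      end

                      * Linear interpolation of ehc28p1 on wave within person;
                      * epolate additionally extrapolates beyond the observed range
                      bysort id (wave): ipolate ehc28p1 wave, generate(ehc28p1_ip) epolate

                      list id wave ehc28p1 ehc28p1_ip, sepby(id)
                      ```

                      Linear extrapolation of a 0/1 dummy can produce values outside 0 and 1, and it assumes a trend that the data may not support, so the filled-in wave 2 values would need to be inspected (and justified) before they go into a model.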
                      Last edited by sladmin; 28 Jan 2019, 09:23. Reason: anonymize original poster
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      • #12
                        Originally posted by Carlo Lazzaro
                        Guest:
                        - I see your case as an example of left-censored data;
                        - it would reduce the number of observations in your panel data regression model;
                        - perhaps you can consider -mi- or -ipolate- to deal with missing values.
                        Thank you! I decided to drop these observations and use waves 3-8. There are lots of other problems I have to deal with. I'll mention the left-censoring problem in the discussion.
                        Last edited by sladmin; 28 Jan 2019, 09:23. Reason: anonymize original poster

                        • #13
                          Guest:
                          you should also give the reviewer/supervisor/professor sound justifications for focusing on waves 3-8 only, as you cannot assume a priori that missingness is not informative.
                          Last edited by sladmin; 28 Jan 2019, 09:23. Reason: anonymize original poster
                          Kind regards,
                          Carlo
                          (Stata 19.0)

                          • #14
                            Originally posted by Carlo Lazzaro
                            Guest:
                            you should also give the reviewer/supervisor/professor sound justifications for focusing on waves 3-8 only, as you cannot assume a priori that missingness is not informative.
                            Yes, I know. I can't ask my professor what to do before I have to hand in my master's thesis. I'd like to "generate" the left-censored data somehow. Do you know how this works? I don't understand the instructions I found online, or rather I don't know which of them is right for my data.

                            Thanks again!
                            Last edited by sladmin; 28 Jan 2019, 09:23. Reason: anonymize original poster

                            • #15
                              Guest:
                              see my reply at: https://www.statalist.org/forums/for...tem-is-missing
                              Last edited by sladmin; 28 Jan 2019, 09:24. Reason: anonymize original poster
                              Kind regards,
                              Carlo
                              (Stata 19.0)
