
  • How to keep ONLY individuals who respond every year in household panel data

    Hello everyone,

    I am new to Stata and, as part of my bachelor's thesis, I have to run a fixed-effects regression on three waves of a household panel survey. I started with three separate cross-sectional data sets, one per survey year, and appended all three waves into a single data set. For my regression analysis, however, I only want to keep individuals who responded to all three surveys. Four variables are of interest here: persnr (the personal identification number) and three indicators for 1985, 1990 and 1992 (0 if the respondent did not respond to the questionnaire in that year, 1 if he or she did).
    Which command or function allows me to drop respondents who did not answer all three questionnaires?
    I would appreciate any help or advice that you can give me on this matter. Thank you very much.

    Regards,
    Daisy

  • #2
    Your description of the structure of your data is unclear to me (and, I suspect, to others), particularly with respect to how the year indicators are coded. Please open the Statalist FAQ (black button at the top of the screen) and read, in particular, the section that describes how to use the -dataex- command to post an example of your data set.



    • #3
      I also think that you messed up the explanation a bit, but I guess something like this would do the trick (if I understood the structure of your problem correctly).

      Say the dummy variable indicating whether the person answered in a given year is called 'answered':

      egen totalanswers = total(answered), by(persnr)

      keep if totalanswers==3



      • #4
        Thank you for your response Mr. Lacy.
        Here is an example of a small part of my data:
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float makeid byte(x1110585 x1110590 x1110592)
         1 1 . .
         1 . . 1
         1 . 1 .
         2 . 1 .
         2 1 . .
         2 . . 1
         3 . 1 .
         3 1 . .
         4 . . 1
         4 . 1 .
         4 1 . .
         5 . . 1
         5 . 1 .
         5 1 . .
         6 1 . .
         6 . . 1
         6 . 1 .
         7 1 . .
         8 . 1 .
         8 . . 1
         8 1 . .
         9 1 . .
         9 . . 1
         9 . 1 .
        end
        Information about the variables:
        makeid: personal identification number
        x1110585: Individual responded to survey in 1985. 0 for no, 1 for yes.
        x1110590: Individual responded to survey in 1990. 0 for no, 1 for yes.
        x1110592: Individual responded to survey in 1992. 0 for no, 1 for yes.

        As you can see, Individual 1, for example, appears three times, meaning he or she responded in all three survey years. Individual 3 and Individual 7, on the other hand, responded only twice and once, respectively. I would like to use only individuals who responded in all three years ('85, '90 and '92).

        Regards,

        Daisy



        • #5
          Originally posted by Joro Kolev
          I also think that you messed up the explanation a bit, but I guess something like this would do the trick (if I understood the structure of your problem correctly).

          Say the dummy variable indicating whether the person answered in a given year is called 'answered':

          egen totalanswers = total(answered), by(persnr)

          keep if totalanswers==3
          This is a brilliant solution. I just tried it and it works. Thank you very much for your help!
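
          For anyone trying this on the example data in #4: that data has no single 'answered' variable, and the identifier is makeid rather than persnr. A minimal sketch of the same idea, assuming the -dataex- example is in memory and each person has at most one row per wave, might look like the following. The names answered and totalanswers are illustrative and not part of the original data.

          ```stata
          * Sketch only: assumes the -dataex- example from #4 has been loaded.
          * answered and totalanswers are illustrative names.

          * Combine the three wave indicators into one 0/1 variable per row.
          * rowtotal() treats missing values as 0.
          egen answered = rowtotal(x1110585 x1110590 x1110592)

          * Count the number of waves answered by each person.
          egen totalanswers = total(answered), by(makeid)

          * Keep only persons who answered in all three waves.
          keep if totalanswers == 3

          * Optional check: each remaining person should have exactly three rows.
          bysort makeid: assert _N == 3
          ```

          On the 24 example rows this should drop Individuals 3 and 7 and leave 21 rows. If a person could have duplicate rows within a wave, the total would no longer equal the number of waves answered, so that assumption is worth checking first.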
