  • Dropping variables from multiple waves

    Hello everyone. New self-taught Stata user looking for some advice...

    Question: I have 8 waves of panel data and would like to drop a number of variables from all waves at once. I would like to use some sort of macro/loop combination so that I don't have to do it for each wave individually. How would you go about doing this?

    I thought that the best way to go about doing this was to write a foreach loop inside a global macro? Does this sound right?

    Just wanted to make sure I'm going down the right path before I spend an hour or two doing the wrong thing.

    Thanks in advance!




  • #2
    If you want to drop variables that means dropping them for all observations. No need for a loop.

    Alternatively, you really need to explain why you want to drop them and why you think it needs a loop.

    Alternatively, you may be confused about what a variable is. Note

    variable = column = field

    observation = row = case = record

    Also, point 18 in the FAQ Advice does apply.
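
    To illustrate the first point with a minimal sketch: once the waves are in a single appended dataset, one drop (or keep) removes the columns for every observation, and so for every wave. The file name hhd_all_waves and all variable names here are hypothetical, not taken from the ECHP.

    Code:
    * hypothetical appended household file covering all waves
    use hhd_all_waves, clear
    * removes these columns for all observations, hence for all waves
    drop hvar1 hvar2 hvar3
    * or name what to retain instead:
    * keep hid wave hhinc region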



    • #3
      Hi Nick, thanks for your reply.

      The dataset I'm working with is the old ECHP (8 waves) which has a number of files (personal, household, register, etc.). I have already appended all the personal files that I need, but I need to supplement that with a few variables from the household file (household income / region of residence, etc.). I want to end up with a person/period file with an additional number of variables from the household file.

      This was my plan of action:
      1. Append personal files for the 8 waves (done).
      2. Drop unnecessary variables in household files from 8 waves which will leave only the variables that I want to add to the personal file. <---- It's this step I need help with.
      3. Append the household files.
      4. Merge appended-household files with appended-personal files on HID/PID.
      5. This would give me my 8 waves of the personal file supplemented with variables from the household file.

      I could go one-by-one and just drop the unnecessary variables from each wave, but that's not very efficient. This is why I am trying to figure out how I can tell Stata to drop vars A-X in the household files from waves 1-8.

      Therefore I thought the best way to do this was: 1. Macro: tell Stata to go successively from wave to wave and 2. Foreach loop: tell it to drop vars A-X.

      So, is this not the right approach, then?
      Last edited by ACarroll; 15 May 2014, 15:59.



      • #4
        It's a while since I used the ECHP, and I can't remember the structure. Does the same variable have the same name in each wave? Or does it have a wave-specific prefix or suffix? If the variable has the same name, dropping it once is enough. If it doesn't, you ought to rename to a common name before appending. I'd also suggest merging household information before appending (since the HIDs are probably unique only within wave):

        1: for each wave, merge your chosen personal and hhd variables; rename variables to lose the wave-specific suffix/prefix, create a wave-number variable, and save
        2: append

        Also note that you can specify which variables to load as part of the use command, and as an option to the merge command.
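
        As a rough sketch of that last point for a single wave, assuming hypothetical file names (echp_ind_1, echp_hhd_1) and hypothetical household variables (hhinc, region), naming the wanted variables on load or on merge replaces a long drop list:

        Code:
        * load only the listed variables from a household file
        use hid hhinc region using echp_hhd_1, clear

        * or pull only those variables in while merging onto the personal file
        use echp_ind_1, clear
        merge m:1 hid using echp_hhd_1, keepusing(hhinc region)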



        • #5
          Hi Brendan,

          Originally posted by Brendan Halpin
          Does the same variable have the same name in each wave?
          Yes.

          Originally posted by Brendan Halpin
          I'd also suggest merging household information before appending (since the HIDs are probably unique only within wave):
          Good point. I'm not 100% sure about the HID, but I am about the PID. In that case PID is indeed the best option.

          Originally posted by Brendan Halpin
          1: for each wave, merge your chosen personal and hhd variables; rename variables to lose the wave-specific suffix/prefix, create a wave-number variable, and save
          2: append
          Thanks for the tips, I might have to do this. I was planning to do this originally, but I wanted to take this opportunity to learn a more advanced technique that would allow me to do this more efficiently since I will be doing this a lot in the future. So at this moment in time, since I am trying to learn this software, I'd like to figure out which commands would allow me to do a maneuver like I described above.
          Last edited by ACarroll; 15 May 2014, 16:20.



          • #6
            The way I would suggest using more advanced code to gain efficiency is to automate step 1 above, so that you can repeat it across waves without writing new code. For instance (untested code and making assumptions about file and variable names):

            Code:
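            forvalues wave = 1/8 {
                * load only the needed personal variables for this wave
                use pid hid income using echp_ind_`wave', clear
                * add the chosen household variables, matching on hid
                merge m:1 hid using echp_hhd_`wave', keepusing(hhdsize hnkids)
                drop _merge
                gen wave = `wave'
                save merged`wave', replace
            }

            * stack the eight merged files into one person/period file
            use merged1, clear
            forvalues wave = 2/8 {
                append using merged`wave'
            }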



            • #7
              Brendan: great. It's late here (Brussels), but I'll study this, play around with it tomorrow, and let you know how things go. Thanks for pointing me in the right direction.
