  • Appending a dataset over years (waves) with different variables

    Dear All,

    I have a dataset with 12 years (waves), and in each year I have several datasets, which I merge within a wave before appending all waves (I have differing household or person numbers across waves, among other reasons).

    I have tried a lot but could not overcome the problem of keeping the variables I need in a loop when some variables exist in some waves but not in others. For instance:

    foreach w in a b c d e f g h i j k l {

        // Open the individual level file
        use "${data}/`w'_indresp", clear

        // Keep the variables needed
        keep pidp `w'_hidp `w'_pno `w'_marstat `w'_qfhigh `w'_paedqf `w'_maedqf `w'_qfhigh_dv `w'_sf1 `w'_scghqa `w'_scghqb `w'_scghqc `w'_scghqd `w'_scghqe `w'_scghqf `w'_scghqg `w'_scghqh `w'_scghqi `w'_scghqj `w'_scghqk `w'_scghql `w'_scghq1_dv `w'_scghq2_dv

        // Save one file for each wave
        save "${tempdata}/`w'_indresp", replace

    }

    In this loop, `w'_paedqf, an education variable for parents, exists in, say, wave a but not in wave b, or in waves c, d and e but not in the rest. Hence, when I loop over the waves, I get an error that a variable such as b_paedqf does not exist.

    Does anyone know how I could prevent this?

    I tried the -ds- command and also a local macro, but could not solve it.

    Or, more generally, can I create a loop that takes into account that some variables exist in some waves but not in others? Some waves have variables I need, but others do not.

    Thanks

    Best Regards
    Omer

  • #2
    If you have enough core memory you could append the datasets first, and then do the -keep-.

    Otherwise, I can imagine it would be possible to use -confirm-, -capture-, a -foreach- loop and some macros, or -describe- and the extended macro function -dup-, to generate the correct -keep- statement on the fly. But with only 12 files, it is probably easier to just write 12 -keep- statements by hand.
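
    A minimal sketch of the -capture- plus -confirm- idea, for what it is worth. The toy files and their names (a_indresp_toy, b_indresp_toy and the _keep_ outputs) are made up for illustration and stand in for the real wave files; only pidp, `w'_marstat and `w'_paedqf are taken from the question.

```stata
* Toy data standing in for the real wave files (hypothetical contents)
clear
set obs 3
generate long pidp = _n
generate a_marstat = 1
generate a_paedqf  = 2          // present in wave a only
save "a_indresp_toy", replace

clear
set obs 3
generate long pidp = _n
generate b_marstat = 1          // wave b has no b_paedqf
save "b_indresp_toy", replace

foreach w in a b {
    use "`w'_indresp_toy", clear

    * Candidate variables for this wave
    local wanted pidp `w'_marstat `w'_paedqf

    * Collect only the candidates that actually exist in this wave
    local keepvars
    foreach v of local wanted {
        capture confirm variable `v', exact
        if _rc == 0 local keepvars `keepvars' `v'
    }

    keep `keepvars'
    save "`w'_indresp_keep_toy", replace
}
```

    The exact option stops -confirm- from accepting b_paedqf as an abbreviation of some longer name such as b_paedqf_dv. In the real loop, the local wanted would hold the full list from the original -keep- statement.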

    However, don't you want to rename the variables before appending, dropping the `w'_ prefix from each one? Otherwise your appended dataset will be a bunch of unrelated blocks.
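
    A rough sketch of that renaming step, assuming single-letter wave prefixes as in the question. The toy files (a_toy, b_toy and the _long outputs) and the wave variable are invented for illustration.

```stata
* Toy data standing in for two waves (hypothetical contents)
clear
set obs 2
generate long pidp = _n
generate a_marstat = 1
generate a_paedqf  = 2          // only wave a has this variable
save "a_toy", replace

clear
set obs 2
generate long pidp = _n
generate b_marstat = 3
save "b_toy", replace

foreach w in a b {
    use "`w'_toy", clear
    rename `w'_* *              // a_marstat -> marstat, and so on
    generate str1 wave = "`w'"  // record which wave each row came from
    save "`w'_toy_long", replace
}

* Stack the waves; variables absent from a wave are filled with missing
use "a_toy_long", clear
append using "b_toy_long"
```

    After the prefix is removed, the same measure lines up in one column across waves, and -append- simply leaves paedqf missing for wave b. The -rename- step would fail if a stripped name collided with an existing variable, so that is worth checking on the real files.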
    Last edited by Daniel Feenberg; 17 Nov 2023, 07:04.
