Hi everyone,
I am new to the forum so please excuse possible inaccuracies. I am construction a panel data set (time period: 2003-2016) out of 4 .dta files per year by merging them via a unique ID. Each of these files contains between 20 and 60 variables, which might be overlapping across data sets. For this purpose, I would like to program a loop that checks the intersections across data sets in every year and puts the variable names of these intersections into an excel file ideally (or at least give me an output in Stata). I thought of an approach similar to the one explained in an earlier post: http://www.statalist.org/forums/foru...ing-conflict-s
My code would conclude in having:
where the macro after lists contains the names of locals with an "&" in between. These locals, in turn, contain the names of variables I need to check.
However, I now saw that macro lists is only able to process two locals or globals containing variable names. My "wrong approach" looks like this:
There is probably a much easier approach to achieve this, so if you have any idea, I would be very grateful for help!
Thank you very much in advance.
Best regards,
Carlo
I am new to the forum so please excuse possible inaccuracies. I am construction a panel data set (time period: 2003-2016) out of 4 .dta files per year by merging them via a unique ID. Each of these files contains between 20 and 60 variables, which might be overlapping across data sets. For this purpose, I would like to program a loop that checks the intersections across data sets in every year and puts the variable names of these intersections into an excel file ideally (or at least give me an output in Stata). I thought of an approach similar to the one explained in an earlier post: http://www.statalist.org/forums/foru...ing-conflict-s
My code would conclude in having:
Code:
global common: list `varlists_all'
However, I now saw that macro lists is only able to process two locals or globals containing variable names. My "wrong approach" looks like this:
Code:
foreach i of global years { global files: dir "C:\Users\Carlo\Dropbox\Master\4. Semester\Masterarbeit\Data and statistical work\raw_data\DHS wave `i'" files "*.dta" cd "C:\Users\Carlo\Dropbox\Master\4. Semester\Masterarbeit\Data and statistical work\raw_data\DHS wave `i'" * Drop duplicates if present and alo drop variables that are not needed and that are conflicting in some datasets (see excel worshet for that) local varlists_all foreach file of global files { use "`file'" duplicates drop qui ds *copy all variable names of current file j into a macro if substr("`file'",1,3)!="wei" { local suffix=substr("`file'",1,3) local vlist_`suffix'_`i'=r(varlist) local varlists_all `varlists_all' vlist_`suffix'_`i' & di "`varlists_all'" di "end of file loop" } clear } * Check in the global common which variables are common to the data set * Eliminate last "&"- sign in the list local dropper &_todrop local varlists_all `varlists_all'_todrop local varlists_all: list varlists_all-dropper di "`varlists_all'" * Problem: one can only check two files at once!! di "global common: list `varlists_all'" * HERE is the Problem: *global common: list `varlists_all' * Extract macro into excel sheet *cd "$maindirectory" *putexcel A1="Variable" A2="$common" using "20170601_conflicting_variables, sheet("`i'") modify *cd "C:\Users\Carlo\Dropbox\Master\4. Semester\Masterarbeit\Data and statistical work\raw_data\DHS wave `i'" }
Thank you very much in advance.
Best regards,
Carlo
Comment