Dear Statalisters,
I have a large dataset that I'm trying to split into several much smaller datasets. For this I use a number of foreach loops in combination with the collapse and reshape commands.
My code looks like this:
Code:
clear
use country22
*split dataset by setting (1=community; 2=health care)
preserve
foreach i of num 1/2 {
keep if setting2 == `i'
save Austria_setting2`i',
restore, preserve
}
*split dataset by pathogen (in health care setting)
clear
use Austria_setting22
preserve
foreach i of num 1/6 {
keep if pathogen2 == `i'
save AustriaHC_pathogen2`i',
restore, preserve
}
*** Create datasets for infection nodes (including resistant and susceptible infections)
**E.Colie
clear
use AustriaHC_pathogen21
collapse (sum) incidence_split proportion rate_pop ///
incidence_pop cases_pop cases_split , ///
by(age year population_pop population_split) cw
sa "EColie_Austria_both_v0.dta", replace
*collapse sex
clear
use EColie_Austria_both_v0.dta
collapse (sum) cases_split population_split (first) population_pop, ///
by(age year) cw
sa "EColie_Austria_both_v1.dta", replace
*compute transition probabilities
clear
use EColie_Austria_both_v1.dta
gen transprob= (1-exp(-(cases_split/population_split)*1))
drop cases_split population_split population_pop
sa "EColie_Austria_both_v2.dta", replace
*export dataset
clear
use EColie_Austria_both_v2.dta
reshape wide transprob, i(age) j(year)
export delimited using "EColie_Austria_both.csv", nolabel replace
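I repeat this routine several times, which results in a relatively long do-file. The code above works fine when I run the foreach loops separately, i.e. by selecting and running the first loop:
Code: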
clear
use country22
*split dataset by setting (1=community; 2=health care)
preserve
foreach i of num 1/2 {
keep if setting2 == `i'
save Austria_setting2`i',
restore, preserve
}
And then by selecting and running the second loop:
Code:
clear
use Austria_setting22
preserve
foreach i of num 1/6 {
keep if pathogen2 == `i'
save AustriaHC_pathogen2`i',
restore, preserve
}
But when I try to run the entire do-file, I get the following error message: "already preserved".
I have a similar issue with the two foreach loops below:
Code:
clear
use AustriaHC_pathogen21
*compute total number of cases by age and pathogen
foreach x of var age {
bysort year age: egen cases_pathogen = sum(cases_split)
}
*Split by resistance
preserve
foreach i of num 1/9 {
keep if resistant2 == `i'
save AustriaHC_pathogen21_resistant2`i',
restore, preserve
}
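Here, when I run the second loop without the preserve command, the "cases_pathogen" variable calculated in the first loop appears only in the first of the 9 datasets created by the second loop. But again, when I run the loops separately (as described above), the code works fine.
I'd like to be able to run the entire do-file without having to select and run different sections of the code.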
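To show the kind of structure I am after, here is a minimal sketch of the split-by-resistance step with preserve and restore moved inside each iteration, so that no preserved copy is left over when the loop ends. It assumes that a leftover preserved copy from an earlier loop is what causes the problem, which I have not confirmed. The replace option and the use of total() instead of sum() are my additions for this sketch.

```stata
* Sketch only: preserve and restore inside each iteration,
* so nothing stays preserved once the loop has finished.
clear
use AustriaHC_pathogen21

* total number of cases by year and age
bysort year age: egen cases_pathogen = total(cases_split)

* split by resistance (codes 1 to 9, as in the loop above)
foreach i of numlist 1/9 {
    preserve
    keep if resistant2 == `i'
    * "replace" is an assumption: it overwrites files from earlier runs
    save AustriaHC_pathogen21_resistant2`i', replace
    restore
}
```

An alternative might be to keep the existing loops and add restore, not after each closing brace, which cancels the preserved copy without restoring it.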
Thanks in advance for your help.
