Shorter code for looping through consecutive variables.

Caroline Thirukumaran

Join Date: Jun 2014
Posts: 25

22 Dec 2014, 09:43

Hi,

My dataset has 25 diagnosis variables (dx1 to dx25). I use the following code to identify patients with particular conditions. Is there a way to shorten this code so that I do not have to write a line for each diagnosis variable?

Code:

forvalues i=2009/2012 {
    use "/scratch/abc/XYZ/xyz_`i'_core.dta", clear
    keep if inlist(dx1, "75611", "75612") | ///
        inlist(dx2, "75611", "75612") | ///
        inlist(dx3, "75611", "75612") | ///
        inlist(dx4, "75611", "75612") | ///
        inlist(dx5, "75611", "75612") | ///
        inlist(dx6, "75611", "75612") | ///
        inlist(dx7, "75611", "75612") | ///
        inlist(dx8, "75611", "75612") | ///
        inlist(dx9, "75611", "75612") | ///
        inlist(dx10, "75611", "75612") | ///
        inlist(dx11, "75611", "75612") | ///
        inlist(dx12, "75611", "75612") | ///
        inlist(dx13, "75611", "75612") | ///
        inlist(dx14, "75611", "75612") | ///
        inlist(dx15, "75611", "75612") | ///
        inlist(dx16, "75611", "75612") | ///
        inlist(dx17, "75611", "75612") | ///
        inlist(dx18, "75611", "75612") | ///
        inlist(dx19, "75611", "75612") | ///
        inlist(dx20, "75611", "75612") | ///
        inlist(dx21, "75611", "75612") | ///
        inlist(dx22, "75611", "75612") | ///
        inlist(dx23, "75611", "75612") | ///
        inlist(dx24, "75611", "75612") | ///
        inlist(dx25, "75611", "75612")    
    save "/scratch/abc/XYZ_`i'_pqr.dta", replace
}

I am using Stata 12 for Windows.

Thank you,
Caroline

Nick Cox

Join Date: Mar 2014
Posts: 35700

22 Dec 2014, 10:00

Code:

local myif 

forval j = 1/24 { 
    local myif `myif' inlist(dx`j', "75611", "75612") | 
}

local myif `myif' inlist(dx25, "75611", "75612")

forvalues i=2009/2012 {
     use "/scratch/abc/XYZ/xyz_`i'_core.dta", clear
     keep if `myif' 
     save "/scratch/abc/XYZ_`i'_pqr.dta", replace
}


ben earnhart

Join Date: May 2014
Posts: 1027

22 Dec 2014, 10:02

Try this:

Code:

forvalues i=2009/2012 {     
       use "/scratch/abc/XYZ/xyz_`i'_core.dta", clear           
       gen keeper=0           
       forvalues j=1/25 {             
           replace keeper=1 if inlist(dx`j', "75611", "75612")           
       }          
       keep if keeper==1          
       save "/scratch/abc/XYZ_`i'_pqr.dta", replace 
}

Last edited by ben earnhart; 22 Dec 2014, 10:04.


Caroline Thirukumaran

Join Date: Jun 2014

Posts: 25
#4

22 Dec 2014, 10:26

Thank you so much, Nick and Ben!

Joe Canner

Join Date: Mar 2014

Posts: 580
#5

22 Dec 2014, 11:48

Caroline,

Since I happen to know the data set you are using, I will offer my two cents' worth. My usual technique is the one that Ben suggests, but Nick's is intriguing because with a few tweaks it might save you some time. The NIS is very large, so it tends to take a long time to load into memory. Accordingly, if you use the "use if <condition> using <file>" syntax, you can speed the process up considerably (depending on how big a subset you are extracting). So, you can modify Nick's code as follows:

Code:

local myif

forval j = 1/24 {
    local myif `myif' inlist(dx`j', "75611", "75612") |
}

local myif `myif' inlist(dx25, "75611", "75612")

forvalues i=2009/2012 {
    use if `myif' using "/scratch/abc/XYZ/xyz_`i'_core.dta", clear
    save "/scratch/abc/XYZ_`i'_pqr.dta", replace
}

Alternatively, if you find Nick's code too mysterious, you can do the same thing with your own code:

Code:

forvalues i=2009/2012 {
    use if inlist(dx1, "75611", "75612") | ///
        inlist(dx2, "75611", "75612") | ///
        ...
        inlist(dx25, "75611", "75612") ///
        using "/scratch/abc/XYZ/xyz_`i'_core.dta", clear
    save "/scratch/abc/XYZ_`i'_pqr.dta", replace
}
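
A further variant, offered only as a sketch: if the condition starts at 0 (false), every diagnosis variable can be appended in the same way, so dx25 needs no special case. The macro name cond is an illustrative choice and does not appear elsewhere in this thread; the paths and variable names are those used in the posts. The display line shows the assembled condition, which may make the macro approach less mysterious.

```stata
* Build "0 | inlist(dx1, ...) | ... | inlist(dx25, ...)" in a single loop.
* (cond is a hypothetical macro name chosen for this sketch.)
local cond 0
forvalues j = 1/25 {
    local cond `cond' | inlist(dx`j', "75611", "75612")
}

* Compound quotes are needed because the macro itself contains double quotes.
display `"`cond'"'

forvalues i = 2009/2012 {
    * Read only the matching observations rather than the whole file.
    use if `cond' using "/scratch/abc/XYZ/xyz_`i'_core.dta", clear
    save "/scratch/abc/XYZ_`i'_pqr.dta", replace
}
```

Because 0 | x is true exactly when x is true, the leading 0 does not change which observations are selected.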

Regards,
Joe

Nick Cox

Join Date: Mar 2014

Posts: 35700
#6

22 Dec 2014, 12:05

Joe's tweak looks good. I Googled 75611 to decode the magic number, but did not hit gold.

Readers living in or visiting London might like to know that

http://countdown.tfl.gov.uk/#%7CstopCode=75611

gives information on buses arriving at Edmonton Police Station.

Joe Canner

Join Date: Mar 2014

Posts: 580
#7

22 Dec 2014, 12:22

Caroline is using ICD-9 diagnosis codes. 75611 and 75612 (actually, 756.11 and 756.12) are "Spondylolysis" and "Spondylolisthesis", respectively (spine anomalies).

Caroline Thirukumaran

Join Date: Jun 2014

Posts: 25
#8

22 Dec 2014, 12:50

Thank you, Joe. This is fantastic. Yes, it took me a while to figure out methods to extract my cohorts. -use if- was a great resource I found on the Stata website. And then I had access to some super-memory computers, so that bailed me out as well. Please feel free to share your experience / ideas / resources in managing NIS.

Nick: Yes, these are ICD-9 codes for some causes of lower back pain. They may be found as 756.11 and 756.12 on the web. The dataset eliminates the periods and stores the codes as strings.
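
Since the codes are stored as strings without the period, a dotted ICD-9 code copied from a reference can be converted with subinstr() before it is used in inlist(). A small sketch; the local names dotted and stored are illustrative only.

```stata
* Strip the period from a dotted ICD-9 code so it matches the stored strings.
local dotted "756.11"
local stored = subinstr("`dotted'", ".", "", .)
display "`stored'"
```

The final argument . asks subinstr() to replace every occurrence, so this displays 75611.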
