I am trying to combine three different datasets into one dataset, which will be (eventually) reshaped. All in all, the datasets contain the following variables:
Dataset 1
- var_i02
- var_i12
Dataset 2
- var_g02
Dataset 3
- var_h02
- var_h12
I'm trying to loop over the datasets and rename the variables as "var_02_datasetnumber" and "var_12_datasetnumber" so they can later be merged and reshaped. Of course, I could do this manually in a few minutes. However, although that would solve the current problem, it won't work should the number of datasets and/or variables be higher. So, I would like to learn to write code that "scales" across different numbers of datasets and variables.
Now, if every dataset included both variables, I wouldn't be asking for help. But as it turns out, looping does not work well with the missing variable in dataset 2, as there is nothing to rename. I solved this by first checking whether the variables are present in the dataset and, if so, renaming them. I first tried "confirm", but it does not work with the wildcards needed in the variable names. I substituted it with "ds", which allows wildcards and raises an error that can be caught in the loop when nothing matches. The following code works, although I need to check one variable at a time:
So, I managed to check if the variables are in the dataset and rename them. Next, I need to use the local macros to create a global macro that contains the variables that were found. This global would be used in merging, so I don't have to manually check which variables are found in which dataset. And this is where I have serious problems:
Code:
global PATH "C:\..Datasets"
foreach num in 7 8 9 {
    use "$PATH\data_w`num'.dta", clear
    capture ds var_?02
    if !_rc {
        local var1 r(varlist)
        rename ``var1'' fi_test02_w`num'
    }
    capture ds var_?12
    if !_rc {
        local var2 r(varlist)
        rename ``var2'' fi_test12_w`num'
    }
    su
    global vars`num' = "Variables!"
}
- I discovered macros quite recently and don't know how to refer to them properly, as they act in mysterious ways! For example, I have no idea why
Code:
rename ``var1'' fi_test02_w`num'
results in the variable name, but
Code:
rename `var1' fi_test02_w`num'
does not.
- This is related to the first problem. I don't know how to make the global include the variable names (or whether that is even possible). So far I have read the Stata manuals, the help files, and the Stata forum. I have prayed, cursed, yelled, and waved a stick at the sky. But with scant success. I have also tried every combination of quotes imaginable. But no! Stata stubbornly refuses to put anything other than r(varlist) in the global.
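A minimal sketch of what seems to be going on in both points, assuming toy data (a made-up `var_i02`) and a hypothetical wave number 7. `local var1 r(varlist)` stores the literal text r(varlist), so one pair of macro quotes returns that text and only the second pair evaluates it. Wrapping r(varlist) in macro quotes when the macro is defined stores the variable name itself, and the same appears to work for a global.

```stata
* Toy data standing in for one dataset (values are made up)
clear
set obs 3
generate var_i02 = _n

ds var_?02

* Stores the literal text "r(varlist)", not the variable name
local var1 r(varlist)
display "`var1'"

* Evaluates r(varlist) immediately and stores its contents
local var2 `r(varlist)'
display "`var2'"

* The same evaluation works when defining a global
global vars7 `r(varlist)'
display "$vars7"

* One pair of quotes is now enough
rename `var2' fi_test02_w7
```

Copying the name right after `ds` also matters because later r-class commands such as `summarize` replace the stored results, so r(varlist) would no longer hold the name by the time a global is defined at the end of the loop.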
So, please, help me to understand the macros and how to use them!
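The check-rename-collect steps described above could be sketched as a loop over the variable suffixes, so that each variable is no longer checked by hand. This is only a sketch under assumptions: the toy data mimics dataset 2 (only the "02" variable is present), the wave number 8 is hypothetical, and each wildcard pattern is assumed to match at most one variable, since `rename` would fail on several matches.

```stata
* Toy data standing in for dataset 2, which lacks the "12" variable
clear
set obs 3
generate var_g02 = _n

* Hypothetical wave number; in the real loop this comes from foreach
local num 8

* Start with an empty list of found variables
local found
foreach suf in 02 12 {
    capture ds var_?`suf'
    if !_rc {
        * Copy the name before another command overwrites r()
        local old `r(varlist)'
        rename `old' fi_test`suf'_w`num'
        local found `found' fi_test`suf'_w`num'
    }
}

* Store the collected new names in a global for later merging
global vars`num' `found'
macro list vars`num'
```

Collecting the names in a local first and assigning the global once at the end keeps the quoting simple; adding another suffix to the `foreach` list is then the only change needed for more variables.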
