Dear all,
I'm writing code in which I have to copy the contents of a string variable (var1) to observations that have an empty var1 but belong to the same var2 group. A short example of the dataset can be found below (my original dataset has ~5000 observations in ~2000 groups of var2).
To do this, I wrote code that replaces an observation's var1 with the value of var1[_n+1] if var1=="", within groups of var2. However, since not all groups of var2 have the same number of observations, each pass fills at most one more empty observation per group, so I have to repeat the code several times before every observation has a var1 assigned. The code I use is shown below, after the data example.
In that code I set the loop to repeat 10 times, even though only 5 passes are needed for the example. On the original dataset, I have to run it 67 times to make all the changes needed. However, my dataset can be updated, so hard-coding 67 (or any other arbitrary value) might not do the trick in the future. What I thought would be a good solution is to get Stata to access the number of changes made on each iteration of the forvalues loop, and then, if that number is >0, repeat the loop. Do you think that is a good solution to the problem? Can you see another way out? Is there a way to code that in Stata?
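A minimal sketch of the "repeat until nothing changes" idea, assuming the same var1 and var2 as in the example. Rather than reading a change count from -replace- itself, it counts the empty values of var1 before and after each pass and stops once a pass fills none. The local macro names n_empty and n_changed are made up for illustration.

```stata
sort var2 var1

* number of empty var1 values before the first pass
quietly count if var1 == ""
local n_empty = r(N)
local n_changed = 1

* keep passing over the data until a pass fills no more empty values
while `n_changed' > 0 {
    by var2: replace var1 = var1[_n+1] if var1 == ""
    quietly count if var1 == ""
    local n_changed = `n_empty' - r(N)
    local n_empty = r(N)
}
```

Groups with no non-empty var1 at all stay empty; the loop still stops, because a pass that changes nothing sets n_changed to 0.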
In any case, thanks for your help!
Best,
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int var2 str19 var1
3385 "L9"
3385 ""
3560 "L8"
3560 ""
3666 "Poor classification"
3666 ""
3560 ""
889 "L1"
889 ""
890 "L1"
893 "L1"
893 ""
892 "L1"
892 ""
892 ""
892 ""
892 ""
892 ""
891 "L1"
891 ""
891 ""
end
Code:
sort var2 var1
forvalues i = 1/10 {
    by var2: replace var1 = var1[_n+1] if var1==""
}
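One possible way to avoid the loop altogether is sketched below. It assumes each var2 group has at most one distinct non-empty value of var1, as in the example data. Because the empty string sorts before any other string, sorting on var1 within var2 puts a non-empty value last in every group that has one, so var1[_N] can be copied into all the empty observations in a single pass.

```stata
* within each var2 group, sorted by var1, copy the last value into the empty ones
bysort var2 (var1): replace var1 = var1[_N] if var1 == ""
```

If a group held several different non-empty values, this would fill the empty observations with the last value in sort order, whereas the loop above fills them with the first non-empty value, so the two are only equivalent under the one-value-per-group assumption.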