Hi,
Could someone kindly point out where I am going wrong in trying to identify specific values within a variable and flag them in a new variable?
I have a large CPRD Aurum dataset extract with medcodeids, but no attached "label" (dataset 1). I also have a separate .dta file with medcodeids grouped by their underlying meaning, e.g. a "chronic kidney disease (CKD)" term or a "GFR" term (dataset 2). I have taken all of the medcodeids relating to GFR from dataset 2 and am trying to identify them within dataset 1, so that I can start to group them. However, when I run the code below, GFR is set to 1 for every single observation, even where the medcodeid does NOT match any of those I have put in the code. Sorry for the terrible explanation; hopefully the code below explains it better!
An extract from dataset 1:
obs_id double(patid medcodeid) float GFR
1 1000000020274 380389013 1
2 1000000320274 976481000006110 1
3 1000000320274 976481000006110 1
4 1000000320274 976481000006110 1
5 1000000320274 380389013 1
6 1000000320274 380389013 1
7 1000000320274 380389013 1
8 1000000420274 380389013 1
9 1000000420274 380389013 1
10 1000000420274 976481000006110 1
11 1000000420274 380389013 1
12 1000000420274 976481000006110 1
13 1000000420274 976481000006110 1
14 1000000620274 380389013 1
15 1000000620274 380389013 1
16 1000000620274 380389013 1
17 1000000620274 380389013 1
18 1000000620274 282610015 1
19 1000000620274 133205018 1
20 1000000620274 304071000000115 1
To which I have applied the following code:
gen GFR=0
replace GFR=1 if medcodeid==976481000006110|1942831000006114|5451521000006118|371441000000114|12621921000006110|1854991000006119|8352981000006116|1540241000006111|133205018|8250311000006118|1942821000006111|12680441000006112|1744631000006112|8069731000006118|12621931000006112|1866321000006117|8294821000006118
As you can see, the medcodeid for obs 1 (380389013) is not found in the above list, and yet GFR is tagged as "1".
a) Could someone possibly point out my error?
b) If you had a magical way of applying the medcodeid label (i.e. "GFR"/"CKD") to dataset 1, without having to manually copy the medcodeids from dataset 2 and rewrite them into a list, that would be even better!
Thanks very much.
Jemima