Hi everyone,
I have run into this problem several times now and would appreciate your guidance.
I have a variable in my data set called BMI (body mass index). It is supposed to be numeric, but I believe that, because of multiple data-entry errors in the Excel sheet where the data are collected, Stata read it as a string variable rather than a numeric one.
I have around 2000 individuals with a BMI value, and it is hard to correct all of them manually, so I used the following syntax:
encode BMI, gen(BMI_numric)
order BMI_numric, after(BMI)
label dir
codebook BMI BMI_numric
The problem is that there is a value label attached to the newly generated numeric variable "BMI_numric", and this value label holds the values I actually want the new variable to have.
What is happening is that the numeric variable takes the values 1, 2, 3, 4, ... 2000 in ascending serial order across the observations, instead of taking the values shown in the value label.
How can I correct this, please? And is my syntax correct to use or not?
Here are 10 observations from my data. I ran the Stata syntax, wrote the output in Excel, and then copied and pasted it here, as I did not know how to post the output with a neat display.
First: with the value label for my newly generated numeric variable, which is the way I want it to look:
list BMI BMI_numric in 1/10
BMI | BMI_num |
34.69 | 34.69 |
29 | 29 |
30.98 | 30.98 |
25.98 | 25.98 |
18 | 18 |
48.42 | 48.42 |
26.18 | 26.18 |
27.6 | 27.6 |
21.3 | 21.3 |
25.15 | 25.15 |
Second: here are the actual values of the newly generated numeric variable, without the value label, which is not what I want or need:
list BMI BMI_num in 1/10, nolabel
BMI | BMI_num |
34.69 | 509 |
29 | 329 |
30.98 | 407 |
25.98 | 216 |
18 | 15 |
48.42 | 673 |
26.18 | 223 |
27.6 | 277 |
21.3 | 67 |
25.15 | 189 |
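In case it helps anyone reproduce this, here is a minimal sketch on made-up values (not my real data). The variable names BMI_code and BMI_value and the mistyped entry "2o.5" are hypothetical, and the real() line is only my guess at a way to spot the entries that are not valid numbers:

```stata
* Toy data with made-up values; "2o.5" stands in for a hypothetical typing error
clear
input str8 BMI
"34.69"
"29"
"18"
"27.6"
"2o.5"
end

* encode assigns integer codes to the distinct strings and keeps the
* original text only as a value label (BMI_code is a hypothetical name)
encode BMI, generate(BMI_code)
list BMI BMI_code
list BMI BMI_code, nolabel

* real() returns missing when the text is not a valid number, so this
* lists the entries that probably need correcting by hand
generate double BMI_value = real(BMI)
list BMI if missing(BMI_value)
```

With the toy data, the second list shows the same kind of integer codes as in my output above, and the last list shows only the mistyped entry.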
Thank you; your guidance is highly appreciated.
Best regards