Dear STATA users
I have a problem where STATA adds or subtracts roughly 0.000001 or 0.0000001 on a seemingly random subset of my numeric values.
I have read several forum posts about STATA adding spurious decimal places, but after several hours I have still not found a solution.
The data is entirely numeric in the format x.xx (blood sample concentration).
I have tried the following:
- Loading the variables as string using import delimited with stringcols
- Converting to numeric by either
gen hyp_cal_blood_0_num = real(blood_0_ca_uncorrected)
This yields the same problem: 19 out of 108 observations had 0.000001/0.0000001 added or subtracted.
OR
destring blood_0_ca_uncorrected cacl_peak, replace force float
For example, one value of 0.95 is converted to 0.94999999, and I cannot figure out why exactly this value was changed while others were not.
The only pattern is that a value observed multiple times is always converted in the same way, e.g. 1.18 --> 1.1799999 and 1.19 --> 1.1900001
The original data has exactly two decimal places, so it is not possible that e.g. 1.1900001 is an actual observed value. It is added/subtracted by STATA for some reason.
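A minimal sketch that seems to reproduce the symptom without my data. It assumes the variable ends up stored as float, which is my assumption and not something I have confirmed; x_float and x_double are made-up names for illustration only.

```stata
* Hypothetical one-observation dataset; x_float and x_double are illustrative names.
clear
set obs 1
generate float  x_float  = 0.95
generate double x_double = 0.95
* Show more digits than the default display format does.
format x_float x_double %12.9f
list x_float x_double
* Compare each stored value with the literal 0.95, which is evaluated in double precision.
count if x_float == 0.95
count if x_float == float(0.95)
count if x_double == 0.95
```

If the first count comes back as 0 while the other two come back as 1, the extra digits would seem to come from the float storage type rather than from the import itself.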
I need the variables to be numeric in order to subgroup the values into four groups (coded 0 to 3). I do this by:
generate hyp_cal_blood_0_0 = 0 if blood_0_ca_uncorrected<=1.32 & blood_0_ca_uncorrected<.
generate hyp_cal_blood_0_1 = 1 if blood_0_ca_uncorrected>=1.33 & blood_0_ca_uncorrected<=1.46 & blood_0_ca_uncorrected<.
generate hyp_cal_blood_0_2 = 2 if blood_0_ca_uncorrected>=1.47 & blood_0_ca_uncorrected<=2.00 & blood_0_ca_uncorrected<.
generate hyp_cal_blood_0_3 = 3 if blood_0_ca_uncorrected>=2.01 & blood_0_ca_uncorrected<.
egen float hyp_cal_blood_0 = rowtotal(hyp_cal_blood_0_0 hyp_cal_blood_0_1 hyp_cal_blood_0_2 hyp_cal_blood_0_3) if blood_0_ca_uncorrected<.
drop hyp_cal_blood_0_0 hyp_cal_blood_0_1 hyp_cal_blood_0_2 hyp_cal_blood_0_3
Consequently, an addition/subtraction can lead to an observation ending up in the wrong group - hence why this is important.
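One workaround I am considering, as a sketch only: convert to double, then compare whole hundredths stored as integers, so that the cut-offs are exact. This assumes blood_0_ca_uncorrected is still a string variable (imported with stringcols) and that hyp_cal_blood_0 does not exist yet; ca_dbl and ca_100 are hypothetical helper names. I do not know whether this is the recommended approach.

```stata
* Assumes blood_0_ca_uncorrected is still a string variable (imported with stringcols).
* ca_dbl and ca_100 are hypothetical helper names.
generate double ca_dbl = real(blood_0_ca_uncorrected)
* Concentration expressed in whole hundredths, e.g. 1.32 becomes 132.
generate int ca_100 = round(100 * ca_dbl)
* Same cut-offs as above, expressed in hundredths; missing values stay missing.
generate byte hyp_cal_blood_0 = .
replace hyp_cal_blood_0 = 0 if ca_100 <= 132
replace hyp_cal_blood_0 = 1 if inrange(ca_100, 133, 146)
replace hyp_cal_blood_0 = 2 if inrange(ca_100, 147, 200)
replace hyp_cal_blood_0 = 3 if ca_100 >= 201 & !missing(ca_100)
drop ca_dbl ca_100
```

This would replace the generate/egen/drop block above rather than run after it.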
Question:
Is there a way to avoid the addition/subtraction?
Alternatively - can I force STATA to round the value back to two decimal places?
Version
I use STATA 16.1
Kind regards,
Mikael