Hi,
I'm new here so hope I am not repeating something that has already been asked. I searched for similar posts (and have been scouring Google) but couldn't find an answer. I am a newbie to Stata, so please bear with me as I'm learning.
I have a dataset with a variable for a blood biomarker. Most of the observations are absolute concentrations of this biomarker, but some measurements were below the lower detection limit of the measurement method and were recorded as "<5" (umol/L), so the variable is a string variable. I am trying to replace "<5" with something like "-1", so that I can destring the variable and know that every "-1" was below the detection limit. I generated a new variable (var) identical to the original to test a few things out.
I tried the following but it didn't work:
replace var = "-1" if var == "<5"
(0 real changes made)
I then tried the following, but it replaced all of the "<5" values with "."
destring var, force replace
I considered trying to remove the "<" symbol from the variable, then destringing and replacing the remaining "5"s, but there are some observations that are exactly 5 umol/L, so this won't work.
Does anyone know if there is a way to replace these observations collectively, or even a way to identify all instances of "<5" within the variable easily so that I can replace these one by one before destringing? I tried the 'list' command to find each instance of "<5" appearing for var, but this command doesn't seem to work for string variables. I noticed that the following does work, and will put "-1" in at row 106.
replace var = -1 in 106
So if I can't replace these collectively, I could do it one by one if I can find out where they all are.
I hope that makes sense, apologies if anything is unclear, and thank you for any help!
Jamie
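One possible explanation for the "(0 real changes made)" result, offered as a guess rather than a diagnosis, is that the stored strings are not exactly "<5" (for example "< 5", or "<5" with a leading or trailing blank). Below is a minimal Stata sketch under that assumption. It has to be run while var is still a string variable, that is, on a fresh copy made before any destring with the force option.

```stata
* List every observation whose value contains "<", however it is padded
list var if strpos(var, "<") > 0

* Assumption: the mismatch is caused by blanks around or inside "<5".
* Trim the ends, drop internal spaces, then compare against "<5".
replace var = "-1" if subinstr(strtrim(var), " ", "", .) == "<5"

* Convert to numeric without force, so destring stops and reports
* if any non-numeric text is still present
destring var, replace

* Count the observations flagged as below the detection limit
count if var == -1
```

If the list command shows values that still differ from "<5" after trimming (a different symbol, say), the condition in the replace line would need to be adjusted to match what is actually stored.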