My take on this is that it is bad enough that you have these censored observations to start with. It may be that categorizing them using the detection limit as cutoff is the best you can make of a bad situation. But nothing requires you to do this for the calories variable, and doing so just takes a bad situation and makes it even worse. There is no value in "homogenizing" the variables in this way. It gains you nothing and it throws away useful information.
Thank you for your reply. So, what is the best approach for dealing with censored observations? Substitution with detection limit/2? Is there any Stata tool to estimate the values of the left-censored observations?
That question does not have a general answer. It depends in detail on the specifics of the variables themselves and how they are expected to relate to the outcome in question.
In most of the work that I do, when I encounter a measure where some observations are below the limit of detection, that lower limit is usually low enough, and the range of variation of the observed detected values wide enough, that, for practical purposes, I can treat the left-censored observations as if they were 0. When I do that, I usually also do a sensitivity analysis where I set the left-censored observations at the actual detection limit. So far in my work, the results do not change to any meaningful extent between these two approaches. If they did, then I would have to look into other ways of dealing with it.
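As a rough illustration of that kind of sensitivity analysis, here is a minimal sketch in Python rather than Stata, using simulated data. Everything in it is an assumption made for the example: the detection limit of 0.2, the lognormal exposure, the linear outcome model, and all variable names. It fits the same simple regression twice, once with the censored values set to 0 and once with them set to the detection limit, and compares the slopes.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def substitute(values, fill):
    """Replace censored entries (stored as None) with a fixed value."""
    return [fill if v is None else v for v in values]

rng = random.Random(42)       # fixed seed so the sketch is repeatable
DETECTION_LIMIT = 0.2         # hypothetical limit of detection

# Simulated "true" exposure and an outcome that depends on it linearly.
true_exposure = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
outcome = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in true_exposure]

# What the lab reports: values below the limit are censored (None).
observed = [x if x >= DETECTION_LIMIT else None for x in true_exposure]
n_censored = sum(v is None for v in observed)

# Same model under the two substitution rules.
slope_zero = ols_slope(substitute(observed, 0.0), outcome)
slope_limit = ols_slope(substitute(observed, DETECTION_LIMIT), outcome)

print(f"censored observations: {n_censored} of {len(observed)}")
print(f"slope with censored values set to 0:     {slope_zero:.4f}")
print(f"slope with censored values set to limit: {slope_limit:.4f}")
```

If the two slopes differ by more than you would consider negligible in your own data, that is the signal that simple substitution is not adequate and a method that models the censoring is worth looking into.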
By the way, to be clear, I am not necessarily disagreeing with dichotomizing those variables and cutting them off at the limit of detection. I can easily imagine circumstances where that would be a very reasonable thing to do--and perhaps your circumstances are among those. What I am disagreeing with is dichotomizing the calories variable. It is not censored, there is no advantage to making it dichotomous (or, if in your special context there is, you haven't said what it might be) and it is clearly going to distort and weaken your analysis.
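To make the cost of dichotomizing an uncensored variable concrete, here is a small simulated sketch, again in Python and again resting entirely on assumptions: the calorie distribution, the outcome model, the median split, and the variable names are all invented for illustration. It compares how strongly the outcome is correlated with the continuous variable versus its high/low version.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)        # fixed seed so the sketch is repeatable

# Hypothetical continuous calories variable and an outcome related to it.
calories = [rng.gauss(2000.0, 400.0) for _ in range(2000)]
outcome = [0.01 * c + rng.gauss(0.0, 4.0) for c in calories]

# Dichotomize at the sample median: 1 = high, 0 = low.
cutoff = statistics.median(calories)
high_calories = [1.0 if c > cutoff else 0.0 for c in calories]

r_continuous = pearson_r(calories, outcome)
r_dichotomized = pearson_r(high_calories, outcome)

print(f"correlation using continuous calories:   {r_continuous:.3f}")
print(f"correlation using dichotomized calories: {r_dichotomized:.3f}")
```

In this setup the dichotomized version shows a visibly weaker association with the outcome, because everyone above the cutoff is treated as identical and everyone below it likewise.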