  • Effect of missing data in Logistic regression

    Hi everyone

    I am currently using logistic regression for my analysis. NHANES (the data set I am using) recommends that if a variable is missing fewer than 10% of its values, it is usually acceptable to carry on with the analysis without any further adjustment. Each of my predictor variables meets this criterion individually: respondents with missing data plus those who reported "don't know" amount to less than 10% for each variable. However, including all the predictors in the multiple logistic regression model results in more than 10% of observations being lost (see the number of observations in the unadjusted and adjusted models below). Is it possible that these missing values altered the statistical significance of the final model? If so, how do I resolve this, given that I am using survey data?
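The gap between per-variable and model-level missingness follows from listwise (complete-case) deletion: an observation is dropped if any one predictor is missing. A minimal sketch of that arithmetic, using made-up data rather than NHANES, with hypothetical predictor names `x1` to `x4` and a deliberately extreme pattern in which no respondent is missing more than one variable:

```python
def share_missing(rows, columns):
    """Fraction of rows with a missing value (None) in at least one of `columns`."""
    incomplete = sum(1 for row in rows if any(row[c] is None for c in columns))
    return incomplete / len(rows)


# Hypothetical predictors; the data are synthetic, not NHANES.
predictors = ["x1", "x2", "x3", "x4"]
n = 100

# Each predictor is missing for a different block of 8 respondents,
# so the missing sets do not overlap (the worst case for listwise deletion).
rows = []
for i in range(n):
    row = {}
    for k, name in enumerate(predictors):
        row[name] = None if 8 * k <= i < 8 * (k + 1) else 1.0
    rows.append(row)

per_variable = {name: share_missing(rows, [name]) for name in predictors}
listwise = share_missing(rows, predictors)

print(per_variable)  # every predictor is missing for 8% of rows
print(listwise)      # 0.32: share of rows dropped by a complete-case model
```

When the missing values overlap across variables, the combined loss is smaller, but it can never be less than the largest single-variable share.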




    Secondly, my secondary outcome variable has up to 1098 respondents coded as missing (that is, "dot" in the survey data, with little or no information about the nature of the missingness) out of 7765 participants, which is about 14% and therefore more than 10%. Since the outcome variable is based on a cut-off value, how do I manage this?

  • #2
    Chinonso:
    I cannot offer advice different from what I provided in my previous reply to your very same question (please see https://www.statalist.org/forums/for...gression-model).
    I would only add that, unfortunately, it is up to the researcher (and not the database) to investigate the mechanism, pattern and ignorability of the missing values and to decide which strategy to adopt to deal with them, according to her/his findings and the literature.
    Finally, both the unadjusted and the adjusted models are barely visible; please post what you typed and what Stata gave you back via CODE delimiters (as per FAQ). Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo:

      Many thanks for your reply. Please, how can one use multiple imputation to handle missing values in survey data? Also, how do I locate the CODE delimiters, as I have not used them before?
