  • Selection bias or missing data at random on the outcome?

    Good morning everyone,

    Thank you in advance for any guidance.
    I'm trying to study the influence of a variable X on a variable Y (the DV). Both X and Y are binary (yes/no).
    However, in the survey, the question that represents my DV (which I will call Q2) was asked only conditionally on the answer to a previous question, which I will call Q1.
    If the answer to Q1 is "no", then Q2 is asked. If the answer is "yes", Q2 is not asked.

    The two questions are:
    Q1: Do you have a computer in your home?
    Q2: Do you have a tablet in your home?

    My question is:
    Should I treat this as selection bias, or should I instead assume that my DV is missing at random (MAR) and apply multiple imputation? I saw in a previous topic that imputing the DV is not desirable.

    Or is there another option you would advise?
    Do you have any simple references to recommend?

    I have a lot of missing observations on the DV (1500/7000), and I don't understand why both questions weren't asked of everyone, because in my opinion owning a computer does not exclude owning a tablet.

    Sorry for the length.
    Thank you for your help.
    Last edited by ABNGA MANU; 18 Oct 2019, 05:22.

  • #2
    Well, I share your puzzlement as to why they skipped Q2 if the answer to Q1 is yes. But I have difficulty conjuring up any credible story that would make this turn out MAR; I would consider it selection bias (and a rather serious one, at that).


    • #3
      Thank you very much for your answer. It's really disconcerting when you think about all the information that's lost.

      1) I'm not an expert, but I think that even if I assume MAR, I can hardly apply imputation, because observations are missing for both the DV and the IVs. It is difficult to run mi logit in this case.
      2) I will therefore treat this as a "serious" selection bias and think about how to correct for it.

      I thank you very much for taking the time to answer me and help me.
      Last edited by ABNGA MANU; 18 Oct 2019, 19:12.


      • #4
        Clyde Schechter, good morning.


        I am coming back to you regarding this selection bias problem.

        Could you, or anyone else who knows the answer, please help me with the following question?

        There is another question that comes before the first two in the survey:

        Q: Do you have any kind of electronic device?

        May I assume that those who answered "no" to this question would also have answered "no" to Q2 (Do you have a tablet?),
        since having no electronic devices at all includes having no tablet?

        Only those who answered "yes" to the question "Do you have any kind of electronic device?" were asked Q1. And only those who answered "no" to Q1 were asked Q2 (which is my DV).


        The objective would be to reduce the number of missing values in my DV.

        Is this feasible/authorized?

        thank you for your help
        have a good day
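
The fill-in rule proposed in this post can be sketched as follows. This is only an illustration of the logic, assuming answers are coded as the strings "yes" and "no" and skipped questions as None; the function name `fill_tablet` and the example respondents are hypothetical and not taken from the survey. Note that respondents who answered "yes" to Q1 stay missing, so the selection problem discussed above remains.

```python
from typing import Optional


def fill_tablet(any_device: Optional[str],
                q1_computer: Optional[str],
                q2_tablet: Optional[str]) -> Optional[str]:
    """Return the tablet answer (Q2), filling it in only when it is logically implied.

    q1_computer is accepted for completeness but never used to fill Q2:
    owning a computer says nothing certain about owning a tablet.
    """
    if q2_tablet is not None:
        # Q2 was actually asked and answered: keep the observed value.
        return q2_tablet
    if any_device == "no":
        # No electronic device at all implies no tablet (the assumption in this post).
        return "no"
    # Q2 was skipped because Q1 was "yes": the value stays missing.
    return None


# Hypothetical respondents: (any_device, q1_computer, q2_tablet)
respondents = [
    ("no", None, None),     # filtered out by the first question
    ("yes", "no", "yes"),   # asked Q2, owns a tablet
    ("yes", "no", "no"),    # asked Q2, no tablet
    ("yes", "yes", None),   # Q2 skipped, still missing
]

filled = [fill_tablet(*r) for r in respondents]
n_missing = sum(value is None for value in filled)
```

Counting `n_missing` before and after the fill shows how many observations the rule actually recovers.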



        • #5
          Originally posted by ABNGA MANU
          may I assume that those who answered "no" to the question do you have an electronic device also answered no to question 2 ( do you have a tablet?).
          I think that would qualify as a plausible assumption. However, depending on who the data is on and when it was collected, I would be mildly surprised to see a large fraction of respondents who do not own any electronic device. Therefore, either the answers to that first question might be false (e.g., clever respondents tend to recognize and try to skip filter questions that are likely to trigger follow-up questions and, hence, lengthen the interview time), or the fraction of such respondents will be too low to help you much with the number of missing values. Either way, even if you reduce the number of missing values, your results might still be badly biased. Remember: bias is a function of both the fraction of missing values and the degree of similarity between respondents and non-respondents.

          Best
          Daniel
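
The closing remark about bias can be made concrete for the simplest case, a proportion estimated from respondents only. Since the full-sample proportion is a weighted average of the respondent and non-respondent proportions, the bias of the respondent-only estimate equals the missing fraction times the difference between the two groups. The sketch below uses the 1500/7000 missing fraction from this thread; the two proportions (0.30 and 0.60) are invented purely for illustration, and the function name is hypothetical. For a logit coefficient the relationship is not this simple.

```python
def complete_case_bias(missing_fraction: float,
                       p_respondents: float,
                       p_nonrespondents: float) -> float:
    """Bias of a respondent-only proportion relative to the full-sample proportion.

    Full sample: p_all = (1 - m) * p_resp + m * p_nonresp
    Hence:       p_resp - p_all = m * (p_resp - p_nonresp)
    """
    return missing_fraction * (p_respondents - p_nonrespondents)


# Missing fraction reported in the thread; the proportions are assumptions.
m = 1500 / 7000
bias = complete_case_bias(m, p_respondents=0.30, p_nonrespondents=0.60)

# Cross-check against the definition of the full-sample proportion.
p_all = (1 - m) * 0.30 + m * 0.60
direct_bias = 0.30 - p_all
```

The bias shrinks toward zero either when few values are missing or when respondents and non-respondents are alike, which is why reducing the missing count alone may not be enough.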


          • #6
            daniel klein, thank you very much for this helpful answer!
