  • Getting an error message "Outcome does not vary" when I enter a command for binary logistic regression

    Hi All,

    I'm getting an error message when I try to run a logistic model with a binary dependent variable against demographics and a sustainability index, and I don't understand why.

    "logit cl_heard_b ib0.age_categories ib1.education ib6.income gender_binary_female ib0.political_affiliation_t_other cl_addindex, or"

    I hope someone can assist.

    Teresa

  • #2
    The most common cause of this problem is that Stata expects the binary response for -logit- to be coded 0 and non-0 (typically 0 and 1). If, for example, your response variable is coded 1 and 2, Stata will not see any 0s, and it will think that all of your outcomes are "positive," i.e., non-zero.
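
    One quick way to check this is to tabulate the outcome and, if needed, recode it. The following is a minimal sketch, not output from the original poster's data: it assumes a hypothetical 1/2 coding, and outcome01 is an illustrative variable name.

    ```stata
    * Show every value the outcome takes, including missing values
    tabulate cl_heard_b, missing

    * Assumption: the outcome turned out to be coded 1/2 instead of 0/1.
    * Map 1 -> 0 and 2 -> 1 into a new variable (outcome01 is a made-up name).
    recode cl_heard_b (1 = 0) (2 = 1), generate(outcome01)

    * Confirm the mapping before using outcome01 in -logit-
    tabulate cl_heard_b outcome01, missing
    ```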


    • #3
      In addition to Mike Lacy's advice, which gives what is probably the commonest cause of this error, just bear in mind that the error message you are getting applies to the estimation sample, not the full data set. So it is possible to have your outcome properly coded as zeroes and ones, and still get this message because the estimation sample excludes a) any observation with a missing value for any of the variables mentioned in the command, and b) any observation that contains a value of an explanatory/predictor variable that perfectly predicts the outcome. Sometimes it happens that even though there are both 0 and 1 values of the outcome in the full data, after those exclusions only 0 outcome or only 1 outcome observations remain.
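
      To look at exclusion (a) directly, one can flag the observations that have complete data on every variable in the model and tabulate the outcome among them. This is only a sketch: the variable names are taken from the command in the first post, and complete is an illustrative name. Exclusion (b) is not covered here, since -logit- reports it itself.

      ```stata
      * Flag observations with no missing value on any variable in the model
      generate byte complete = !missing(cl_heard_b, age_categories, education, ///
          income, gender_binary_female, political_affiliation_t_other, cl_addindex)

      * How many observations survive the exclusion of incomplete cases?
      count if complete

      * Does the outcome still take both values among them?
      tabulate cl_heard_b if complete, missing
      ```

      If the last table shows only 0s or only 1s, missing values alone would be enough to explain the error.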


      • #4
        Thank you Mike. My response variable is coded as 0 and 1. However, if I remove the additive index variable, which is an independent variable, from the equation, it runs. I created the additive variable by combining 4 variables.


        • #5
          Thank you Clyde, I think your reason (b) may be why I'm getting the error. Would you mind explaining what "any observation that contains a value of an explanatory/predictor variable that perfectly predicts the outcome" means? I think that's the case with my additive index variable. The additive variable worked when I ran an mlogit and all its coefficients are significant.


          • #6
            Suppose you have a predictor variable, and whenever its value is > 5, the outcome variable takes on the value 1. This is an example of perfect prediction. (It is sometimes also called complete separation.) It is impossible to estimate a coefficient for such a variable in a logistic regression because the maximum likelihood estimate of the coefficient is infinity. Similar situations arise if the outcome variable always takes on the value of 0. Basically any situation in which a particular value or range of values of a predictor is associated in the sample with only 0 or only 1 outcomes fits this description.
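
            To see why the estimate is infinite, consider a simplified case (an assumption for illustration, not the poster's model): a single 0/1 predictor d with intercept \alpha and coefficient \beta, where y = 1 for every one of the n_1 observations that have d = 1. Writing \Lambda(z) = 1/(1+e^{-z}), the log likelihood and its derivative with respect to \beta are:

            ```latex
            \ell(\alpha,\beta) = \sum_{i:\,d_i=0} \bigl[ y_i \log\Lambda(\alpha) + (1-y_i)\log\bigl(1-\Lambda(\alpha)\bigr) \bigr] + n_1 \log\Lambda(\alpha+\beta)

            \frac{\partial \ell}{\partial \beta} = n_1 \bigl(1-\Lambda(\alpha+\beta)\bigr) > 0 \quad \text{for every finite } \beta
            ```

            The log likelihood keeps increasing as \beta grows, so no finite value of \beta maximizes it.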

            You don't have to try to guess whether you are encountering this situation. At the very beginning of the output, before the maximum likelihood iterations begin, Stata will print a message indicating that it has identified such a situation and explaining what corrective action (usually omission of the offending variable from the model and omission of observations having the offending value(s)) it has taken before proceeding with the analysis.
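
            A small made-up data set can illustrate the "> 5" example. This is a sketch only: x, y, and high are invented names and the values are arbitrary. The exact wording of Stata's note varies by version, so it is not reproduced here.

            ```stata
            clear
            set obs 8

            * Predictor taking the values 1 through 8
            generate byte x = _n

            * Outcome: mixed 0s and 1s for x <= 5, but always 1 for x > 5
            generate byte y = inlist(x, 2, 4, 6, 7, 8)

            * Indicator for the range of x that perfectly predicts y == 1
            generate byte high = x > 5

            * The cross-tabulation has an empty cell: no y == 0 when high == 1
            tabulate high y

            * -logit- should print a note about perfect prediction before iterating
            logit y high
            ```

            The empty cell in the cross-tabulation is the thing to look for in real data as well.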


            • #7
              Thank you Clyde, I need the index in my model. What other options do I have besides removing it? Does this mean that it's generally impossible to include composite additive index variables in a logit model?


              • #8
                "What other options do I have besides removing it? Does this mean that it's generally impossible to include composite additive index variables in a logit model?"
                There is no general problem with additive indices in a logit model. It is some particular aspect of your index or your data that is causing this problem.

                There are a few options to consider. The first is to double-check that you calculated the additive index correctly in the first place. A second possibility is to modify it: perhaps some weighting will preserve most of whatever benefit you hope to gain from using this index but damp down the perfect prediction.
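
                A rough way to carry out the first check is sketched below. It rests on assumptions not confirmed in this thread: item1 through item4 are hypothetical names for the four component variables, and the index is assumed to be their plain sum.

                ```stata
                * item1-item4 are hypothetical names for the four component variables
                egen byte n_missing = rowmiss(item1 item2 item3 item4)
                tabulate n_missing

                * Rebuild the index as a plain sum (missing if any component is missing)
                generate index_check = item1 + item2 + item3 + item4

                * Count observations where the rebuilt and stored index disagree
                count if index_check != cl_addindex

                * Look for index values at which only one outcome value occurs
                tabulate cl_addindex cl_heard_b, missing
                ```

                Disagreements concentrated in rows with missing components would suggest the stored index handles missing values differently from a plain sum.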

                The other question that comes to my mind is this: you said it worked well with -mlogit-. So why are you dichotomizing a multinomial outcome and using -logit-? You haven't shown any example data, nor the details of your analyses, but it sounds like you are taking an inherently multi-level outcome variable and discarding the more nuanced information it provides, squashing it all into two crude categories. Even if it didn't lead to this particular problem, that's almost always a bad idea anyway. So, unless you have a compelling reason to dichotomize, maybe you should just go back to your -mlogit- model.


                • #9
                  Thank you Clyde. The mlogit is a separate model with a multinomial outcome. I also have a dichotomous outcome that I'm running with the same independent variables. I don't think there is anything wrong with my additive index variables. I'm okay to share my output here, but I'm not sure how. Also, I tried creating quartiles from the index variable and including these in the model, and I still have the same problem.
                  Last edited by Teresa Mungazi; 30 Oct 2023, 14:41.


                  • #10
                    I think that talking in abstractions is unlikely to be fruitful. I have some ideas about what is going wrong, but to test them out I need example data from your Stata data set to work with. Please use the -dataex- command to post that here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

                    Make sure your example includes all variables used in the logistic regression that is causing you trouble. And also be sure that when you run that logistic regression on the example data you get the same problem reproduced.
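
                    As a sketch of what that could look like, using the variable names from the command in the first post: the range 1/100 is an arbitrary choice for illustration and assumes the data set has at least 100 observations.

                    ```stata
                    * List the model's variables for the first 100 observations;
                    * the output of this command is what gets pasted into the forum
                    dataex cl_heard_b age_categories education income gender_binary_female ///
                        political_affiliation_t_other cl_addindex in 1/100

                    * Check that the same observations reproduce the problem
                    logit cl_heard_b ib0.age_categories ib1.education ib6.income ///
                        gender_binary_female ib0.political_affiliation_t_other ///
                        cl_addindex in 1/100, or
                    ```

                    If the error does not appear on that subset, a different selection of observations may be needed before posting.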


                    • #11
                      Hi Clyde, my apologies for the late response. I've resolved this one. The problem was with my data.

                      Kind Regards,

                      Teresa
