  • the results of melogit

    Hi, listers

    I want to fit a mixed-effects model in which the dependent variable is a proportion. The code is:

    Code:
    meglm dependent_variable independent_variables || id:, family(binomial) link(logit)
    However, Stata complained that:
    "outcome does not vary; remember:
    0 = negative outcome,
    all other nonmissing values = positive outcome"

    What should I do, and how can I fit such a model with a continuous proportion as the dependent variable?
    Last edited by Patrick Fang; 25 Aug 2014, 18:42.

  • #2
    First do what Stata asked you to do: check the distribution of your dependent variable: if it is coded 1/2 instead of 0/1, for example, Stata will interpret that as meaning that everybody had a positive outcome. So run it again and then start with -tab dependent_variable if e(sample), nolabel-. If there are two different values but neither is zero, then you will have to -recode- your dependent variable. If there is only one value, you will need to find out what is wrong with your data, or why the outcome does not vary in the subset of the data having no missing values among the model variables.
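
A minimal sketch of the check described in this post, assuming the data are in memory and the outcome is named dependent_variable as in post #1. It does not rely on e(sample), which is only meaningful after an estimation command has completed successfully.

```stata
* Overall distribution of the outcome
summarize dependent_variable, detail

* How many observations are exactly zero (the only "negative" outcome)?
count if dependent_variable == 0

* How many are strictly between 0 and 1 (continuous proportions)?
count if dependent_variable > 0 & dependent_variable < 1

* How many are 1 or larger (for example, a 1/2 coding)?
count if dependent_variable >= 1 & !missing(dependent_variable)
```

If the first count is zero, every nonmissing value is treated as a positive outcome, which would be consistent with the "outcome does not vary" message.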


    • #3
      Well, the "link(logit)" option is telling it you have a 0/1 outcome, not a proportion. So that alone would explain your error. What family and link function are truly appropriate depends on the distribution of your outcome.


      • #4
        I suspect that you have continuous proportions, all positive, such as 0.01 or 0.42. The command you have used doesn't support that kind of data.


        • #5
          Dr. Schechter and Dr. Earnhart, thanks for your responses. I want to fit a model whose dependent variable is a proportion (between 0 and 1). How can I fit this type of model?



            • #7
              Yes, Dr. Cox, the dependent variable is a continuous proportion.


              • #8
                Well, if your dependent variable is a proportion that was actually arrived at by dividing a numerator by a denominator, then you want to get the numerator and denominator as variables and model with

                Code:
                melogit numerator independent_variables || id:, binomial(denominator)
                If you do not have, and cannot get, the actual numerator and denominator, then you will just have to treat the dependent_variable as an ordinary continuous variable. Your regression model will probably be based on -mixed-, and you may need to explore transformations of your dependent variable in order to get a sensible model, depending on the distribution of the dependent variable. And you should not expect the results to closely resemble what you would have gotten from a true logistic model based on the numerator and denominator because, for example, the cases where the denominator is large "carry more weight" in a logistic regression, whereas in an ordinary linear regression they will count the same as all other cases. (Yes, you can modify that by assigning weights, but the coherent way to assign weights would be based on the denominators, which you don't know if you are using this approach!) Not to mention scaling issues.
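
A rough sketch of the fallback route described in this post, for the case where only the proportion is available. The logit transformation is just one candidate among the transformations mentioned, and the variable names are the placeholders used in this thread, not real data.

```stata
* logit(x) is log(x/(1-x)); it returns missing when x is exactly 0 or 1,
* so those observations would drop out of the model
generate double logit_dv = logit(dependent_variable)

* See how many observations the transformation loses
count if missing(logit_dv) & !missing(dependent_variable)

* Linear mixed model with a random intercept for id
mixed logit_dv independent_variables || id:
```

Coefficients from this model are on the log-odds scale of the proportion, but every observation counts equally regardless of its unknown denominator, which is the caveat raised in the post.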


                • #9
                  Dr. Schechter, I really appreciate your response.


                  • #10
                    You might explore Maarten Buis' -fmlogit- (SSC), which allows modeling proportions but doesn't include a multilevel flavor. Maarten will likely know far more about other options!
                    ____________________________________________________
                    Assistant Professor, Department of Biostatistics and Epidemiology
                    School of Public Health and Health Sciences
                    University of Massachusetts- Amherst


                    • #11
                      I am mildly surprised by the error. glm will work with proportions using family(binomial) and link(logit). Apparently meglm is more fussy.

                      At the Chicago Stata Users Conference, 2011, Jeff Wooldridge said "Many existing Stata commands could be used to estimate flexible fractional response models allowing for endogeneity and unbalanced panel by removing the “data checks” on the response variable." It looks like meglm can be added to that list. See slide 6 of

                      http://www.stata.com/meeting/chicago...wooldridge.pdf

                      In other words, many programs (e.g. logit, probit) would work perfectly fine if you got rid of the 0/nonzero check and instead just required that values had to range between 0 and 1. I actually have some beta software that does that for logit and probit and a few other models but I've never been ambitious enough to get it ready for SSC.

                      Alas I don't see an easy way to tweak meglm to work with fractional variables.
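
For reference, a minimal sketch of the single-level fractional logit that glm accepts, as mentioned at the top of this post. The names y, x, and id are hypothetical. Clustered standard errors are only a partial stand-in for the multilevel structure, not a replacement for a random intercept.

```stata
* Fractional logit: y is a proportion in [0,1]; glm notes the noninteger
* values but still fits the model
glm y x, family(binomial) link(logit) vce(robust)

* Same model with standard errors clustered on the grouping variable
glm y x, family(binomial) link(logit) vce(cluster id)

* In Stata 14 or later, -fracreg- fits the same kind of model
* (assumes your release includes it)
fracreg logit y x, vce(cluster id)
```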
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      Stata Version: 17.0 MP (2 processor)

                      WWW: https://www3.nd.edu/~rwilliam


                      • #12
                        When I try

                        Code:
                        melogit numerator independent_variables || id:, binomial(denominator)
                        as Clyde indicated, I get the error "option binomial() invalid".

                        Thoughts?


                        • #13
                          What version of Stata are you using? I think a few versions back, there were two mixed effects logistic regression commands, -melogit- and -meqrlogit-. And if I recall, one of them allowed the -binomial()- option and the other did not. So if you are using an older Stata, try -meqrlogit- instead.

                          If that doesn't work, I don't really know what else the problem could be.
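
A small sketch of how one might check the release and try the alternative command. The variable names are the placeholders from post #8, and whether -meqrlogit- is available depends on your Stata release.

```stata
* Release of the running Stata, and the version currently set by -version-
display c(stata_version)
display c(version)

* Alternative command suggested above, with the same binomial() option
meqrlogit numerator independent_variables || id:, binomial(denominator)
```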


                          • #14
                            Thanks so much for the quick response. I am using version 16.0. Somehow, when I ran it again, I no longer got that error, and the simple intercept-only model worked.

                            melogit num || bene_id:, binomial(denom) // var(_cons)=1.19973

                            Except when adding another variable like

                            melogit num ib1.medtypenum || bene_id:, binomial(denom)

                            it says "initial values not feasible".

                            So I tried

                            melogit num ib1.medtypenum || bene_id:, binomial(denom) startgrid(.1 1 10)

                            which puts one of the grid-search points for the variance of the random intercept at 1. That seemed like a good guess, given that var(_cons) for the empty model above was about 1. But I still got "initial values not feasible", even when I used startgrid(1.99731), at exactly the simpler model's var(_cons).

                            I tried

                            melogit num ib1.medtypenum || bene_id:, binomial(denom) noestimate

                            which gives me the fixed-effects coefficients, but I am not sure what additional information that provides.

                            So I tried

                            melogit num ib1.medtypenum || bene_id:, binomial(denom) intmethod(laplace)

                            which immediately spits out a lot of red text like "J(): 3900 unable to allocate real" and so on.

                            So...I'm not sure what else to do.

                            I could use

                            glm y x, link(logit) family(binomial) robust nolog

                            But that doesn't account for the denominators (it considers y a proportion, just the point estimate) and doesn't include the multilevel structure I was hoping for.

                            Thoughts about how to make the melogit converge and coax it into feasible starting values?
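
A hedged sketch of two things that might be worth trying here, using the variable names from this post. Neither is a diagnosis: the data checks only rule out inputs that binomial(denom) cannot handle, and passing the empty model's estimates through from() is an assumption about what may help, not a known fix.

```stata
* Data checks: binomial(denom) needs 0 <= num <= denom with denom > 0
count if num > denom & !missing(num, denom)
count if denom <= 0 | missing(denom)
* Negative or non-integer numerators
count if num < 0 | num != floor(num)

* Sparse levels of the covariate can also cause trouble
tabulate medtypenum

* Fit the empty model and keep its estimates
melogit num || bene_id:, binomial(denom)
matrix b0 = e(b)

* Refit with the covariate, starting from the empty model's estimates;
* parameters not found in b0 start at zero
melogit num ib1.medtypenum || bene_id:, binomial(denom) from(b0, skip)
```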


                            • #15
                              These convergence issues are difficult, and I think the initial values problems are the toughest of all. I really don't know what to tell you. I hope somebody else has some ideas; I'd love to learn additional approaches, as I encounter these problems from time to time myself, and usually end up having to just change the model. But in your situation, that doesn't seem like much of an option.
