the results of melogit

Patrick Fang

Join Date: Jun 2014

Posts: 21
#1

25 Aug 2014, 17:45

Hi, listers

I want to fit a mixed-effects model in which the dependent variable is a proportion. The code is:

Code:

meglm dependent_variable independent_variables || id:, family(binomial) link(logit)

However, Stata complained that:
"outcome does not vary; remember:
0 = negative outcome,
all other nonmissing values = positive outcome"

What should I do, and how can I fit such a model with a continuous proportion as the dependent variable?

Last edited by Patrick Fang; 25 Aug 2014, 18:42.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

25 Aug 2014, 18:17

First do what Stata asked you to do: check the distribution of your dependent variable: if it is coded 1/2 instead of 0/1, for example, Stata will interpret that as meaning that everybody had a positive outcome. So run it again and then start with -tab dependent_variable if e(sample), nolabel-. If there are two different values but neither is zero, then you will have to -recode- your dependent variable. If there is only one value, you will need to find out what is wrong with your data, or why the outcome does not vary in the subset of the data having no missing values among the model variables.
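
A minimal sketch of that check, for illustration only. The names y, x1, x2 and id are hypothetical stand-ins for the model variables, and the !missing() condition is used in place of e(sample) in case the failed command left no estimation sample behind:

```stata
* Hypothetical names: y = outcome, x1 x2 = predictors, id = grouping variable
* Tabulate the outcome on observations with no missing model variables
tabulate y if !missing(y, x1, x2, id), nolabel

* If the two values turn out to be 1 and 2, one way to recode to 0/1
recode y (1 = 0) (2 = 1), generate(y01)
tabulate y01 if !missing(y, x1, x2, id)
```

If the tabulation shows many distinct values strictly between 0 and 1, the outcome is a continuous proportion rather than a miscoded binary variable, and recoding is not the remedy.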
ben earnhart

Join Date: May 2014

Posts: 1027
#3

25 Aug 2014, 18:18

Well, the "link(logit)" option is telling it you have a 0/1 outcome, not a proportion. So that alone would explain your error. What family and link function are truly appropriate depends on the distribution of your outcome.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#4

25 Aug 2014, 18:25

I suspect that you have continuous proportions, all positive, such as 0.01 or 0.42. The command you have used doesn't support that kind of data.
Patrick Fang

Join Date: Jun 2014

Posts: 21
#5

25 Aug 2014, 18:26

Originally posted by Clyde Schechter:

First do what Stata asked you to do: check the distribution of your dependent variable: if it is coded 1/2 instead of 0/1, for example, Stata will interpret that as meaning that everybody had a positive outcome. So run it again and then start with -tab dependent_variable if e(sample), nolabel-. If there are two different values but neither is zero, then you will have to -recode- your dependent variable. If there is only one value, you will need to find out what is wrong with your data, or why the outcome does not vary in the subset of the data having no missing values among the model variables.

Dr. Schechter and Dr. Earnhart, thanks for your responses. I want to fit a model whose dependent variable is a proportion (between 0 and 1). How can I fit this type of model?
Patrick Fang

Join Date: Jun 2014

Posts: 21
#7

25 Aug 2014, 18:35

Originally posted by Nick Cox:

I suspect that you have continuous proportions, all positive, such as 0.01 or 0.42. The command you have used doesn't support that kind of data.

Yes, Dr. Cox, the dependent variable is a continuous proportion.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#8

25 Aug 2014, 18:41

Well, if your dependent variable is a proportion that was actually arrived at by dividing a numerator by a denominator, then you want to get the numerator and denominator as variables and model with

Code:

melogit numerator independent_variables || id:, binomial(denominator)

If you do not have, and cannot get, the actual numerator and denominator, then you will just have to treat the dependent_variable as an ordinary continuous variable. Your regression model will probably be based on -mixed-, and you may need to explore transformations of your dependent variable in order to get a sensible model, depending on the distribution of the dependent variable. And you should not expect the results to closely resemble what you would have gotten from a true logistic model based on the numerator and denominator because, for example, the cases where the denominator is large "carry more weight" in a logistic regression, whereas in an ordinary linear regression they will count the same as all other cases. (Yes, you can modify that by assigning weights, but the coherent way to assign weights would be based on the denominators, which you don't know if you are using this approach!) Not to mention scaling issues.
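
To make the two routes concrete, here is a hedged sketch. The names numerator, denominator, p, x1, x2 and id are placeholders, and the logit transformation is only one of the transformations that might be explored, not a recommendation:

```stata
* Route 1: counts available (numerator successes out of denominator trials)
melogit numerator x1 x2 || id:, binomial(denominator)

* Route 2: only the proportion p is available; treat it as continuous.
* logit() is undefined at exactly 0 or 1, so those rows are left missing here.
generate double logit_p = logit(p) if p > 0 & p < 1
mixed logit_p x1 x2 || id:
```

Coefficients from the second route are on a different footing from those of the first, because every observation counts equally regardless of how many trials stood behind its proportion.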
Patrick Fang

Join Date: Jun 2014

Posts: 21
#9

25 Aug 2014, 18:47

Dr. Schechter, I really appreciate your response.
Andrew Lover

Join Date: Apr 2014

Posts: 182
#10

25 Aug 2014, 18:57

You might explore Maarten Buis' -fmlogit- (SSC), which allows modeling proportions but doesn't include a multilevel flavor. Maarten will likely know far more about other options!

____________________________________________________
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst
Richard Williams

Join Date: Apr 2014

Posts: 4987
#11

25 Aug 2014, 19:37

I am mildly surprised by the error. glm will work with proportions using family(binomial) and link(logit). Apparently meglm is more fussy.

At the Chicago Stata Users Conference, 2011, Jeff Wooldridge said "Many existing Stata commands could be used to estimate flexible fractional response models allowing for endogeneity and unbalanced panel by removing the “data checks” on the response variable." It looks like meglm can be added to that list. See slide 6 of

http://www.stata.com/meeting/chicago...wooldridge.pdf

In other words, many programs (e.g. logit, probit) would work perfectly fine if you got rid of the 0/nonzero check and instead just required that values had to range between 0 and 1. I actually have some beta software that does that for logit and probit and a few other models but I've never been ambitious enough to get it ready for SSC.

Alas I don't see an easy way to tweak meglm to work with fractional variables.
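
For what it's worth, a sketch of the single-level route, assuming hypothetical names p (a proportion in [0,1]), x1, x2 and id. Cluster-robust standard errors are a partial substitute for a random intercept, not an equivalent, and -fracreg- is only available in Stata 14 and later:

```stata
* Fractional logit via glm; robust standard errors clustered on id
glm p x1 x2, family(binomial) link(logit) vce(cluster id)

* Same idea with the dedicated command (Stata 14 or later)
fracreg logit p x1 x2, vce(cluster id)
```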

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Sam Terman

Join Date: Apr 2021

Posts: 4
#12

08 Apr 2021, 14:39

When I try

melogit numerator independent_variables || id:, binomial(denominator)

as Clyde indicated, I get the error

option binomial() invalid

Thoughts?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#13

08 Apr 2021, 14:47

What version of Stata are you using? I think a few versions back, there were two mixed effects logistic regression commands, -melogit- and -meqrlogit-. And if I recall, one of them allowed the -binomial()- option and the other did not. So if you are using an older Stata, try -meqrlogit- instead.

If that doesn't work, I don't really know what else the problem could be.
Sam Terman

Join Date: Apr 2021

Posts: 4
#14

08 Apr 2021, 15:49

Thanks so much for the quick response. I'm using version 16.0. Somehow, when I ran it again just now, I didn't get that error anymore, and the simple intercept-only model worked.

melogit num || bene_id:, binomial(denom) // var(_cons)=1.19973

Except when adding another variable like

melogit num ib1.medtypenum || bene_id:, binomial(denom)

it says "initial values not feasible".

So I tried

melogit num ib1.medtypenum || bene_id:, binomial(denom) startgrid(.1 1 10)

which includes 1 among the grid-search starting values for the variance of the random intercept. That seemed like a good guess, given that var(_cons) for the empty model above, which worked, was about 1. But still: initial values not feasible. The same happened with startgrid(1.19973), exactly the simpler model's var(_cons).

I tried

melogit num ib1.medtypenum || bene_id:, binomial(denom) noestimate

which gives me the fixed-effects coefficients, but I'm not sure what additional information that provides.

So I tried

melogit num ib1.medtypenum || bene_id:, binomial(denom) intmethod(laplace)

which immediately spits out a lot of red text like J(): 3900 unable to allocate real etc etc

So...I'm not sure what else to do.

I could use

glm y x, link(logit) family(binomial) robust nolog

But that doesn't account for the denominators (it considers y a proportion, just the point estimate) and doesn't include the multilevel structure I was hoping for.
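
One hedged aside on the single-level fallback: -glm- accepts the binomial denominator as an argument to family(), so the counts need not be discarded. A sketch using the variable names from the commands above; it still omits the random intercept, and clustering the standard errors on bene_id is a swapped-in approximation, not the multilevel model:

```stata
* Grouped-binomial logit: num successes out of denom trials per observation
* Standard errors clustered on bene_id; no random intercept is estimated
glm num ib1.medtypenum, family(binomial denom) link(logit) vce(cluster bene_id) nolog
```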

Thoughts about how to make the melogit converge and coax it into feasible starting values?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#15

08 Apr 2021, 17:08

These convergence issues are difficult, and I think the initial values problems are the toughest of all. I really don't know what to tell you. I hope somebody else has some ideas; I'd love to learn additional approaches, as I encounter these problems from time to time myself, and usually end up having to just change the model. But in your situation, that doesn't seem like much of an option.