  • Interpretation of glm coefficients, gaussian family log link

    This is my first go-round with glm. I'm fitting a model with meglm using the Gaussian family and a log link. I had previously fitted this model with xtreg and a logged dependent variable (the outcome is badly skewed and overdispersed, and logging it greatly improved model fit).

    I understand that I am now modeling the log of the expected mean, whereas before I was modeling the mean of the logged observed values, so the two models are not exactly equivalent. However, I'm not sure how to interpret the resulting glm coefficients. Given a linear model with a logged outcome, I would exponentiate the coefficient and interpret it as a percent change (e.g., 1.15 would be a 15% increase in Y for a 1-unit increase in X). How do I interpret the coefficients and, especially, the average marginal effects from the glm model? I also have two predictors that have been logged; in the regular old linear model their coefficients would have been elasticities, but now I'm not sure what they are.
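    For reference, here is a sketch of what the log link implies for the three quantities asked about. The notation is an assumption, not from the post: $\beta$ for the fixed-effect coefficients, $u_j$ for the random intercept of cluster $j$, and all expectations taken conditional on $u_j$.

    ```latex
    \begin{align*}
    \ln \mathrm{E}[y_{ij} \mid x_{ij}, u_j]
      &= \beta_0 + \beta_1 x_{1ij} + \dots + \beta_3 \ln x_{3ij} + u_j \\
    \frac{\mathrm{E}[y \mid x_1 + 1, u]}{\mathrm{E}[y \mid x_1, u]}
      &= e^{\beta_1} \\
    \frac{\partial\, \mathrm{E}[y \mid x, u]}{\partial x_1}
      &= \beta_1 \, \mathrm{E}[y \mid x, u] \\
    \frac{\partial \ln \mathrm{E}[y \mid x, u]}{\partial \ln x_3}
      &= \beta_3
    \end{align*}
    ```

    Read this way, an exponentiated coefficient of 1.15 is still a 15% increase, but in the expected value of Y rather than in its geometric mean. The marginal effect on the scale of Y is the coefficient times the fitted mean, so it differs across observations, and an average marginal effect averages that product over the sample. The coefficient on a logged predictor remains an elasticity, now of the expected value of Y.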

    This is Stata 14. I've moved from xtreg and xtmixed to meglm because I intend to use the margins command, and margins will integrate over the random intercepts after meglm (it will not do this after xtmixed or xtreg).

    command as follows:
    meglm y x1 x2 ln(x3)... xn || clustervar: , family(gaussian) link(log)


    I'm also getting the message "numerical derivatives are approximate; nearby values are missing", whereas the equivalent xtreg command produced no error. This would seem to indicate convergence problems; maybe collinearity is an issue?

    I assume I should plot standardized residuals against predicted values to check model fit: run it with the identity link and with the log link and see which looks better, correct? What about a gamma family? The outcome is not count data, but it is strictly positive.
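    The comparison described above could be sketched roughly as follows. This is only an illustration under assumptions: the variable list is cut down to the placeholder names y, x1, x2 and clustervar from the command in the post, Pearson residuals stand in for "standardized" residuals, and the new variable and graph names (mu_log, pr_log, g_log, and so on) are made up for the example.

    ```stata
    * Sketch only: y, x1, x2 and clustervar are placeholders from the post.

    * Gaussian family, log link: save fitted means and Pearson residuals.
    meglm y x1 x2 || clustervar: , family(gaussian) link(log)
    predict double mu_log, mu
    predict double pr_log, pearson

    * Gamma family, log link: same linear predictor, different variance function.
    meglm y x1 x2 || clustervar: , family(gamma) link(log)
    predict double mu_gam, mu
    predict double pr_gam, pearson

    * Residuals against fitted means, side by side.
    scatter pr_log mu_log, yline(0) name(g_log, replace)
    scatter pr_gam mu_gam, yline(0) name(g_gam, replace)
    graph combine g_log g_gam
    ```

    If the Gaussian residuals fan out as the fitted mean grows while the gamma residuals do not, that would point toward a variance that rises with the mean, which is what the gamma family assumes.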

    Your advice is greatly appreciated.
    Last edited by Will Hauser; 15 Apr 2015, 09:31.

  • #2
    As a follow-up after additional reading: suppose I specify the "eform" option. Then the coefficients would be exponentiated and, I think, would reflect the multiplicative change in the arithmetic mean, the same mean a regular old linear model targets. The difference, I think, is that exponentiated coefficients from a linear model with a logged outcome reflect the change in the geometric mean. So the interpretation of the coefficients from the two models is almost but not quite identical. And I can interpret the logged predictor in the log-link glm model as an elasticity as I described, I think.
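    The arithmetic versus geometric mean distinction can be written out as follows. This is a sketch in assumed notation ($\gamma$ for the logged-outcome coefficients, $\beta$ for the log-link coefficients), with the random intercept left out for brevity.

    ```latex
    \begin{align*}
    \text{logged outcome:}\quad \mathrm{E}[\ln y \mid x] = x'\gamma
      &\;\Rightarrow\; \exp\!\big(\mathrm{E}[\ln y \mid x]\big) = \exp(x'\gamma)
      \quad \text{(geometric mean of } y\text{)} \\
    \text{log link:}\quad \ln \mathrm{E}[y \mid x] = x'\beta
      &\;\Rightarrow\; \mathrm{E}[y \mid x] = \exp(x'\beta)
      \quad \text{(arithmetic mean of } y\text{)} \\
    \exp\!\big(\mathrm{E}[\ln y \mid x]\big) &\le \mathrm{E}[y \mid x]
      \quad \text{(Jensen's inequality)}
    \end{align*}
    ```

    If $y = \exp(x'\gamma + \varepsilon)$ with $\varepsilon$ independent of $x$, then $\mathrm{E}[y \mid x] = \exp(x'\gamma)\,\mathrm{E}[e^{\varepsilon}]$, so the two sets of slopes coincide and only the intercepts differ. The interpretations diverge when the spread of $\varepsilon$ changes with $x$.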

    Is this correct? And, moreover, since margins doesn't seem to have an eform option, can I just specify an expression with the outcome exponentiated?
    margins, dydx(x) expression(exp(predict(mu)))
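    For comparison, a hedged sketch of the plain version of both steps. It assumes that meglm accepts eform as a reporting option and that the default prediction margins uses after meglm is the mean response mu; both are worth confirming in the Stata 14 manual. The variable list is again trimmed to the placeholders y, x1, x2 and clustervar.

    ```stata
    * Sketch only: placeholder variable names from the post.

    * Report exponentiated fixed-effect coefficients (ratios of expected y).
    meglm y x1 x2 || clustervar: , family(gaussian) link(log) eform

    * Average marginal effect of x1 on the default prediction,
    * assumed here to be the mean of y on its original scale.
    margins, dydx(x1)
    ```

    Under link(log), mu is the inverse link applied to the linear predictor, so it is already on the scale of y. Wrapping it in exp() would exponentiate a second time, which suggests the plain margins call is the one to try first.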

    Also, wouldn't the gamma family be a better fit? The outcome really isn't normally distributed: it's the sentence length an offender receives, so it isn't exactly a count variable either, but it is always strictly positive, mildly skewed, and badly overdispersed.


    Last edited by Will Hauser; 15 Apr 2015, 22:00.
