  • xthybrid vs meglm

    Dear all,

    I want to estimate the average treatment effect in panel data with 3 time points and about 178 individuals, for ordinal outcomes (scales with 4 or 5 categories). I am mainly considering fixed effects because the treatment and control groups were not selected at random, so I am concerned about unobserved heterogeneity.

    My first approach is the hybrid model, using the Stata command 'xthybrid' after 'xtset id wave', as described in Schunck and Perales (2017). In general that worked well, but I had problems with 'xtsum[...] if e(sample)', which produced a table without any numbers, and with 'xtgraph[...] if e(sample)', which failed with the error message "__000002 not found".
    My second approach therefore follows the recommendation of Allison (2009): generate the between- and within-cluster variables myself, e.g.
    egen M_a = mean(a), by(id)    // cluster mean: between-cluster effect
    gen F_a = a - M_a             // deviation from cluster mean: within-cluster effect

    Afterwards I ran the regression with the generated variables with 'meglm' using Stata/SE 15.1.

    So my main question is:

    1) What are the main differences between the calculation with

    xthybrid[...], family(ordinal) link(logit) vce(robust) clusterid(id) full

    and

    meglm[...] || id:, family(ordinal) link(logit) vce(robust)

    I ask because the results are very close to each other but not identical (see screenshot). What should I consider when deciding which approach is more appropriate in a given case?
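    For readers who want to reproduce the comparison, a minimal sketch of the two calls follows. The names y (ordinal outcome) and a (time-varying covariate) are placeholders and not the variables from the original analysis; M_a and F_a are assumed to exist already, built as described above. Note that in meglm the options come after the random-effects equation. xthybrid is a community-contributed command and must be installed separately.

    ```stata
    * Placeholder names: y = ordinal outcome, a = time-varying covariate
    xtset id wave

    * Hybrid model via the community-contributed command
    xthybrid y a, clusterid(id) family(ordinal) link(logit) vce(robust) full

    * Same specification by hand, using the cluster mean M_a and the deviation F_a
    meglm y F_a M_a || id:, family(ordinal) link(logit) vce(robust)
    ```

    If both commands use the same observations, the coefficient on F_a should line up with the within effect and the coefficient on M_a with the between effect reported by xthybrid.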

    2) Are there any other recommendations for estimating fixed effects for ordinal outcomes with a panel of 3 time points and about 178 individuals? The only suggestion I have found so far, apart from xthybrid and meglm, is to split the ordinal outcome into several equations and treat the outcome in each equation as binary.
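    A rough sketch of that splitting idea, assuming an outcome y coded 1 to 4 and a single covariate a (both placeholder names): dichotomize y at each cutpoint and fit a separate fixed-effects (conditional) logit. This only illustrates the idea and is not a recommendation for how the separate estimates should be combined.

    ```stata
    * Placeholder names: y coded 1..4, covariate a; data already xtset id wave
    forvalues k = 2/4 {
        * Binary indicator: outcome at or above category k
        gen byte y_ge`k' = (y >= `k') if !missing(y)
        * Conditional logit; capture lets the loop continue if one cutpoint fails
        capture noisily xtlogit y_ge`k' a, fe
    }
    ```

    Individuals whose dichotomized outcome never changes across the 3 waves do not contribute to the conditional likelihood, so with about 178 individuals the effective sample at some cutpoints may be small.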

    Thank you very much in advance and all the best,
    Sigi


    Literature:
    Allison, P. D. (2009). Fixed effects regression models (Vol. 160). SAGE publications.
    Schunck, R., & Perales, F. (2017). Within- and between-cluster effects in generalized linear mixed models: A discussion of approaches and the xthybrid command. Stata Journal, 17(1), 89-115.
    [Screenshot: xthybrid_vs_meglm.png]

    Last edited by Siegfried Eisenberg; 10 Jul 2018, 06:00.

  • #2
    As luck would have it, I was estimating Allison's hybrid models yesterday, and I thought, Gee, somebody should write a program to do this. And now, thanks to you, I know that there is one.

    My guess is you computed the means and difference variables incorrectly. xthybrid (I think) calculates them using only the non-missing cases in the model. If you calculated them using all the cases, their values will be a little different. You therefore need to do something like

    Code:
    gen mysample = !missing(y, x1, x2, x3,...., idvar)
    egen M_a = mean(a) if mysample, by(id)
    If this doesn't cause xthybrid and meglm to produce identical results, let us know. Like I say, I've been using xthybrid for the better part of an hour, so I may be missing something.
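    A fuller sketch of the same idea, under stated assumptions: the names y, x1 and x2 are placeholders, the variable list inside missing() must match the model actually being fitted, and the D_ prefix for the deviation variables is just a naming choice made here.

    ```stata
    * Placeholder names: y = ordinal outcome, x1 x2 = time-varying covariates
    * Flag observations that are complete on every variable in the model
    gen byte insample = !missing(y, x1, x2, id)

    foreach v of varlist x1 x2 {
        * Cluster mean computed over the estimation sample only
        egen double M_`v' = mean(`v') if insample, by(id)
        * Deviation from that cluster mean
        gen double D_`v' = `v' - M_`v' if insample
    }

    meglm y D_x1 D_x2 M_x1 M_x2 if insample || id:, family(ordinal) link(logit) vce(robust)
    ```

    Building the flag from all model variables at once matters: a wave that is missing on x2 must also be left out of the mean of x1, otherwise the means are taken over observations that the model never uses.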

    I can already see that there are advantages to doing it yourself rather than using xthybrid. xthybrid does not support factor variables. Many post-estimation commands, like predict and margins, do not work correctly, because xthybrid deletes all the temporary variables it created. If I were the authors I would add an option to keep them. Depending on what output you request, it will report a t value for the random-effect variances, which I think is not legitimate; Stata itself does not do this.

    But overall, it seems like a very useful command. I might use it when developing my models and then switch to do-it-myself for the final product.

    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Originally posted by Richard Williams
      I can already see that there are advantages to doing it yourself rather than using xthybrid. xthybrid does not support factor variables. Many post-estimation commands, like predict and margins, do not work correctly, because xthybrid deletes all the temporary variables it created.
      I am not quite sure whether simply keeping the between variables would get predict and margins to work correctly. As I understand it, the between variables are mechanically related to their respective within variables much like an interaction term is mechanically related to its constituting variables. There is no way that the between variable can change independently from the within variables (and vice versa) much like there is no way that an interaction term can change independently from its constituting variables (and vice versa). I believe that margins would probably need to account for these relationships to produce correct results but I cannot see an easy way of letting margins know about the respective relationships among the variables in the model.
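      The mechanical link can be written out explicitly. The notation below is introduced here for illustration and assumes a single time-varying covariate x observed over T_i waves for unit i; it is not taken from the xthybrid documentation. The first line is the latent-variable form of the hybrid model, the second shows what happens when x is raised by an amount Delta in one wave t.

      ```latex
      y^{*}_{it} = \beta_W\,(x_{it} - \bar{x}_i) + \beta_B\,\bar{x}_i + u_i + \varepsilon_{it},
      \qquad \bar{x}_i = \frac{1}{T_i}\sum_{s=1}^{T_i} x_{is}

      x_{it} \to x_{it} + \Delta:\qquad
      \bar{x}_i \to \bar{x}_i + \frac{\Delta}{T_i},\qquad
      (x_{it} - \bar{x}_i) \to (x_{it} - \bar{x}_i) + \Delta\Bigl(1 - \frac{1}{T_i}\Bigr),\qquad
      (x_{is} - \bar{x}_i) \to (x_{is} - \bar{x}_i) - \frac{\Delta}{T_i}\quad (s \neq t)
      ```

      A single change in x therefore moves the between variable and every within deviation of that unit at once, whereas margins would treat the two regressors as free to vary independently unless told otherwise.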

      Edit:

      There are some difficulties concerning factor variable notation, too. Reinhard Schunck (2013), one of the authors of xthybrid, discusses pitfalls with interaction terms in so-called hybrid models.

      Thus, it might not be all that bad that you cannot simply use margins after xthybrid.

      Best
      Daniel


      Schunck, R. 2013. Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. The Stata Journal, 13(1): pp. 65-76.
      Last edited by daniel klein; 10 Jul 2018, 09:37.



      • #4
        Daniel makes some very good points and may have helped me avoid a lot of mistakes!

        One thing you can do with hybrid models is include time-invariant variables, e.g. gender. So, I assume margins would be OK for them. And I suspect the predict command would be OK, but you need the original variables to do it. Perhaps other post-estimation commands, like estat icc, would work too. I will look at the Schunck article before I try anything I might regret.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Dear Mr Williams,

          Your guess was absolutely right: I had calculated the means and difference variables incorrectly because of missing values. Now I get exactly the same results with both methods. Thank you so much!

          Thank you both for discussing post-estimation commands, because that's very helpful for me as well.

          All the best,
          Sigi
          Last edited by Siegfried Eisenberg; 10 Jul 2018, 10:43.

