Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • deviance in -xtgee-

    Hi -- -xtgee- returns a variety of values related to the deviance. I could not find documentation on how the deviance is calculated in a -xtgee- model. Deviance is a likelihood concept and as such is not technically available in a GEE model. Is this just the deviance assuming independence after the model is estimated using GEE with the specified correlation structure, even if that correlation is not independence? -- Paul

  • #2
    Refer to the PDF manual entry of the command: https://www.stata.com/manuals13/xtxtgee.pdf. Such issues are usually discussed there.

    For probit estimation, xtgee with corr(independent) and probit will produce the same coefficients, but the standard errors will be only asymptotically equivalent because probit is not the canonical link for the binomial family. If the binomial denominator is not 1, the equivalent maximum-likelihood command is bprobit; see [R] probit and [R] glogit.
    If the xtgee command is equivalent to another command, using corr(independent) and the vce(robust) option with xtgee corresponds to using the vce(cluster clustvar) option in the equivalent command, where clustvar corresponds to the panel variable.
    The identity link and Gaussian family produce regression-type models. With the independent correlation structure, we reproduce ordinary least squares. With the exchangeable correlation structure, we produce an equal-correlation linear regression estimator. xtgee, fam(gauss) link(ident) corr(exch) is asymptotically equivalent to the weighted-GLS estimator provided by xtreg, re and to the full maximum-likelihood estimator provided by xtreg, mle. In balanced data, xtgee, fam(gauss) link(ident) corr(exch) and xtreg, mle produce the same results.


    Originally posted by Paul Rathouz View Post
    Is this just the deviance assuming independence after the model is estimated using GEE with the specified correlation structure, even if that correlation is not independence?
    It should be easy to verify this! But as the last quote above shows, even with a different correlation structure, there is usually an equivalent ML estimator for the model.
    Last edited by Andrew Musau; 06 Jun 2025, 03:24.

    Comment


    • #3
      Thank you, Andrew -- Yes, I noted many of those posts. I was still hoping for some statement in the documentation what is a bit more direct about exactly what is being computed.! -- P

      Comment


      • #4
        Originally posted by Paul Rathouz View Post
        I was still hoping for some statement in the documentation what is a bit more direct about exactly what is being computed.
        For the binomial log-likelihood under the logistic distribution, where \(y_i\) denotes the outcome, the squared deviance residual \(d_{i}^{2}\)​ can be expressed as:

        \[
        d_i^2 =
        \begin{cases}
        \left( \sqrt{2 \cdot \log\left( \frac{1}{\hat{p}_i} \right)} \right)^2 & \text{if } y_i = 1 \\
        \left( \sqrt{2 \cdot \log\left( \frac{1}{1 - \hat{p}_i} \right)} \right)^2 & \text{if } y_i = 0
        \end{cases}
        \]

        So manually, we can calculate deviance after xtgee (family= binomial, link= logit) as follows:

        Code:
        webuse nlswork2, clear
        gen hiwage= ln_wage> 1.6
        xtgee hiwage grade age c.age#c.age, family(binomial) link(logit) corr(exchangeable) nolog
        display e(deviance)
        
        *CALCULATE DEVIANCE BY HAND
        predict res, mu
        gen double dev_res2= (cond(hiwage, sqrt(2 * log(1 / res)), sqrt(2 * log(1 / (1 - res)))))^2 if e(sample)
        qui sum dev_res2
        display "deviance `:di %10.3f `r(sum)''"
        Res.:

        Code:
        . xtgee hiwage grade age c.age#c.age, family(binomial) link(logit) corr(exchangeable) nolog
        
        GEE population-averaged model                       Number of obs    =  16,085
        Group variable: idcode                              Number of groups =   3,913
        Family: Binomial                                    Obs per group:  
        Link:   Logit                                                    min =       1
        Correlation: exchangeable                                        avg =     4.1
                                                                         max =       9
                                                            Wald chi2(3)     = 1422.33
        Scale parameter = 1                                 Prob > chi2      =  0.0000
        
        ------------------------------------------------------------------------------
              hiwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               grade |   .3748418   .0150876    24.84   0.000     .3452707    .4044129
                 age |   .7876085   .0448166    17.57   0.000     .6997695    .8754475
                     |
         c.age#c.age |  -.0133981   .0008757   -15.30   0.000    -.0151144   -.0116819
                     |
               _cons |  -15.86915   .5914498   -26.83   0.000    -17.02837   -14.70993
        ------------------------------------------------------------------------------
        
        .
        . display e(deviance)
        19406.198
        
        .
        .
        .
        . *CALCULATE DEVIANCE BY HAND
        
        .
        . predict res, mu
        (9 missing values generated)
        
        .
        . gen double dev_res2= (cond(hiwage, sqrt(2 * log(1 / res)), sqrt(2 * log(1 / (1 - res)))))^2 if e(sample)
        (9 missing values generated)
        
        .
        . qui sum dev_res2
        
        .
        . display "deviance `:di %10.3f `r(sum)''"
        deviance  19406.198
        So depending on the model, you can define the formula for calculating deviance.
        Last edited by Andrew Musau; 07 Jun 2025, 09:03.

        Comment


        • #5
          What I call "res" in

          gen double dev_res2= (cond(hiwage, sqrt(2 * log(1 / res)), sqrt(2 * log(1 / (1 - res)))))^2 if e(sample)
          is the predicted probability \(\hat{p}_i\) in the equation in #4, and this is given by the -mu- option of predict after xtgee.

          predict [type] newvar [if] [in] [, statistic nooffset]

          statistic Description
          -------------------------------------------------------------------------------------
          Main
          mu predicted value of depvar; considers the offset() or exposure();
          the default
          Therefore, to avoid confusion—since "res" might be interpreted as residual—I should rename it to "prob" for probability or "pred" for predicted value. However, this does not affect the calculation in any way.

          Code:
          webuse nlswork2, clear
          gen hiwage= ln_wage> 1.6
          xtgee hiwage grade age c.age#c.age, family(binomial) link(logit) corr(exchangeable) nolog
          display e(deviance)
          
          *CALCULATE DEVIANCE BY HAND
          predict prob, mu
          gen double dev_res2= (cond(hiwage, sqrt(2 * log(1 / prob)), sqrt(2 * log(1 / (1 - prob)))))^2 if e(sample)
          qui sum dev_res2
          display "deviance `:di %10.3f `r(sum)''"

          Comment

          Working...
          X