  • ICC calculation: xtreg & xtmixed results turn out to be different

    I was trying to calculate the ICC to see whether a multilevel analysis is needed for my data. When I ran xtreg and xtmixed in Stata 13.1, the two commands gave different coefficients.

    outcome variable: w0mz
    exposure variable: rsppreg
    grouping variable: stationcode


    Code and output are as below:

    ################################################## ################################################## ######

    . xtset stationcode
    panel variable: stationcode (unbalanced)

    . xtreg w0mz rsppreg, mle nolog

    Random-effects ML regression                    Number of obs      =      8238
    Group variable: stationcode                     Number of groups   =         9

    Random effects u_i ~ Gaussian                   Obs per group: min =       384
                                                                   avg =     915.3
                                                                   max =      1846

                                                    LR chi2(1)         =     59.19
    Log likelihood  = -11519.418                    Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
            w0mz |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         rsppreg |  -.0085439   .0006634   -12.88   0.000    -.0098441   -.0072437
           _cons |   .2359468          .        .       .            .           .
    -------------+----------------------------------------------------------------
        /sigma_u |   .0674822   .0320939                      .0265685    .1714006
        /sigma_e |   .9787393   .0076355                      .9638878    .9938196
             rho |   .0047314   .0044823                      .0006123    .0251942
    ------------------------------------------------------------------------------
    Likelihood-ratio test of sigma_u=0: chibar2(01)=   13.55 Prob>=chibar2 = 0.000

    ################################################## ################################################## #######

    . xtmixed w0mz rsppreg || stationcode:, mle nolog

    Mixed-effects ML regression                     Number of obs      =      8238
    Group variable: stationcode                     Number of groups   =         9

                                                    Obs per group: min =       384
                                                                   avg =     915.3
                                                                   max =      1846


                                                    Wald chi2(1)       =    910.59
    Log likelihood = -11146.214                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
            w0mz |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         rsppreg |  -.1956578   .0064839   -30.18   0.000    -.2083661   -.1829496
           _cons |   11.72928   .6660864    17.61   0.000     10.42377    13.03478
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    stationcode: Identity        |
                       sd(_cons) |   1.603851   .3820153      1.005587    2.558048
    -----------------------------+------------------------------------------------
                    sd(Residual) |   .9322329   .0072668      .9180985    .9465848
    ------------------------------------------------------------------------------
    LR test vs. linear regression: chibar2(01) =   759.96 Prob >= chibar2 = 0.0000


    To my understanding, sigma_u and sd(_cons) should be equal, and sigma_e and sd(Residual) should be equal.
    The ICC based on the xtmixed result is ~76%.

    I have some missing values in w0mz and rsppreg, but they are already coded as ".", so the results should not be affected.

    Can anyone help?

    Thank you very much!
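For reference, the ICC implied by each set of estimates can be checked by hand from the reported standard deviations, using ICC = sd_between^2 / (sd_between^2 + sd_residual^2). The lines below are only a sketch that plugs in the numbers printed in the two outputs above; nothing is re-estimated.

```stata
* ICC from the xtmixed output: sd(_cons) and sd(Residual)
display 1.603851^2 / (1.603851^2 + .9322329^2)

* Same formula with /sigma_u and /sigma_e from the xtreg, mle output
display .0674822^2 / (.0674822^2 + .9787393^2)
```

The first expression should come out near 0.75 and the second near 0.0047, which agrees with the rho reported by xtreg. The gap between the two commands therefore lies in the estimates themselves, not in how the ICC is derived from them.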









  • #2
    Jan:
    at first glance, the fixed part of your mixed model also differs from the coefficients displayed in the -xtreg, mle- table.
    If I look at the results of the following code, the ICC values calculated after -xtreg, mle- and -xtmixed- are indeed the same:
    Code:
    use http://www.stata-press.com/data/r13/pefrate, clear
    xtreg wm , mle 
    mixed wm || id:, mle stddeviations
    estat icc
    Kind regards,
    Carlo
    (Stata 19.0)
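To compare the two ICCs directly instead of reading them off the output, one option is to save them as scalars. This is a sketch that assumes Stata 13 or later (where -mixed- and -estat icc- are available); the scalar names icc_xtreg and icc_mixed are made up for illustration, and -xtset id- is added in case the dataset is not already declared as panel data.

```stata
use http://www.stata-press.com/data/r13/pefrate, clear
xtset id

* Random-intercept model via xtreg; rho is left behind in e(rho)
xtreg wm, mle nolog
scalar icc_xtreg = e(rho)

* Same model via mixed; estat icc leaves the level-2 ICC in r(icc2)
mixed wm || id:, mle stddeviations nolog
estat icc
scalar icc_mixed = r(icc2)

display "xtreg rho = " icc_xtreg "   mixed ICC = " icc_mixed
```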



    • #3
      Carlo,

      Thank you very much.

      Actually, I got the same results from xtreg and xtmixed when I analyzed another dataset. I just don't know why they are different in this dataset.

      Best,
      Jan



      • #4
        Jan:
        a tentative strategy to sniff out the culprit (if any) is to run your models starting with one predictor only and then add the other ones one at a time.
        Kind regards,
        Carlo
        (Stata 19.0)
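A side-by-side table can make this kind of comparison easier. In the output from the first post, the two log likelihoods differ (-11519.418 for xtreg versus -11146.214 for xtmixed) although both commands fit the same model to the same 8238 observations, and the xtreg constant has no standard error. That pattern suggests, without proving it, that one of the two optimizers did not reach the maximum. The sketch below stores both fits and prints coefficients, standard errors, N and log likelihood together; the names re_xtreg and re_xtmixed are arbitrary.

```stata
xtset stationcode

* Fit the same random-intercept model with both commands
quietly xtreg w0mz rsppreg, mle
estimates store re_xtreg

quietly xtmixed w0mz rsppreg || stationcode:, mle
estimates store re_xtmixed

* Coefficients, standard errors, sample size and log likelihood side by side
estimates table re_xtreg re_xtmixed, b(%9.4f) se stats(N ll)
```

If the log likelihoods disagree, the fit with the higher value is the better candidate for the true maximum-likelihood solution.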



        • #5
          Carlo:

          I tried what you suggested, and found some interesting results as below.

          With only the outcome (w0mz) in the model, the coefficients turn out to be the same:
          xtreg gives me 0 for sigma_u and 0.9869944 for sigma_e
          xtmixed gives me 1.25e-10 for sd(_cons) and 0.9869944 for sd(Residual)


          I guess the 0 is due to rounding.


          However, if I add the predictor, i.e. rsppreg, as in my first post, the coefficients turn out to be different.

          Then I ran two tests with two other predictors, separately (results not shown):
          (1) I generated a new variable called random, which is just random numbers generated in Excel. The coefficients are the same again.

          (2) I generated a new variable called formula, which is calculated from stationcode and w0mz. The coefficients are different.

          Is it because of the data itself? Is there any rare situation in which the coefficients could differ? I am confused.

          Thank you very much!

          Best,
          Jan

          ################################################## #####################
          . xtset stationcode
          panel variable: stationcode (unbalanced)

          . xtreg w0mz, mle nolog

          Random-effects ML regression                    Number of obs      =      8257
          Group variable: stationcode                     Number of groups   =         9

          Random effects u_i ~ Gaussian                   Obs per group: min =       384
                                                                         avg =     917.4
                                                                         max =      1854

                                                          Wald chi2(0)       =      0.00
          Log likelihood  = -11608.084                    Prob > chi2        =         .

          ------------------------------------------------------------------------------
                  w0mz |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 _cons |  -.2612718   .0108618   -24.05   0.000    -.2825607    -.239983
          -------------+----------------------------------------------------------------
              /sigma_u |          0   .0128533                             .           .
              /sigma_e |   .9869944   .0076805                      .9720551    1.002163
                   rho |          0  (omitted)
          ------------------------------------------------------------------------------
          Likelihood-ratio test of sigma_u=0: chibar2(01)=    0.00 Prob>=chibar2 = 1.000


          ################################################## ###################
          . xtmixed w0mz || stationcode:, mle nolog

          Mixed-effects ML regression                     Number of obs      =      8257
          Group variable: stationcode                     Number of groups   =         9

                                                          Obs per group: min =       384
                                                                         avg =     917.4
                                                                         max =      1854


                                                          Wald chi2(0)       =         .
          Log likelihood = -11608.084                     Prob > chi2        =         .

          ------------------------------------------------------------------------------
                  w0mz |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 _cons |  -.2612718   .0108618   -24.05   0.000    -.2825607    -.239983
          ------------------------------------------------------------------------------

          ------------------------------------------------------------------------------
            Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
          -----------------------------+------------------------------------------------
          stationcode: Identity        |
                             sd(_cons) |   1.25e-10   8.69e-10      1.59e-16    .0000989
          -----------------------------+------------------------------------------------
                          sd(Residual) |   .9869944   .0076805      .9720551    1.002163
          ------------------------------------------------------------------------------
          LR test vs. linear regression: chibar2(01) =     0.00 Prob >= chibar2 = 1.0000






          • #6
            Jan:
            unfortunately, I cannot reproduce your problem.
            As another tentative strategy, after running each model I would check whether the two models omit the same observations:
            Code:
            xtreg w0mz rsppreg, mle nolog
            g Flag_xtreg=1 if e(sample)==. 
            xtmixed w0mz rsppreg || stationcode:, mle nolog
            g Flag_xtmixed=1 if e(sample)==.
            list Flag_* if Flag_xtreg==1 | Flag_xtmixed==1
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Carlo,

              Thank you so much. Below are the results I got after checking the missing observations.
              Both commands generated the same number of missing values, but 8300 is also the total number of ALL my observations!
              Is this where it goes wrong?

              . g Flag_xtreg=1 if e(sample)==.
              (8300 missing values generated)


              . g Flag_xtmixed=1 if e(sample)==.
              (8300 missing values generated)

              Best,
              Jan



              • #8
                Jan:
                this result says that e(sample) (i.e., the indicator of the sample that is included in your calculation) is non-missing for all 8300 observations in both models.
                But only 8257 observations seem to be included in your models, so
                I would re-run the models and see what happens.
                Kind regards,
                Carlo
                (Stata 19.0)
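One detail worth keeping in mind here: e(sample) returns 1 for observations used in the last estimation and 0 otherwise, and it is never missing, so a condition like e(sample)==. cannot flag anything. A possible alternative check, sketched below with made-up variable names in_xtreg and in_xtmixed, copies the indicator after each fit and then cross-tabulates the two copies.

```stata
xtset stationcode

* Copy the estimation-sample indicator (1 = used, 0 = dropped) after each fit
quietly xtreg w0mz rsppreg, mle
generate byte in_xtreg = e(sample)

quietly xtmixed w0mz rsppreg || stationcode:, mle
generate byte in_xtmixed = e(sample)

* Off-diagonal cells would be observations used by only one command
tabulate in_xtreg in_xtmixed
list stationcode w0mz rsppreg if in_xtreg != in_xtmixed
```

If the off-diagonal cells are empty, both commands used exactly the same observations and the discrepancy has to come from somewhere else.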



                • #9
                  Carlo,

                  I found that if I use rsppreg as the individual-level variable and stationcode as the group level, the results are always different,
                  and when I use another variable as the group level, the results turn out to be the same.
                  I think it is because rsppreg is estimated from the data of each station, which implies an intrinsic relationship between the individual level and the group level.
                  I am not sure whether this relationship is appropriate for a multilevel analysis.

                  Thank you very much for helping!

                  Best,
                  Jan



                  • #10
                    Jan,
                    have you calculated the correlation between rsppreg and stationcode?
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Carlo,

                      Is it like this?

                      Code:
                      . cor stationcode rsppreg
                      (obs=8241)

                                        | stationcode    rsppreg
                      ------------------+----------------------------------
                            stationcode |      1.0000
                                rsppreg |     -0.0104     1.0000

                      Regards,
                      Jan
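A Pearson correlation treats the numeric station code as if it were a quantity, so a value of -0.0104 says little about how rsppreg differs from station to station. As a rough alternative, the sketch below summarises rsppreg and w0mz by station and splits their variation into between-station and within-station parts. It uses only standard commands and the variable names from this thread; how to read the result for this particular dataset remains a judgment call.

```stata
* Station-by-station counts, means and standard deviations
tabstat rsppreg w0mz, by(stationcode) statistics(n mean sd)

* Overall, between-station and within-station standard deviations
xtset stationcode
xtsum rsppreg w0mz

* One-way ANOVA of the predictor on station, including its intraclass correlation
loneway rsppreg stationcode
```

If most of the variation in rsppreg turns out to be between stations, the predictor carries much of the same information as the station grouping, which would fit the relationship described in #9.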
