  • #16
    I agree with you in principle. I think that Stata is overly paternalistic. It should calculate whatever we ask for, as long as it is calculable, and it should be up to us users to judge whether the calculation makes sense.

    I was just saying that they give a good reason for the nesting.

    There is no theorem claiming that MLE and GLS estimates have to be numerically the same. However, they are both consistent under the same set of conditions. Also, it is easy to check how much they differ in your own dataset.
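
    A minimal sketch of such a check, assuming the same nlswork example data and model used elsewhere in this thread (the scalar names gls_* are only illustrative): fit both estimators and print each coefficient next to its counterpart.

    ```stata
    * Load the example panel and declare the panel variable
    webuse nlswork, clear
    xtset idcode

    * Random-effects GLS: keep the slope coefficients in scalars
    quietly xtreg ln_wage age ttl_exp hours, re
    foreach v in age ttl_exp hours {
        scalar gls_`v' = _b[`v']
    }

    * Random-effects ML on the same specification
    quietly xtreg ln_wage age ttl_exp hours, mle

    * Print GLS, MLE, and their difference for each regressor
    foreach v in age ttl_exp hours {
        display "`v': GLS = " %10.7f scalar(gls_`v') ///
                "  MLE = " %10.7f _b[`v'] ///
                "  diff = " %10.7f scalar(gls_`v') - _b[`v']
    }
    ```

    Substituting your own dataset and variable list shows how close the two sets of estimates are in your case.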


    Originally posted by paulvonhippel:
    Documentation aside, there are studies where you want to cluster on a variable that's not nested in the random effects. StataCorp was thinking about a particular data structure when they wrote the documentation, but the right way to cluster depends on the design of the study. In the study I'm working with, it's clear that having student random effects and clustering by teacher is desirable. So I was delighted to see the nonest option.

    I'm not sure if -xtreg, re- and -xtreg, re mle- give similar estimates in all data. They do in the nlswork data that you've used as an example, but they might give pretty different estimates when there are a lot of missing values.



    • #17
      Joro Kolev, you write: "There is no theorem to claim that MLE and GLS estimates have to be numerically the same. However they are both consistent under the same set of conditions."

      Is that right? Are they both consistent when there are missing values on the dependent variables? Where can I read more about this?



      • #18
        I think that in this context both -xtreg, re- and -xtreg, mle- do what is called complete case analysis; that is, they include in the regression only rows for which the dependent variable and all independent variables are non-missing. As you can see in the regressions below, 1) the two estimators give almost the same estimates whether or not I set about 10% of the dependent variable to missing at random (-runiform()>.9- changes 2,866 values), and 2) the two estimators use exactly the same number of observations in each case: 28,443 before and 25,587 after the replacement.

        I do not think that -xtreg, mle- does anything special for missing values here. And they are both consistent under the random effects model assumptions.

        Code:
        . webuse nlswork, clear
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . xtset idcode
               panel variable:  idcode (unbalanced)
        
        . xtreg ln_wage age ttl_exp hours, re
        
        Random-effects GLS regression                   Number of obs     =     28,443
        Group variable: idcode                          Number of groups  =      4,709
        
        R-sq:                                           Obs per group:
             within  = 0.1373                                         min =          1
             between = 0.2590                                         avg =        6.0
             overall = 0.1801                                         max =         15
        
                                                        Wald chi2(3)      =    5114.31
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |  -.0068148   .0006925    -9.84   0.000    -.0081721   -.0054575
             ttl_exp |   .0428326   .0010267    41.72   0.000     .0408202     .044845
               hours |   .0003067   .0002255     1.36   0.174    -.0001353    .0007487
               _cons |   1.597294    .018722    85.32   0.000     1.560599    1.633988
        -------------+----------------------------------------------------------------
             sigma_u |  .32309332
             sigma_e |  .29766067
                 rho |  .54090192   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . xtreg ln_wage age ttl_exp hours, mle nolog
        
        Random-effects ML regression                    Number of obs     =     28,443
        Group variable: idcode                          Number of groups  =      4,709
        
        Random effects u_i ~ Gaussian                   Obs per group:
                                                                      min =          1
                                                                      avg =        6.0
                                                                      max =         15
        
                                                        LR chi2(3)        =    4678.61
        Log likelihood  = -10500.075                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |  -.0068087   .0006932    -9.82   0.000    -.0081672   -.0054501
             ttl_exp |   .0428009   .0010297    41.57   0.000     .0407827    .0448191
               hours |   .0003018   .0002256     1.34   0.181    -.0001404    .0007441
               _cons |   1.597463   .0187382    85.25   0.000     1.560737    1.634189
        -------------+----------------------------------------------------------------
            /sigma_u |   .3260675    .004153                      .3180286    .3343095
            /sigma_e |   .2984172   .0013726                       .295739    .3011197
                 rho |   .5441903   .0069048                      .5306341    .5576952
        ------------------------------------------------------------------------------
        LR test of sigma_u=0: chibar2(01) = 1.2e+04            Prob >= chibar2 = 0.000
        
        . replace ln_wage=. if runiform()>.9
        (2,866 real changes made, 2,866 to missing)
        
        . xtreg ln_wage age ttl_exp hours, re
        
        Random-effects GLS regression                   Number of obs     =     25,587
        Group variable: idcode                          Number of groups  =      4,645
        
        R-sq:                                           Obs per group:
             within  = 0.1400                                         min =          1
             between = 0.2565                                         avg =        5.5
             overall = 0.1816                                         max =         15
        
                                                        Wald chi2(3)      =    4715.98
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |  -.0068057   .0007219    -9.43   0.000    -.0082206   -.0053908
             ttl_exp |   .0430682   .0010688    40.29   0.000     .0409733     .045163
               hours |   .0003284   .0002378     1.38   0.167    -.0001377    .0007944
               _cons |   1.594197   .0195405    81.58   0.000     1.555899    1.632496
        -------------+----------------------------------------------------------------
             sigma_u |  .32322814
             sigma_e |  .29678027
                 rho |  .54257979   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . xtreg ln_wage age ttl_exp hours, mle nolog
        
        Random-effects ML regression                    Number of obs     =     25,587
        Group variable: idcode                          Number of groups  =      4,645
        
        Random effects u_i ~ Gaussian                   Obs per group:
                                                                      min =          1
                                                                      avg =        5.5
                                                                      max =         15
        
                                                        LR chi2(3)        =    4307.46
        Log likelihood  = -9608.1982                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |  -.0068011   .0007224    -9.41   0.000    -.0082171   -.0053851
             ttl_exp |   .0430403   .0010717    40.16   0.000     .0409397    .0451408
               hours |   .0003239   .0002379     1.36   0.173    -.0001424    .0007903
               _cons |   1.594377   .0195561    81.53   0.000     1.556048    1.632706
        -------------+----------------------------------------------------------------
            /sigma_u |   .3260184    .004234                      .3178246    .3344234
            /sigma_e |   .2975915   .0014573                      .2947488    .3004615
                 rho |   .5454899   .0071001                      .5315489    .5593751
        ------------------------------------------------------------------------------
        LR test of sigma_u=0: chibar2(01) = 1.0e+04            Prob >= chibar2 = 0.000
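
        The complete-case behaviour described above can also be checked directly. The following is a sketch under stated assumptions: the seed value is arbitrary and is only there to make the random deletion reproducible (the run above did not set one, so the counts will not match it exactly), and the local macro n_complete is an illustrative name.

        ```stata
        webuse nlswork, clear
        xtset idcode

        * Arbitrary seed so the random deletion can be reproduced
        set seed 12345
        * runiform() > .9 is true about 10% of the time
        replace ln_wage = . if runiform() > .9

        * Count rows with no missing value in any variable the model uses
        quietly count if !missing(idcode, ln_wage, age, ttl_exp, hours)
        local n_complete = r(N)

        * e(N) is the estimation sample size reported by each command
        quietly xtreg ln_wage age ttl_exp hours, re
        local n_gls = e(N)
        quietly xtreg ln_wage age ttl_exp hours, mle
        local n_mle = e(N)

        display "complete cases = `n_complete'  GLS N = `n_gls'  MLE N = `n_mle'"
        ```

        If all three numbers agree, both commands are dropping exactly the rows with a missing value and nothing else.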

        Originally posted by paulvonhippel:
        Joro Kolev, you write: "There is no theorem to claim that MLE and GLS estimates have to be numerically the same. However they are both consistent under the same set of conditions."

        Is that right? Are they both consistent when there are missing values on the dependent variables? Where can I read more about this?
