  • -glm- vs -sem-: Point estimates are the same, but SEs are not

    Hello folks. My apologies if this has been asked before--I did search, but failed to find anything on it.

    When I estimate the same model via -glm- and -sem-, I get the same point estimates, but different SEs (see example below). Given that both procedures use MLE, I expected everything (including SEs) to be identical. Was I wrong to expect that? Are there any options with different defaults that I need to tinker with?

    Thanks for any insight you can offer.

    By the way, I am using v13.1 (for Windows).

    Cheers,
    Bruce


    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    .
    . glm mpg weight foreign        // estimate same model via -glm-
    
    Iteration 0:   log likelihood = -194.18306  
    
    Generalized linear models                          No. of obs      =        74
    Optimization     : ML                              Residual df     =        71
                                                       Scale parameter =  11.60805
    Deviance         =  824.1717613                    (1/df) Deviance =  11.60805
    Pearson          =  824.1717613                    (1/df) Pearson  =  11.60805
    
    Variance function: V(u) = 1                        [Gaussian]
    Link function    : g(u) = u                        [Identity]
    
                                                       AIC             =  5.329272
    Log likelihood   = -194.1830644                    BIC             =  518.5831
    
    ------------------------------------------------------------------------------
                 |                 OIM
             mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0065879   .0006371   -10.34   0.000    -.0078366   -.0053392
         foreign |  -1.650029   1.075994    -1.53   0.125    -3.758939    .4588806
           _cons |    41.6797   2.165547    19.25   0.000     37.43531     45.9241
    ------------------------------------------------------------------------------
    
    . sem (mpg <- weight foreign)   // estimate same model via -sem-
    
    Endogenous variables
    
    Observed:  mpg
    
    Exogenous variables
    
    Observed:  weight foreign
    
    Fitting target model:
    
    Iteration 0:   log likelihood =  -822.2459  
    Iteration 1:   log likelihood =  -822.2459  
    
    Structural equation model                       Number of obs      =        74
    Estimation method  = ml
    Log likelihood     =  -822.2459
    
    ------------------------------------------------------------------------------
                 |                 OIM
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    Structural   |
      mpg <-     |
          weight |  -.0065879   .0006241   -10.56   0.000     -.007811   -.0053647
         foreign |  -1.650029   1.053958    -1.57   0.117    -3.715748    .4156902
           _cons |    41.6797   2.121197    19.65   0.000     37.52223    45.83717
    -------------+----------------------------------------------------------------
       var(e.mpg)|   11.13746   1.830987                       8.06955    15.37173
    ------------------------------------------------------------------------------
    LR test of model vs. saturated: chi2(0)   =      0.00, Prob > chi2 =      .
    
    .
    . // The -glm- and -sem- commands both use MLE; and the
    . // defaults for -glm- are normal error distribution with
    . // identity link function.  So I expected -glm- and -sem-
    . // to yield identical results.  The point estimates are
    . // the same, but the SEs are not.
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)

  • #2
    The sem manual says

    "When you estimate a linear regression by using sem, you obtain the same point estimates as you would with regress and the same standard errors up to a degree-of-freedom adjustment applied by regress."

    You can probably dig through the manual and get the exact formulas if you really really want them. But my guess is it is something like using N-k-1 vs using N-k.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      If you look at the header for the glm output, you'll see that the scale parameter is the same as the Deviance (the sum of squared residuals) divided by the residual degrees of freedom. If you instead use Deviance divided by N as the scale parameter, then the coefficient standard errors will be the same. (The coefficient standard errors, not the coefficients themselves, are scaled by the square root of the scale parameter.)
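The scaling argument can be written out as a short derivation. This is only a sketch for the Gaussian, identity-link case; the symbols RSS (residual sum of squares), k (number of predictors, here 2) and \hat\phi (estimated scale) are notation introduced for illustration, and the numbers are taken from the output in #1.

```latex
\[
\widehat{\operatorname{Var}}(\hat\beta) = \hat\phi\,(X'X)^{-1},
\qquad
\hat\phi_{\text{glm}} = \frac{\mathrm{RSS}}{N-k-1} = \frac{824.1717613}{71} \approx 11.60805,
\qquad
\hat\phi_{\text{sem}} = \frac{\mathrm{RSS}}{N} = \frac{824.1717613}{74} \approx 11.13746
\]
\[
\frac{\mathrm{SE}_{\text{sem}}}{\mathrm{SE}_{\text{glm}}}
= \sqrt{\frac{\hat\phi_{\text{sem}}}{\hat\phi_{\text{glm}}}}
= \sqrt{\frac{N-k-1}{N}}
= \sqrt{\frac{71}{74}} \approx 0.9795
\]
```

The second value of \hat\phi is the var(e.mpg) that sem reports, and the first is the scale parameter in the glm header.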

      Try the following.
      Code:
      quietly glm mpg weight foreign // just to get the residual sum of squares into e(deviance)
      glm mpg weight foreign, scale(`=e(deviance) / e(N)') nolog
      The scale parameter will now be the same as the residual variance estimated by sem (or by mixed with an empty random-effects equation), and you'll now get the same regression coefficient standard errors as from sem (or mixed).
      Code:
      mixed mpg weight foreign, nolog
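The match can also be checked from stored results instead of by reading the two tables. The following is a minimal sketch, meant to be run from a do-file; the scalar names (s_mlvar, s_glm_w, s_glm_f) are made up for illustration, and it assumes that _se[mpg:weight] is how the sem standard error is addressed in your version.

```stata
sysuse auto, clear

* Default fit, only to pick up the residual sum of squares and N
quietly glm mpg weight foreign
scalar s_mlvar = e(deviance)/e(N)      // RSS/N, about 11.137 here

* Refit with the scale fixed at RSS/N and keep the standard errors
quietly glm mpg weight foreign, scale(`=s_mlvar')
scalar s_glm_w = _se[weight]
scalar s_glm_f = _se[foreign]

* Fit by -sem- and show the standard errors side by side
quietly sem (mpg <- weight foreign)
display "weight : " s_glm_w "  vs  " _se[mpg:weight]
display "foreign: " s_glm_f "  vs  " _se[mpg:foreign]
```

If the two columns agree apart from rounding, the only difference between the procedures is the denominator used for the residual variance.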



      • #4
        The formula (if I can retype it correctly) is

        sem se = sqrt((N-k-1)/N) * regression se

        So in this case, with N = 74 observations and k = 2 predictors,

        Code:
        . di sqrt((74-2-1)/74) * .0006371
        .00062405
        
        . di sqrt((74-2-1)/74) * 1.075994
        1.0539577
        
        . di sqrt((74-2-1)/74) * 2.165547
        2.1211966
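The same arithmetic can be done without retyping the rounded standard errors. This is a rough sketch for a do-file, assuming e(df) holds the residual degrees of freedom after glm (N-k-1) and that sem's standard errors are reachable as _se[mpg:...]; the scalar names are invented for the example.

```stata
sysuse auto, clear

* Default -glm- fit: scale = RSS/(N-k-1)
quietly glm mpg weight foreign
scalar s_adj  = sqrt(e(df)/e(N))    // sqrt((N-k-1)/N), assuming e(df) = residual df
scalar s_se_w = _se[weight]
scalar s_se_f = _se[foreign]
scalar s_se_c = _se[_cons]

* -sem- fit of the same model
quietly sem (mpg <- weight foreign)

* Adjusted -glm- SE next to the -sem- SE for each coefficient
display "weight : " s_adj*s_se_w "   sem: " _se[mpg:weight]
display "foreign: " s_adj*s_se_f "   sem: " _se[mpg:foreign]
display "_cons  : " s_adj*s_se_c "   sem: " _se[mpg:_cons]
```

Working from the stored values avoids the small discrepancies that come from multiplying SEs already rounded for display.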
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Thanks very much, Joseph & Richard. That certainly explains the difference I'm seeing. However, I'm still not clear on why the scale parameter is not computed using the same denominator for all three procedures (glm, sem, mixed). But that's a different question.

          Thanks again.
          Cheers,
          Bruce
          --
          Bruce Weaver
          Email: [email protected]
          Version: Stata/MP 18.5 (Windows)
