
  • xtivreg, re first

    Hello.

    I have a question about the first-stage regression in xtivreg, re. I assumed it should be possible to replicate it using xtreg, re, but I cannot reproduce the first-stage results that xtivreg, re reports when I add the option first. For instance, using Stata's example:
    Code:
    . use https://www.stata-press.com/data/r16/nlswork
    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
    
    . xtivreg ln_w age c.age#c.age not_smsa 2.race (tenure = union birth south), re first
    
    First-stage G2SLS regression
                                                     Number of obs    =     19,007
                                                     Wald chi(7)      =       5185
                                                     Prob > chi2      =     0.0000
    
    ------------------------------------------------------------------------------
          tenure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0859978   .0348599     2.47   0.014     .0176737    .1543219
                 |
     c.age#c.age |   .0032321   .0005594     5.78   0.000     .0021356    .0043285
                 |
        not_smsa |  -.0141086   .0736087    -0.19   0.848    -.1583791    .1301618
                 |
            race |
          black  |    .316991   .0831899     3.81   0.000     .1539419    .4800402
           union |   .9664885   .0645912    14.96   0.000     .8398921    1.093085
        birth_yr |   .1426261   .0122313    11.66   0.000     .1186531     .166599
           south |  -.2031909   .0712932    -2.85   0.004    -.3429231   -.0634587
           _cons |  -9.546047   .7705592   -12.39   0.000    -11.05632   -8.035779
    ------------------------------------------------------------------------------
    
    G2SLS random-effects IV regression              Number of obs     =     19,007
    Group variable: idcode                          Number of groups  =      4,134
    
    R-sq:                                           Obs per group:
         within  = 0.0664                                         min =          1
         between = 0.2098                                         avg =        4.6
         overall = 0.1463                                         max =         12
    
                                                    Wald chi2(5)      =    1446.37
    corr(u_i, X)       = 0 (assumed)                Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          tenure |   .1391798   .0078756    17.67   0.000      .123744    .1546157
             age |   .0279649   .0054182     5.16   0.000     .0173454    .0385843
                 |
     c.age#c.age |  -.0008357   .0000871    -9.60   0.000    -.0010063    -.000665
                 |
        not_smsa |  -.2235103   .0111371   -20.07   0.000    -.2453386   -.2016821
                 |
            race |
          black  |  -.2078613   .0125803   -16.52   0.000    -.2325183   -.1832044
           _cons |   1.337684   .0844988    15.83   0.000     1.172069    1.503299
    -------------+----------------------------------------------------------------
         sigma_u |  .36582493
         sigma_e |  .63031479
             rho |  .25197078   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Instrumented:   tenure
    Instruments:    age c.age#c.age not_smsa 2.race union birth_yr south
    ------------------------------------------------------------------------------
    However, if I estimate the first stage manually using xtreg, re, it does not replicate the first stage in xtivreg, re:
    Code:
    . xtreg tenure age c.age#c.age not_smsa 2.race union birth south, re
    
    Random-effects GLS regression                   Number of obs     =     19,007
    Group variable: idcode                          Number of groups  =      4,134
    
    R-sq:                                           Obs per group:
         within  = 0.3001                                         min =          1
         between = 0.0677                                         avg =        4.6
         overall = 0.1409                                         max =         12
    
                                                    Wald chi2(7)      =    6261.51
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
          tenure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0798705   .0327708     2.44   0.015     .0156408    .1441001
                 |
     c.age#c.age |   .0035999   .0005253     6.85   0.000     .0025704    .0046295
                 |
        not_smsa |  -.1027399   .0828931    -1.24   0.215    -.2652074    .0597276
                 |
            race |
          black  |   .3394806   .1040601     3.26   0.001     .1355266    .5434347
           union |   .7374317    .063774    11.56   0.000      .612437    .8624265
        birth_yr |   .1611627   .0150945    10.68   0.000     .1315781    .1907473
           south |    -.23614    .081409    -2.90   0.004    -.3956987   -.0765813
           _cons |  -10.66948   .8674532   -12.30   0.000    -12.36966   -8.969308
    -------------+----------------------------------------------------------------
         sigma_u |  2.4434475
         sigma_e |  2.5869529
             rho |  .47149556   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Yet, the replication is possible with the fixed-effects models:
    Code:
    . xtivreg ln_w age c.age#c.age not_smsa 2.race (tenure = union birth south), fe first
    
    First-stage within regression
    
    Fixed-effects (within) regression               Number of obs     =     19,007
    Group variable: idcode                          Number of groups  =      4,134
    
    R-sq:                                           Obs per group:
         within  = 0.3019                                         min =          1
         between = 0.0578                                         avg =        4.6
         overall = 0.1289                                         max =         12
    
                                                    F(5,14868)        =    1285.83
    corr(u_i, Xb)  = -0.1871                        Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
          tenure |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0863031   .0343238     2.51   0.012     .0190241    .1535821
                 |
     c.age#c.age |   .0039115    .000549     7.12   0.000     .0028353    .0049876
                 |
        not_smsa |  -.3552488   .1269318    -2.80   0.005    -.6040507   -.1064468
                 |
            race |
          black  |          0  (omitted)
           union |   .3896861   .0706568     5.52   0.000     .2511901    .5281821
        birth_yr |          0  (omitted)
           south |  -.4296172   .1349122    -3.18   0.001    -.6940618   -.1651726
           _cons |  -2.554764   .5293374    -4.83   0.000     -3.59233   -1.517197
    -------------+----------------------------------------------------------------
         sigma_u |  3.1334845
         sigma_e |  2.5869529
             rho |  .59467598   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4133, 14868) = 6.40                 Prob > F = 0.0000
    
    Fixed-effects (within) IV regression            Number of obs     =     19,007
    Group variable: idcode                          Number of groups  =      4,134
    
    R-sq:                                           Obs per group:
         within  =      .                                         min =          1
         between = 0.1304                                         avg =        4.6
         overall = 0.0897                                         max =         12
    
                                                    Wald chi2(4)      =  147926.58
    corr(u_i, Xb)  = -0.6843                        Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          tenure |   .2403531   .0373419     6.44   0.000     .1671643    .3135419
             age |   .0118437   .0090032     1.32   0.188    -.0058023    .0294897
                 |
     c.age#c.age |  -.0012145   .0001968    -6.17   0.000    -.0016003   -.0008286
                 |
        not_smsa |  -.0167178   .0339236    -0.49   0.622    -.0832069    .0497713
                 |
            race |
          black  |          0  (omitted)
           _cons |   1.678287   .1626657    10.32   0.000     1.359468    1.997106
    -------------+----------------------------------------------------------------
         sigma_u |  .70661941
         sigma_e |  .63029359
             rho |  .55690561   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F  test that all u_i=0:     F(4133,14869) =     1.36      Prob > F    = 0.0000
    ------------------------------------------------------------------------------
    Instrumented:   tenure
    Instruments:    age c.age#c.age not_smsa 2.race union birth_yr south
    ------------------------------------------------------------------------------
    
    . xtreg tenure age c.age#c.age not_smsa 2.race union birth south, fe
    note: 2.race omitted because of collinearity
    note: birth_yr omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =     19,007
    Group variable: idcode                          Number of groups  =      4,134
    
    R-sq:                                           Obs per group:
         within  = 0.3019                                         min =          1
         between = 0.0578                                         avg =        4.6
         overall = 0.1289                                         max =         12
    
                                                    F(5,14868)        =    1285.83
    corr(u_i, Xb)  = -0.1871                        Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
          tenure |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0863031   .0343238     2.51   0.012     .0190241    .1535821
                 |
     c.age#c.age |   .0039115    .000549     7.12   0.000     .0028353    .0049876
                 |
        not_smsa |  -.3552488   .1269318    -2.80   0.005    -.6040507   -.1064468
                 |
            race |
          black  |          0  (omitted)
           union |   .3896861   .0706568     5.52   0.000     .2511901    .5281821
        birth_yr |          0  (omitted)
           south |  -.4296172   .1349122    -3.18   0.001    -.6940618   -.1651726
           _cons |  -2.554764   .5293374    -4.83   0.000     -3.59233   -1.517197
    -------------+----------------------------------------------------------------
         sigma_u |  3.1334845
         sigma_e |  2.5869529
             rho |  .59467598   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4133, 14868) = 6.40                 Prob > F = 0.0000
    Could you please tell me what exactly is going on in the first stage of xtivreg, re, and whether it is possible to replicate it somehow?

  • #2
    Read the xtivreg manual on how the first stage is estimated. You need to construct GLS instruments from the exogenous and instrumental variables. Also, contrast G2SLS with Baltagi's EC2SLS which uses a different method to construct the GLS instruments.

    Code:
    xtivreg ln_w age c.age#c.age not_smsa 2.race (tenure = union birth south), re first ec2sls



    • #3
      Originally posted by Andrew Musau
      Read the xtivreg manual on how the first stage is estimated. You need to construct GLS instruments from the exogenous and instrumental variables. Also, contrast G2SLS with Baltagi's EC2SLS which uses a different method to construct the GLS instruments.

      Code:
      xtivreg ln_w age c.age#c.age not_smsa 2.race (tenure = union birth south), re first ec2sls
      Thank you, Andrew. It is clearer now. I see that the GLS transform is x(it) - theta(it)*xbar(i). Is there Stata syntax for obtaining the transformed variables, so that I can derive the first stage of xtivreg, re manually?
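
      For reference, the transform mentioned above can be written out as below. This is the standard random-effects GLS weight; the symbols (T_i for the number of observations on panel i, and sigma_u, sigma_e as labelled in the Stata output) are notation assumed here rather than copied from the manual.

      ```latex
      x^{*}_{it} = x_{it} - \hat{\theta}_{it}\,\bar{x}_{i},
      \qquad
      \hat{\theta}_{it} = 1 - \sqrt{\frac{\hat{\sigma}_{e}^{2}}{T_{i}\,\hat{\sigma}_{u}^{2} + \hat{\sigma}_{e}^{2}}}
      ```

      Because T_i differs across panels in nlswork, theta is not a single number here; it varies with the number of observations per idcode.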



      • #4
        You should be able to see how Stata does it by looking at

        Code:
        viewsource xtivreg.ado

        But you will need to filter out all the other code.
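
        As a rough starting point, the transform could also be built by hand along the following lines. This is only a sketch, not the code in xtivreg.ado. It assumes that theta is computed from the sigma_u and sigma_e that xtivreg, re reports for the ln_wage equation (not from the variance components of a separate xtreg tenure ..., re fit, which would explain why that regression does not match), and that the constant is transformed along with everything else. The names Ti, theta, age2, black, one, m_* and t_* are made up for illustration.

        ```stata
        use https://www.stata-press.com/data/r16/nlswork, clear

        * Fit the G2SLS model once to obtain its variance components
        xtivreg ln_wage age c.age#c.age not_smsa 2.race (tenure = union birth_yr south), re
        scalar s_u = e(sigma_u)
        scalar s_e = e(sigma_e)
        keep if e(sample)

        * Panel-specific GLS weight (assumed formula; T_i = obs per idcode in the sample)
        bysort idcode: generate Ti = _N
        generate double theta = 1 - sqrt(s_e^2 / (Ti*s_u^2 + s_e^2))

        * Plain variables for the factor-variable terms and the constant
        generate double age2 = age^2
        generate byte black = (race == 2)
        generate byte one = 1

        * Quasi-demean every variable: x - theta * (panel mean of x)
        foreach v of varlist tenure age age2 not_smsa black union birth_yr south one {
            bysort idcode: egen double m_`v' = mean(`v')
            generate double t_`v' = `v' - theta*m_`v'
        }

        * OLS of transformed tenure on the transformed instruments and constant
        regress t_tenure t_age t_age2 t_not_smsa t_black t_union t_birth_yr t_south t_one, noconstant
        ```

        If the assumptions hold, the coefficients should line up with the first-stage table, with t_one playing the role of _cons. The standard errors and test statistics may still differ, since regress applies its own degrees-of-freedom adjustment and reports t rather than z.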



        • #5
          Thank you. A related question: suppose I wanted to use the control function approach instead of 2SLS (for example, because of interaction terms involving the endogenous variable), as in Jeffrey Wooldridge's 2015 paper in the Journal of Human Resources. Would it make sense to estimate the first-stage regression by pooled OLS and then include the residuals in the second stage?
          Code:
          reg tenure age c.age#c.age not_smsa 2.race union birth south
          predict cf_tenure, res
          xtreg ln_w age c.age#c.age not_smsa 2.race tenure c.tenure#c.not_smsa cf_tenure, re
          Or should the first stage in that case also be a random-effects estimation?
          Code:
          xtreg tenure age c.age#c.age not_smsa 2.race union birth south, re
          predict cf_tenure, e
          xtreg ln_w age c.age#c.age not_smsa 2.race tenure c.tenure#c.not_smsa cf_tenure, re
          I know that the standard errors need to be bootstrapped, but I would like to know which approach is more correct conceptually.



          • #6
            The Wooldridge JHR article dealt with cross-sectional models. I have not seen an application of what you describe, and I am no expert on this, so I would start a new thread and ask about that specifically. However, if you have an endogenous regressor and also want to account for heterogeneity in panel data, then xtivreg is the right command. It appears you want to replicate, step by step, what the command does for some reason.
