  • svy: ivregress not showing first stage

    So I'm having a weird issue with data that I unfortunately can't post because it's confidential, and I can't recreate it with Stata's built-in data, so I'm hoping someone may be able to help here.

    Basically I have survey data, and I am trying to run ivregress and get the first-stage results, but they're not printing for some reason.

    Here is the code and what it spits out:


    To Set Survey Data:
    Code:
    svyset [pweight=vallwt0],jkrw(vallwt1 -vallwt30) vce(jackknife) mse dof(29)
    Run a simple ivregress with no issues:
    Code:
    ivregress 2sls farmhhi (acres = assets), first
    
    First-stage regressions
    -----------------------
    
                                                         Number of obs =    82,931
                                                         F(1, 82929)   =  27848.60
                                                         Prob > F      =    0.0000
                                                         R-squared     =    0.2514
                                                         Adj R-squared =    0.2514
                                                         Root MSE      = 3504.7543
    
    ------------------------------------------------------------------------------
           acres | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          assets |   .0002227   1.33e-06   166.88   0.000       .00022    .0002253
           _cons |   423.3562    12.9597    32.67   0.000     397.9552    448.7571
    ------------------------------------------------------------------------------
    
    
    Instrumental variables 2SLS regression            Number of obs   =     82,931
                                                      Wald chi2(1)    =    2667.63
                                                      Prob > chi2     =     0.0000
                                                      R-squared       =          .
                                                      Root MSE        =     1.0e+06
    
    ------------------------------------------------------------------------------
         farmhhi | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           acres |   90.56129   1.753396    51.65   0.000      87.1247    93.99789
           _cons |    18538.1   4106.793     4.51   0.000     10488.94    26587.27
    ------------------------------------------------------------------------------
    Instrumented: acres
     Instruments: assets
    Run svy: ivregress, but the first stage doesn't appear:
    Code:
    . svy: ivregress 2sls farmhhi (acres = assets), first
    (running ivregress on estimation sample)
    
    Jackknife replications (30)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ..............................
    
    Survey: Instrumental variables 2SLS regression
    
    Number of strata = 1                               Number of obs   =    82,931
                                                       Population size = 9,491,003
                                                       Replications    =        30
                                                       Design df       =        29
                                                       F(1, 29)        =     53.61
                                                       Prob > F        =    0.0000
    
    ------------------------------------------------------------------------------
                 |              Jknife *
         farmhhi | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           acres |   64.70674   8.837549     7.32   0.000     46.63193    82.78156
           _cons |  -454.7469   3521.358    -0.13   0.898    -7656.732    6747.238
    ------------------------------------------------------------------------------
    Instrumented: acres
     Instruments: assets
    So there's a suggestion that it's instrumented, and the IV results look different from the OLS results (not shown here), so I assume something is happening under the hood.

    Anyone have any idea why this would happen? Is the svyset not working correctly? Is there any other way to recover the first stage estimates?

  • #2
    What version of Stata do you have? I cannot reproduce this in version 16. In any case, you can replicate the first stage using svy: regress.

    Code:
    webuse nhanes2d, clear
    svyset
    svy: ivregress 2sls highbp (weight= height age) female, first
    svy: regress weight female height age if e(sample)
    Res.:

    Code:
    . svy: ivregress 2sls highbp (weight= height age) female, first
    (running ivregress on estimation sample)
    
    First-stage regression
    -----------------------
    (running regress on estimation sample)
    
    Survey: Linear regression
    
    Number of strata   =        31                Number of obs     =       10,351
    Number of PSUs     =        62                Population size   =  117,157,513
                                                  Design df         =           31
                                                  F(   3,     29)   =      1177.18
                                                  Prob > F          =       0.0000
                                                  R-squared         =       0.2827
    
    ------------------------------------------------------------------------------
                 |             Linearized
          weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          female |  -2.898197   .5888597    -4.92   0.000    -4.099184   -1.697209
          height |   .7405073    .027744    26.69   0.000     .6839229    .7970917
             age |   .1484546   .0116501    12.74   0.000      .124694    .1722153
           _cons |   -57.6088   4.955696   -11.62   0.000      -67.716   -47.50159
    ------------------------------------------------------------------------------
    
    
    Survey: Instrumental variables (2SLS) regression
    
    Number of strata   =        31                Number of obs     =       10,351
    Number of PSUs     =        62                Population size   =  117,157,513
                                                  Design df         =           31
                                                  F(   2,     30)   =        57.60
                                                  Prob > F          =       0.0000
                                                  R-squared         =       0.0892
    
    ------------------------------------------------------------------------------
                 |             Linearized
          highbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |   .0079069   .0011775     6.71   0.000     .0055053    .0103085
          female |  -.0112502   .0168215    -0.67   0.509    -.0455579    .0230576
           _cons |  -.1941135   .0912216    -2.13   0.041    -.3801611   -.0080659
    ------------------------------------------------------------------------------
    Instrumented:  weight
    Instruments:   female height age
    ------------------------------------------------------------------------------
    
    . 
    . svy: regress weight female height age if e(sample)
    (running regress on estimation sample)
    
    Survey: Linear regression
    
    Number of strata   =        31                Number of obs     =       10,351
    Number of PSUs     =        62                Population size   =  117,157,513
                                                  Design df         =           31
                                                  F(   3,     29)   =      1177.18
                                                  Prob > F          =       0.0000
                                                  R-squared         =       0.2827
    
    ------------------------------------------------------------------------------
                 |             Linearized
          weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          female |  -2.898197   .5888597    -4.92   0.000    -4.099184   -1.697209
          height |   .7405073    .027744    26.69   0.000     .6839229    .7970917
             age |   .1484546   .0116501    12.74   0.000      .124694    .1722153
           _cons |   -57.6088   4.955696   -11.62   0.000      -67.716   -47.50159
    ------------------------------------------------------------------------------
    
    . 
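
    Once the first stage has been reproduced with svy: regress, the usual postestimation commands can be run on it. As a minimal sketch (an addition to the example above, not something shown in the original log), test gives a joint adjusted Wald test of the excluded instruments height and age, which is one common way to gauge instrument relevance:

    ```stata
    webuse nhanes2d, clear

    * second stage; this also defines e(sample) for the next command
    svy: ivregress 2sls highbp (weight = height age) female

    * first stage by hand, restricted to the ivregress estimation sample
    svy: regress weight female height age if e(sample)

    * joint test of the excluded instruments (adjusted Wald test after svy)
    test height age
    ```

    The test uses whatever VCE is declared in svyset, so the same lines should carry over to a jackknife or bootstrap design.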


    • #3
      Hi Andrew Musau. Thanks for your response. I'm using Stata 17. I also can't recreate it with your example. It only seems to be happening with my dataset. Is there a reason why my svyset may cause issues, or why this may be happening because of the size of the sample or something else?


      • #4
        I can reproduce the issue now. It appears to be caused by the inclusion of -vce(jackknife)- in the svyset command. Nonetheless, I still recommend the workaround in #2.

        Code:
        *THIS REPLICATES THE ISSUE
        webuse stage5a_jkw, clear
        svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)
        svy: ivregress 2sls x1 (x2= x3), first
        
        *WORKAROUND
        svy: regress x2 x3 if e(sample)
        
        *THIS WORKS
        webuse stage5a_jkw, clear
        svyset [pweight=pw], jkrweight(jkw_*)
        svy: ivregress 2sls x1 (x2= x3), first
        
        *CHECK
        svy: regress x2 x3 if e(sample)
        Res.:

        Code:
        . *THIS REPLICATES THE ISSUE
        
        . 
        . webuse stage5a_jkw, clear
        
        . 
        . svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)
        
              pweight: pw
                  VCE: jackknife
                  MSE: off
            jkrweight: jkw_1 .. jkw_9
          Single unit: missing
             Strata 1: <one>
                 SU 1: <observations>
                FPC 1: <zero>
        
        . 
        . svy: ivregress 2sls x1 (x2= x3), first
        (running ivregress on estimation sample)
        
        Jackknife replications (9)
        ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
        .........
        
        Survey: Instrumental variables (2SLS) regression
        
        Number of strata   =         3                  Number of obs     =     11,039
                                                        Population size   = 529,810.54
                                                        Replications      =          9
                                                        Design df         =          6
                                                        F(   1,      6)   =       0.15
                                                        Prob > F          =     0.7104
        
        ------------------------------------------------------------------------------
                     |              Jackknife
                  x1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x2 |   3.959429   10.16687     0.39   0.710      -20.918    28.83686
               _cons |   -.964963     2.4496    -0.39   0.707    -6.958918    5.028992
        ------------------------------------------------------------------------------
        Instrumented:  x2
        Instruments:   x3
        ------------------------------------------------------------------------------
        
        . 
        . 
        . 
        . *WORKAROUND
        
        . 
        . svy: regress x2 x3 if e(sample)
        (running regress on estimation sample)
        
        Jackknife replications (9)
        ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
        .........
        
        Survey: Linear regression
        
        Number of strata   =         3                  Number of obs     =     11,039
                                                        Population size   = 529,810.54
                                                        Replications      =          9
                                                        Design df         =          6
                                                        F(   1,      6)   =       0.00
                                                        Prob > F          =     0.9480
                                                        R-squared         =     0.0000
        
        ------------------------------------------------------------------------------
                     |              Jackknife
                  x2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x3 |   -.000675   .0099211    -0.07   0.948    -.0249511    .0236011
               _cons |   .2414442    .040907     5.90   0.001     .1413483    .3415401
        ------------------------------------------------------------------------------
        
        . 
        . 
        . 
        . *THIS WORKS
        
        . 
        . webuse stage5a_jkw, clear
        
        . 
        . svyset [pweight=pw], jkrweight(jkw_*)
        
              pweight: pw
                  VCE: linearized
            jkrweight: jkw_1 .. jkw_9
          Single unit: missing
             Strata 1: <one>
                 SU 1: <observations>
                FPC 1: <zero>
        
        . 
        . svy: ivregress 2sls x1 (x2= x3), first
        (running ivregress on estimation sample)
        
        First-stage regression
        -----------------------
        (running regress on estimation sample)
        
        Survey: Linear regression
        
        Number of strata   =         1                  Number of obs     =     11,039
        Number of PSUs     =    11,039                  Population size   = 529,810.54
                                                        Design df         =     11,038
                                                        F(   1,  11038)   =       0.00
                                                        Prob > F          =     0.9462
                                                        R-squared         =     0.0000
        
        ------------------------------------------------------------------------------
                     |             Linearized
                  x2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x3 |   -.000675   .0100034    -0.07   0.946    -.0202835    .0189335
               _cons |   .2414442   .0071964    33.55   0.000     .2273381    .2555504
        ------------------------------------------------------------------------------
        
        
        Survey: Instrumental variables (2SLS) regression
        
        Number of strata   =         1                  Number of obs     =     11,039
        Number of PSUs     =    11,039                  Population size   = 529,810.54
                                                        Design df         =     11,038
                                                        F(   1,  11038)   =       0.00
                                                        Prob > F          =     0.9521
        
        ------------------------------------------------------------------------------
                     |             Linearized
                  x1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x2 |   3.959429   65.85887     0.06   0.952    -125.1357    133.0546
               _cons |   -.964963   15.87823    -0.06   0.952    -32.08914    30.15921
        ------------------------------------------------------------------------------
        Instrumented:  x2
        Instruments:   x3
        ------------------------------------------------------------------------------
        
        . 
        . 
        . 
        . *CHECK
        
        . 
        . svy: regress x2 x3 if e(sample)
        (running regress on estimation sample)
        
        Survey: Linear regression
        
        Number of strata   =         1                  Number of obs     =     11,039
        Number of PSUs     =    11,039                  Population size   = 529,810.54
                                                        Design df         =     11,038
                                                        F(   1,  11038)   =       0.00
                                                        Prob > F          =     0.9462
                                                        R-squared         =     0.0000
        
        ------------------------------------------------------------------------------
                     |             Linearized
                  x2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x3 |   -.000675   .0100034    -0.07   0.946    -.0202835    .0189335
               _cons |   .2414442   .0071964    33.55   0.000     .2273381    .2555504
        ------------------------------------------------------------------------------
        
        .
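
        Since the first stage prints when the VCE is linearized, another option that might be worth trying is to override the VCE for a single command with the svy linearized prefix, leaving svyset untouched. This is an untested sketch: it assumes the display behaves as in the "THIS WORKS" run above. The coefficients are the same as under the jackknife, but the standard errors shown are linearized ones, so it only helps for inspecting the first-stage point estimates.

        ```stata
        webuse stage5a_jkw, clear
        svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)

        * jackknife VCE taken from svyset: first stage is not displayed (as in the log above)
        svy: ivregress 2sls x1 (x2 = x3), first

        * same design, but linearized VCE for this one command only
        svy linearized: ivregress 2sls x1 (x2 = x3), first
        ```

        For jackknife standard errors on the first stage, the svy: regress workaround is still the way to go.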
        Last edited by Andrew Musau; 05 May 2022, 16:03.


        • #5
          Thanks for your response. Weird that the jackknife messes this up! Seems the workaround is the best I can do for now, and maybe it's something that Stata needs to address in a future update!


          • #6
            Originally posted by justin winikoff
            Thanks for your response. Weird that the jackknife messes this up! Seems the workaround is the best I can do for now, and maybe it's something that Stata needs to address in a future update!
            I do not see this as a problem, because the first stage can always be done the way Andrew showed.

            But you might want to write to Stata Technical Support to hear their opinion on the problem, just in case something more sinister is going on behind the scenes.

            Not only the jackknife but also the bootstrap causes the first stage not to be displayed, even when requested. So apparently all resampling-based variance methods cause this problem. See below.

            Code:
            . use stage5a_jkw, clear
            
            . svyset [pweight=pw], vce(linearized)
            
            Sampling weights: pw
                         VCE: linearized
                 Single unit: missing
                    Strata 1: <one>
             Sampling unit 1: <observations>
                       FPC 1: <zero>
            
            . svy: ivregress 2sls x1 (x2= x3), first
            (running ivregress on estimation sample)
            
            First-stage regression
            -----------------------
            (running regress on estimation sample)
            
            Survey: Linear regression
            
            Number of strata =      1                         Number of obs   =     11,039
            Number of PSUs   = 11,039                         Population size = 529,810.54
                                                              Design df       =     11,038
                                                              F(1, 11038)     =       0.00
                                                              Prob > F        =     0.9462
                                                              R-squared       =     0.0000
            
            ------------------------------------------------------------------------------
                         |             Linearized
                      x2 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                      x3 |   -.000675   .0100034    -0.07   0.946    -.0202835    .0189335
                   _cons |   .2414442   .0071964    33.55   0.000     .2273381    .2555504
            ------------------------------------------------------------------------------
            
            
            Survey: Instrumental variables 2SLS regression
            
            Number of strata =      1                         Number of obs   =     11,039
            Number of PSUs   = 11,039                         Population size = 529,810.54
                                                              Design df       =     11,038
                                                              F(1, 11038)     =       0.00
                                                              Prob > F        =     0.9521
            
            ------------------------------------------------------------------------------
                         |             Linearized
                      x1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                      x2 |   3.959429   65.85887     0.06   0.952    -125.1357    133.0546
                   _cons |   -.964963   15.87823    -0.06   0.952    -32.08914    30.15921
            ------------------------------------------------------------------------------
            Instrumented: x2
             Instruments: x3
            
            . svyset [pweight=pw], bsrweight(jkw_*) vce(bootstrap)
            
             Sampling weights: pw
                          VCE: bootstrap
                          MSE: off
            Bootstrap weights: jkw_1 .. jkw_9
                  Single unit: missing
                     Strata 1: <one>
              Sampling unit 1: <observations>
                        FPC 1: <zero>
            
            . svy: ivregress 2sls x1 (x2= x3), first
            (running ivregress on estimation sample)
            
            Bootstrap replications (9)
            ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
            .........
            
            Survey: Instrumental variables 2SLS regression    Number of obs   =     11,039
                                                              Population size = 529,810.54
                                                              Replications    =          9
                                                              Wald chi2(1)    =       0.67
                                                              Prob > chi2     =     0.4119
            
            ------------------------------------------------------------------------------
                         |   Observed   Bootstrap                         Normal-based
                      x1 | coefficient  std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                      x2 |   3.959429   4.824865     0.82   0.412    -5.497133    13.41599
                   _cons |   -.964963   1.142866    -0.84   0.398    -3.204938    1.275012
            ------------------------------------------------------------------------------
            Instrumented: x2
             Instruments: x3
            
            .
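
            One way to check that only the display is affected is to redo 2SLS by hand and compare point estimates. The sketch below is an addition, not part of the log above; the names b_iv and x2hat are introduced here for illustration. The two-step coefficient on x2hat should agree with the ivregress coefficient on x2 up to numerical precision, but the standard errors from the hand-made second stage are not valid, because they ignore that x2hat is itself estimated.

            ```stata
            webuse stage5a_jkw, clear
            svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)

            * 2SLS in one step; keep the coefficient on the endogenous regressor
            svy: ivregress 2sls x1 (x2 = x3)
            scalar b_iv = _b[x2]

            * first stage by hand on the same sample, then fitted values
            svy: regress x2 x3 if e(sample)
            predict double x2hat if e(sample), xb

            * second stage by hand: compare the point estimate only
            svy: regress x1 x2hat
            display "2SLS: " b_iv "   two-step: " _b[x2hat]
            ```

            If the two numbers differ by more than rounding, that would be worth including in a report to Technical Support.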


            • #7
              Does adding the noisily option address the problem?

              Code:
              webuse stage5a_jkw, clear
              svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)
              svy, noisily: ivregress 2sls x1 (x2= x3), first
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              WWW: https://academicweb.nd.edu/~rwilliam/


              • #8
                Originally posted by Richard Williams
                Does adding the noisily option address the problem?

                Code:
                webuse stage5a_jkw, clear
                svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)
                svy, noisily: ivregress 2sls x1 (x2= x3), first
                A few days late, but unfortunately this did not work. It displayed all of the jackknife replications (along with the first stage), but the final estimation results showed only the second stage.
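
                If the goal is simply to see both stages together in the final results, one possibility is to store the two sets of estimates and tabulate them side by side. This is a sketch built on the workaround from #2, using the example data from #4; the stored names first and second are arbitrary labels chosen here.

                ```stata
                webuse stage5a_jkw, clear
                svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)

                * second stage (first stage is not displayed under the jackknife)
                quietly svy: ivregress 2sls x1 (x2 = x3)
                estimates store second

                * first stage on the same estimation sample
                quietly svy: regress x2 x3 if e(sample)
                estimates store first

                * both stages in one table, with jackknife standard errors
                estimates table first second, b(%9.4f) se(%9.4f) stats(N)
                ```

                The ivregress results are stored before the first stage is run, because svy: regress replaces e(sample).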
