  • Extraction of submatrix containing probit estimates

    I want to extract submatrices of the coefficient vector and the var/cov matrix that are returned after a probit estimation with factor and interaction variables. The matrices e(b) and e(V) contain all results, including those for the base levels, which are all zero, of course. I want a vector and a matrix containing only the coefficients that appear in the output table, i.e., those whose column and row names do not contain "*b." Is there an easy way to do this without explicitly naming every row/column I want to keep?

  • #2
    Look at esttab, by Ben Jann, published in the Stata Journal. Its nobaselevels option drops the base levels from the returned coefficient matrix:

    Code:
    . clear
    . webuse lbw
    (Hosmer & Lemeshow data)
    
    . probit low age i.race
    
    Iteration 0:   log likelihood =   -117.336  
    Iteration 1:   log likelihood = -114.03311  
    Iteration 2:   log likelihood = -114.02581  
    Iteration 3:   log likelihood = -114.02581  
    
    Probit regression                               Number of obs     =        189
                                                    LR chi2(3)        =       6.62
                                                    Prob > chi2       =     0.0850
    Log likelihood = -114.02581                     Pseudo R2         =     0.0282
    
    ------------------------------------------------------------------------------
             low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
                 |
            race |
          black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
          other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
                 |
           _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
    ------------------------------------------------------------------------------
    
    . mat list e(b)
    
    e(b)[1,5]
               low:        low:        low:        low:        low:
                            1b.          2.          3.            
               age        race        race        race       _cons
    y1  -.02452906           0   .45439151    .3427787  -.11994693
    
    . mat list e(V)
    
    symmetric e(V)[5,5]
                        low:        low:        low:        low:        low:
                                     1b.          2.          3.            
                        age        race        race        race       _cons
        low:age   .00038125
    low:1b.race           0           0
     low:2.race   .00087363           0   .08287568
     low:3.race   .00059942           0   .02124226    .0453222
      low:_cons   -.0090732           0  -.04065979  -.03413413   .23579792
    
    . qui esttab, nobaselevels
    
    . return list
    
    scalars:
                r(nmodels) =  1
                  r(ccols) =  3
    
    macros:
                  r(names) : "."
             r(m1_depname) : "low"
                r(cmdline) : "estout , cells(b(fmt(a3) star) t(fmt(2) par("{ralign @modelwidth:{txt:(}" "{txt:)}}"))) stats(.."
    
    matrices:
                  r(coefs) :  4 x 3
                  r(stats) :  1 x 1
    
    . mat list r(coefs)
    
    r(coefs)[4,3]
                    active:     active:     active:
                         b           t           p
       low:age  -.02452906    -1.25625   .20902535
    low:2.race   .45439151   1.5783985   .11447409
    low:3.race    .3427787   1.6101203   .10737159
     low:_cons  -.11994693  -.24701264   .80489844
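
    If only the coefficient vector is needed, it can be copied out of r(coefs) right after the esttab call above. This is a minimal sketch, assuming r(coefs) is still in memory and that its first column holds b, as in the listing. The matrix names C and b_kept are made up for illustration. It does not recover the variance matrix.

    ```stata
    * copy the returned matrix before another r-class command overwrites r()
    matrix C = r(coefs)

    * column 1 holds the point estimates (column "b" in the listing above)
    matrix b_kept = C[1..., 1]

    * transpose so the result is a row vector, like e(b)
    matrix b_kept = b_kept'
    matrix list b_kept
    ```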


    • #3
      I do not see the variance matrix in the -esttab- return list...

      In any case, another way to go is to skip the factor-variable notation and generate the dummies manually.

      Code:
      .  webuse lbw, clear
      (Hosmer & Lemeshow data)
      
      . tab race, gen(rc)
      
             race |      Freq.     Percent        Cum.
      ------------+-----------------------------------
            white |         96       50.79       50.79
            black |         26       13.76       64.55
            other |         67       35.45      100.00
      ------------+-----------------------------------
            Total |        189      100.00
      
      . probit low age rc2 rc3, nolog
      
      Probit regression                               Number of obs     =        189
                                                      LR chi2(3)        =       6.62
                                                      Prob > chi2       =     0.0850
      Log likelihood = -114.02581                     Pseudo R2         =     0.0282
      
      ------------------------------------------------------------------------------
               low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
               rc2 |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
               rc3 |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
             _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
      ------------------------------------------------------------------------------
      
      . matlist e(V)
      
                   | low                                        
                   |       age        rc2        rc3      _cons 
      -------------+--------------------------------------------
      low          |                                            
               age |  .0003812                                  
               rc2 |  .0008736   .0828757                       
               rc3 |  .0005994   .0212423   .0453222            
             _cons | -.0090732  -.0406598  -.0341341   .2357979 
      
      . matlist e(b)
      
                   | low                                        
                   |       age        rc2        rc3      _cons 
      -------------+--------------------------------------------
                y1 | -.0245291   .4543915   .3427787  -.1199469


      • #4
        Many thanks Andrew, that's exactly what I need! (Joro, to use the 'margins' facility you need to tell Stata what type of variables you have using factor syntax in the regression equation. You can install esttab by typing search esttab in Stata and going to the link that comes up.)


        • #5
          I remember that William Lisowski replied to a similar question to which I had contributed (see here). To omit the base levels, you can write out the factor levels explicitly in the probit command and mark the first one with the bn. operator. For my example above:

          Code:
          . probit low age i.race
          
          Iteration 0:   log likelihood =   -117.336  
          Iteration 1:   log likelihood = -114.03311  
          Iteration 2:   log likelihood = -114.02581  
          Iteration 3:   log likelihood = -114.02581  
          
          Probit regression                               Number of obs     =        189
                                                          LR chi2(3)        =       6.62
                                                          Prob > chi2       =     0.0850
          Log likelihood = -114.02581                     Pseudo R2         =     0.0282
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
                       |
                  race |
                black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
                other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
                       |
                 _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
          ------------------------------------------------------------------------------
          
          . mat list e(V)
          
          symmetric e(V)[5,5]
                              low:        low:        low:        low:        low:
                                           1b.          2.          3.            
                              age        race        race        race       _cons
              low:age   .00038125
          low:1b.race           0           0
           low:2.race   .00087363           0   .08287568
           low:3.race   .00059942           0   .02124226    .0453222
            low:_cons   -.0090732           0  -.04065979  -.03413413   .23579792
          
          
          
          . probit low age 2bn.race 3.race
          
          Iteration 0:   log likelihood =   -117.336  
          Iteration 1:   log likelihood = -114.03311  
          Iteration 2:   log likelihood = -114.02581  
          Iteration 3:   log likelihood = -114.02581  
          
          Probit regression                               Number of obs     =        189
                                                          LR chi2(3)        =       6.62
                                                          Prob > chi2       =     0.0850
          Log likelihood = -114.02581                     Pseudo R2         =     0.0282
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
                       |
                  race |
                black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
                other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
                       |
                 _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
          ------------------------------------------------------------------------------
          
          . mat list e(V)
          
          symmetric e(V)[4,4]
                             low:        low:        low:        low:
                                           2.          3.            
                             age        race        race       _cons
             low:age   .00038125
          low:2.race   .00087363   .08287568
          low:3.race   .00059942   .02124226    .0453222
           low:_cons   -.0090732  -.04065979  -.03413413   .23579792
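
          Writing out the levels by hand gets tedious with many factors and interactions. An alternative that avoids naming rows and columns is to let Stata flag the base and omitted columns and then select the rest in Mata. The following is only a sketch under stated assumptions: it relies on the programmer's utility _ms_omit_info returning r(omit) with a 1 for every base or omitted column, and the names b, V, omit, b_kept and V_kept are made up for illustration.

          ```stata
          webuse lbw, clear
          probit low age i.race, nolog

          * work on copies of the estimation results
          matrix b = e(b)
          matrix V = e(V)

          * assumed: r(omit) is a row vector, 1 = base/omitted column, 0 = kept
          _ms_omit_info b
          matrix omit = r(omit)

          mata:
          b      = st_matrix("b")
          V      = st_matrix("V")
          stripe = st_matrixcolstripe("b")
          // positions of the columns that are neither base nor omitted
          keep   = select(1..cols(b), st_matrix("omit") :== 0)
          // reduced coefficient vector, with the original names restored
          st_matrix("b_kept", b[., keep])
          st_matrixrowstripe("b_kept", st_matrixrowstripe("b"))
          st_matrixcolstripe("b_kept", stripe[keep, .])
          // reduced var/cov matrix, same rows and columns
          st_matrix("V_kept", V[keep, keep])
          st_matrixrowstripe("V_kept", stripe[keep, .])
          st_matrixcolstripe("V_kept", stripe[keep, .])
          end

          matrix list b_kept
          matrix list V_kept
          ```

          If the utility behaves as assumed, V_kept should match the 4 x 4 e(V) listed above, and the same code should carry over to models with interactions, since the selection depends only on the flags and not on the variable names.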



          • #6
            Originally posted by Daniella Acker
            Many thanks Andrew, that's exactly what I need! (Joro, to use the 'margins' facility you need to tell Stata what type of variables you have using factor syntax in the regression equation. You can install esttab by typing search esttab in Stata and going to the link that comes up.)
            A problem solved is a problem solved, so all's well that ends well.

            However, your initial question does not even mention the word "margins". Here it is: "I want to extract submatrices of the coefficient vector and the var/cov matrix which are returned following a probit estimation in which I have factor and interaction variables. The matrices e(b) and e(V) contain all results, including those relating to the base variables, which are all zero, of course. I want a vector and a matrix which contain only the coefficients which appear in the output table, ie those with column and row names which do not contain "*b." Is there an easy way to do this without having to explicitly name every row/column which I want to keep?"

            -esttab-, as good as it might be for producing LaTeX and Word tables from estimates, does not deal with the variance matrix of the estimates. And you explicitly wrote in your original post: "I want to extract submatrices of [..] the var/cov matrix".
