Extraction of submatrix containing probit estimates

Daniella Acker

Join Date: Jan 2019

Posts: 2
#1

05 Jan 2019, 02:02

I want to extract submatrices of the coefficient vector and the var/cov matrix which are returned following a probit estimation in which I have factor and interaction variables. The matrices e(b) and e(V) contain all results, including those relating to the base variables, which are all zero, of course. I want a vector and a matrix which contain only the coefficients which appear in the output table, i.e., those with column and row names which do not contain "*b." Is there an easy way to do this without having to explicitly name every row/column which I want to keep?

Andrew Musau

Join Date: Oct 2014
Posts: 10219

05 Jan 2019, 04:31

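Look at esttab, part of the estout package by Ben Jann (Stata Journal; also available from SSC). Its nobaselevels option drops the base levels, and the coefficients are then returned in r(coefs).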

Code:

. clear
. webuse lbw
(Hosmer & Lemeshow data)

. probit low age i.race

Iteration 0:   log likelihood =   -117.336  
Iteration 1:   log likelihood = -114.03311  
Iteration 2:   log likelihood = -114.02581  
Iteration 3:   log likelihood = -114.02581  

Probit regression                               Number of obs     =        189
                                                LR chi2(3)        =       6.62
                                                Prob > chi2       =     0.0850
Log likelihood = -114.02581                     Pseudo R2         =     0.0282

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
             |
        race |
      black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
      other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
             |
       _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
------------------------------------------------------------------------------

. mat list e(b)

e(b)[1,5]
           low:        low:        low:        low:        low:
                        1b.          2.          3.            
           age        race        race        race       _cons
y1  -.02452906           0   .45439151    .3427787  -.11994693

. mat list e(V)

symmetric e(V)[5,5]
                    low:        low:        low:        low:        low:
                                 1b.          2.          3.            
                    age        race        race        race       _cons
    low:age   .00038125
low:1b.race           0           0
 low:2.race   .00087363           0   .08287568
 low:3.race   .00059942           0   .02124226    .0453222
  low:_cons   -.0090732           0  -.04065979  -.03413413   .23579792

. qui esttab, nobaselevels

. return list

scalars:
            r(nmodels) =  1
              r(ccols) =  3

macros:
              r(names) : "."
         r(m1_depname) : "low"
            r(cmdline) : "estout , cells(b(fmt(a3) star) t(fmt(2) par("{ralign @modelwidth:{txt:(}" "{txt:)}}"))) stats(.."

matrices:
              r(coefs) :  4 x 3
              r(stats) :  1 x 1

. mat list r(coefs)

r(coefs)[4,3]
                active:     active:     active:
                     b           t           p
   low:age  -.02452906    -1.25625   .20902535
low:2.race   .45439151   1.5783985   .11447409
low:3.race    .3427787   1.6101203   .10737159
 low:_cons  -.11994693  -.24701264   .80489844


Joro Kolev

Join Date: Aug 2018
Posts: 3050

05 Jan 2019, 04:56

I do not see the variance matrix in the -esttab- return list...

In any case, another way to go is to skip the factor-variable notation and generate the dummies manually.

Code:

. webuse lbw, clear
(Hosmer & Lemeshow data)

. tab race, gen(rc)

       race |      Freq.     Percent        Cum.
------------+-----------------------------------
      white |         96       50.79       50.79
      black |         26       13.76       64.55
      other |         67       35.45      100.00
------------+-----------------------------------
      Total |        189      100.00

. probit low age rc2 rc3, nolog

Probit regression                               Number of obs     =        189
                                                LR chi2(3)        =       6.62
                                                Prob > chi2       =     0.0850
Log likelihood = -114.02581                     Pseudo R2         =     0.0282

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
         rc2 |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
         rc3 |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
       _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
------------------------------------------------------------------------------

. matlist e(V)

             | low                                        
             |       age        rc2        rc3      _cons 
-------------+--------------------------------------------
low          |                                            
         age |  .0003812                                  
         rc2 |  .0008736   .0828757                       
         rc3 |  .0005994   .0212423   .0453222            
       _cons | -.0090732  -.0406598  -.0341341   .2357979 

. matlist e(b)

             | low                                        
             |       age        rc2        rc3      _cons 
-------------+--------------------------------------------
          y1 | -.0245291   .4543915   .3427787  -.1199469


Daniella Acker

Join Date: Jan 2019

Posts: 2
#4

05 Jan 2019, 05:34

Many thanks Andrew, that's exactly what I need! (Joro, to use the 'margins' facility you need to tell Stata what type of variables you have using factor syntax in the regression equation. You can install esttab by typing search esttab in Stata and going to the link that comes up.)

Andrew Musau

Join Date: Oct 2014
Posts: 10219

05 Jan 2019, 06:06

I remember that William Lisowski replied to a similar question to which I had contributed (see here). To keep the base levels out of e(b) and e(V) altogether, you can write out the factor levels explicitly in the probit command, as in the second probit command below. For my example above:

Code:

. probit low age i.race

Iteration 0:   log likelihood =   -117.336  
Iteration 1:   log likelihood = -114.03311  
Iteration 2:   log likelihood = -114.02581  
Iteration 3:   log likelihood = -114.02581  

Probit regression                               Number of obs     =        189
                                                LR chi2(3)        =       6.62
                                                Prob > chi2       =     0.0850
Log likelihood = -114.02581                     Pseudo R2         =     0.0282

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
             |
        race |
      black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
      other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
             |
       _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
------------------------------------------------------------------------------

. mat list e(V)

symmetric e(V)[5,5]
                    low:        low:        low:        low:        low:
                                 1b.          2.          3.            
                    age        race        race        race       _cons
    low:age   .00038125
low:1b.race           0           0
 low:2.race   .00087363           0   .08287568
 low:3.race   .00059942           0   .02124226    .0453222
  low:_cons   -.0090732           0  -.04065979  -.03413413   .23579792



. probit low age 2bn.race 3.race

Iteration 0:   log likelihood =   -117.336  
Iteration 1:   log likelihood = -114.03311  
Iteration 2:   log likelihood = -114.02581  
Iteration 3:   log likelihood = -114.02581  

Probit regression                               Number of obs     =        189
                                                LR chi2(3)        =       6.62
                                                Prob > chi2       =     0.0850
Log likelihood = -114.02581                     Pseudo R2         =     0.0282

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0245291   .0195256    -1.26   0.209    -.0627986    .0137405
             |
        race |
      black  |   .4543915   .2878814     1.58   0.114    -.1098456    1.018629
      other  |   .3427787   .2128901     1.61   0.107    -.0744783    .7600357
             |
       _cons |  -.1199469   .4855903    -0.25   0.805    -1.071686    .8317925
------------------------------------------------------------------------------

. mat list e(V)

symmetric e(V)[4,4]
                   low:        low:        low:        low:
                                 2.          3.            
                   age        race        race       _cons
   low:age   .00038125
low:2.race   .00087363   .08287568
low:3.race   .00059942   .02124226    .0453222
 low:_cons   -.0090732  -.04065979  -.03413413   .23579792
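
As a quick check that the two specifications above describe the same fit, the sketch below stores e(b) from each and compares them. The matrix and scalar names (b_full, b_nobase, ll_full, ll_nobase) are made up for illustration; only the column count changes, which is consistent with the 5 x 5 and 4 x 4 e(V) listings above.

```stata
webuse lbw, clear

* Default factor notation: e(b) carries a zero column for the base level 1b.race
quietly probit low age i.race
matrix b_full = e(b)
scalar ll_full = e(ll)

* Levels written out with bn.: no base-level column is stored
quietly probit low age 2bn.race 3.race
matrix b_nobase = e(b)
scalar ll_nobase = e(ll)

* Compare the number of columns, the log likelihoods, and the 2.race coefficient
display colsof(b_full) " columns versus " colsof(b_nobase)
display reldif(ll_full, ll_nobase)
display reldif(b_full[1,3], b_nobase[1,2])
```

Both reldif() values should be essentially zero if the models coincide.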


Joro Kolev

Join Date: Aug 2018

Posts: 3050
#6

05 Jan 2019, 06:17

Originally posted by Daniella Acker

Many thanks Andrew, that's exactly what I need! (Joro, to use the 'margins' facility you need to tell Stata what type of variables you have using factor syntax in the regression equation. You can install esttab by typing search esttab in Stata and going to the link that comes up.)

A problem solved is a problem solved, so all's well that ends well.

However, your initial question does not even mention the word "margins". Here is your initial question: "I want to extract submatrices of the coefficient vector and the var/cov matrix which are returned following a probit estimation in which I have factor and interaction variables. The matrices e(b) and e(V) contain all results, including those relating to the base variables, which are all zero, of course. I want a vector and a matrix which contain only the coefficients which appear in the output table, i.e., those with column and row names which do not contain "*b." Is there an easy way to do this without having to explicitly name every row/column which I want to keep?"

-esttab-, as good as it might be for producing LaTeX and Word tables from estimates, does not deal with the variance matrix of the estimates. And you explicitly mentioned in your original post "I want to extract submatrices of the [..] var/cov matrix".
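
For what it is worth, one possible way to get both sub-matrices without naming each row and column is to flag the base-level columns by name and let Mata's select() drop them. This is only a sketch under some assumptions: the names b, V, sel, b_kept and V_kept are made up for illustration, and the pattern "[0-9]+b\." assumes base levels are marked as in 1b.race, so omitted (o.) terms would need a broader pattern.

```stata
webuse lbw, clear
quietly probit low age i.race

matrix b = e(b)
matrix V = e(V)
local names : colfullnames b

* sel[1,j] = 1 keeps column j; set it to 0 where the name has a base marker such as 1b.
matrix sel = J(1, colsof(b), 1)
local j = 0
foreach nm of local names {
    local ++j
    if regexm("`nm'", "[0-9]+b\.") {
        matrix sel[1, `j'] = 0
    }
}

mata:
sel = st_matrix("sel")
// coefficient vector: keep the flagged columns and their names
st_matrix("b_kept", select(st_matrix("b"), sel))
st_matrixcolstripe("b_kept", select(st_matrixcolstripe("b"), sel'))
st_matrixrowstripe("b_kept", st_matrixrowstripe("b"))
// var/cov matrix: keep the same columns and rows, with their names
st_matrix("V_kept", select(select(st_matrix("V"), sel), sel'))
st_matrixcolstripe("V_kept", select(st_matrixcolstripe("V"), sel'))
st_matrixrowstripe("V_kept", select(st_matrixrowstripe("V"), sel'))
end

matrix list b_kept
matrix list V_kept
```

The same sel vector is applied to the columns of e(b) and to both the rows and columns of e(V), so the two results stay aligned.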
