Return a Scalar and a Matrix from an eclass() Program

Roberto Liebscher

Join Date: Mar 2014

Posts: 92
#1

Return a Scalar and a Matrix from an eclass() Program

29 Sep 2016, 05:13

Dear Statalisters,

I would like to write an eclass program which is to be passed to the bootstrap command. In this program I would like to estimate a model followed by some tests which results I would like to store in addition to the coefficient vector. To give an example:

Code:

sysuse auto, clear replace price = . in 10 //to check whether correct number of obs is used in bootstrap capture program drop myboot program myboot, eclass tempname b newscal regress price mpg headroom i.rep78 matrix `b' = e(b) lincom mpg-headroom ereturn scalar `newscal' = r(se) ereturn post `b' end bootstrap, rep(1000) seed(342): myboot

If I type ereturn list I cannot find the scalar. Could somebody please tell me what went wrong here?
Tags: bootstrap, eclass, program
Nick Cox

Join Date: Mar 2014

Posts: 35642
#2

29 Sep 2016, 05:16

Don't use a temporary name, as that dooms the object to transience.
1 like
Comment

Roberto Liebscher

Join Date: Mar 2014
Posts: 92

29 Sep 2016, 05:22

Thanks Nick Cox for this hint. However, when I try the code without setting temporary names it still does not return the scalar. Maybe there is something to add to the bootstrap command?

Code:

sysuse auto, clear

replace price = . in 10 //to check whether correct number of obs is used in bootstrap

capture program drop myboot
program myboot, eclass
    regress price mpg headroom i.rep78
    matrix b = e(b)
    lincom mpg-headroom
    ereturn scalar newscal = r(se)
    ereturn post b
end

bootstrap, rep(1000) seed(342): myboot

Comment

Jesse Wursten

Join Date: Jan 2016
Posts: 915

29 Sep 2016, 05:37

Have you tried doing the ereturn post before the ereturn scalar? That seems to work for me. From help ereturn: ereturn post clears all existing e-class results and stores the coefficient [...]

Code:

sysuse auto, clear

replace price = . in 10 //to check whether correct number of obs is used in bootstrap

capture program drop myboot
program myboot, eclass
    regress price mpg headroom i.rep78
    matrix b = e(b)
    lincom mpg-headroom
    ereturn post b
    ereturn scalar newscal = r(se)
end

myboot
ereturn list

Code:

. ereturn list

scalars:
            e(newscal) =  410.6772154933516

macros:
         e(properties) : "b"

matrices:
                  e(b) :  1 x 8

Last edited by Jesse Wursten; 29 Sep 2016, 05:39.

Comment

Roberto Liebscher

Join Date: Mar 2014
Posts: 92

29 Sep 2016, 06:30

Jesse Wursten Great idea, this works. Do you have an explanation for this behavior? Also, is there a way to validate that the results saved in newscal is the average of all the single estimations and not just the results of the estimation with the last bootstrap sample? I am asking, because when I look into the stored results in the statafile tmpfile I cannot see the single results.

Code:

sysuse auto, clear

replace price = . in 10 //to check whether correct number of obs is used in bootstrap

capture program drop myboot
program myboot, eclass
    regress price mpg headroom
    matrix b = e(b)
    scalar obs = e(N)
    lincom mpg-headroom
    ereturn post b
    ereturn scalar newscal = r(se)
    ereturn scalar obs = obs
end

bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot

Here is what I see in tmpfile:

Code:

  |    _b_mpg   _b_head~m    _b_cons |
  |----------------------------------|
  | -500.1654   -486.6479   18439.41 |
  | -273.0997    -704.849   13719.69 |
  | -162.5183    354.7137   8494.055 |
  | -296.1175   -382.2571   13435.47 |
  |  -287.231   -113.3691    12877.4 |
  |----------------------------------|
  | -383.2015   -792.4132   17233.15 |
  | -218.9065    27.17953   10390.05 |
  | -142.5916   -90.89295   9484.384 |
  | -251.7405    58.67559   11886.79 |
  | -284.7032   -322.3931   13167.93 |

Comment

Jesse Wursten

Join Date: Jan 2016

Posts: 915
#6

29 Sep 2016, 07:34

ereturn post clears everything in e(), so the scalar would be removed if you do the post part after.

As for the bootstrap, I have to admit I have no idea as I never work with the command. You could specify the "noisily" option and check manually with just a handful of reps?
Comment
Roberto Liebscher

Join Date: Mar 2014

Posts: 92
#7

29 Sep 2016, 07:40

Thanks for clarification, Jesse. After some attempts I finally found something (maybe ugly) that is working. I simply add my other estimation results to the coefficient vector. The drawback of my approach is that I have to drop missing observations before the bootstrap starts, something I was hoping Stata takes automatically care of (because it is an eclass command). But I can live with that for the moment. If someone has a nicer approach I would be happy to hear.

Code:

sysuse auto, clear replace price = . in 10 //to check whether correct number of obs is used in bootstrap egen rowmiss = rowmiss(price mpg headroom) keep if rowmiss == 0 capture program drop myboot program myboot, eclass regress price mpg headroom matrix b = e(b) scalar obs = e(N) lincom mpg-headroom matrix addmat = [obs, r(se)] matrix b = [b, addmat] ereturn post b end bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
Comment

Jesse Wursten

Join Date: Jan 2016
Posts: 915

29 Sep 2016, 09:56

You can manually specify what the bootstrap command should work on.

Code:

bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot

Code:

      command:  myboot
          mpg:  _b[mpg]
     headroom:  _b[headroom]
    lincom_se:  e(newscal)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -260.1958   72.37625    -3.60   0.000    -402.0506   -118.3409
    headroom |  -312.3023   332.5286    -0.94   0.348    -964.0464    339.4418
   lincom_se |   380.0559   45.83123     8.29   0.000     290.2284    469.8835
------------------------------------------------------------------------------

As for missing observations, note the warning bootstrap gives you. As in, it explicitly doesn't drop the observations containing some missings.

Code:

Warning:  Because myboot is not an estimation command or does not set e(sample), bootstrap has no way to determine
          which observations are used in calculating the statistics and so assumes that all observations are used.
          This means that no observations will be excluded from the resampling because of missing values or other
          reasons.

This makes sense, as the bootstrap doesn't know which variables are relevant. Or in some cases, different variables are relevant for different parts of the command. It is straightforward to drop them yourself though.

Code:

drop if missing(price, mpg, headroom)

Comment

Roberto Liebscher

Join Date: Mar 2014

Posts: 92
#9

30 Sep 2016, 07:12

Thanks Jesse Wursten. The only problem I see with this approach is that the scalars will not be saved when using the saving() option.
Comment
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#10

30 Sep 2016, 09:02

Originally posted by Roberto Liebscher View Post

Thanks Jesse Wursten. The only problem I see with this approach is that the scalars will not be saved when using the saving() option.

I just tried and they are saved?
Comment

Roberto Liebscher

Join Date: Mar 2014
Posts: 92

#11

30 Sep 2016, 10:04

Hmm, strange. When I type:

Code:

sysuse auto, clear

replace price = . in 10 //to check whether correct number of obs is used in bootstrap

capture program drop myboot
program myboot, eclass
    regress price mpg headroom
    matrix b = e(b)
    scalar obs = e(N)
    lincom mpg-headroom
    ereturn post b
    ereturn scalar newscal = r(se)
    ereturn scalar obs = obs
end

bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot

use tmpfile.dta, clear
list

I see the following::

Code:

     +----------------------------------+
      |    _b_mpg   _b_head~m    _b_cons |
      |----------------------------------|
   1. | -500.1654   -486.6479   18439.41 |
   2. | -273.0997    -704.849   13719.69 |
   3. | -162.5183    354.7137   8494.055 |
   4. | -296.1175   -382.2571   13435.47 |
   5. |  -287.231   -113.3691    12877.4 |
      |----------------------------------|
   6. | -383.2015   -792.4132   17233.15 |
   7. | -218.9065    27.17953   10390.05 |
   8. | -142.5916   -90.89295   9484.384 |
   9. | -251.7405    58.67559   11886.79 |
  10. | -284.7032   -322.3931   13167.93 |
      |----------------------------------|
  11. | -212.3769   -162.9946   10863.32 |
  12. | -206.7537   -453.1308    11803.1 |
  13. |  -296.789    160.0279   12081.51 |
  14. | -242.3181   -597.7308      13065 |
  15. | -244.5355   -581.6088   12611.82 |

I would have expected no column for the constant but instead a column for newscal.

Comment

Jesse Wursten

Join Date: Jan 2016
Posts: 915

#12

30 Sep 2016, 10:44

Code:

bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot

The code you're using to save doesn't define the coefficients to bootstrap.

Code:

 bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), saving(tmpfile.dta) noisily rep(5) seed(342): myboot

Comment

Roberto Liebscher

Join Date: Mar 2014

Posts: 92
#13

30 Sep 2016, 12:59

Thanks for clarifying. Must have been to blind to see that I had the bootstrap command twice. Thanks.
Comment

Announcement