Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Return a Scalar and a Matrix from an eclass() Program

    Dear Statalisters,

    I would like to write an eclass program which is to be passed to the bootstrap command. In this program I would like to estimate a model followed by some tests which results I would like to store in addition to the coefficient vector. To give an example:
    Code:
    sysuse auto, clear
    
    replace price = . in 10 //to check whether correct number of obs is used in bootstrap
    
    capture program drop myboot
    program myboot, eclass
        tempname b newscal
        regress price mpg headroom i.rep78
        matrix `b' = e(b)
        lincom mpg-headroom
        ereturn scalar `newscal' = r(se)
        ereturn post `b'
    end
    
    bootstrap, rep(1000) seed(342): myboot
    If I type ereturn list I cannot find the scalar. Could somebody please tell me what went wrong here?

  • #2
    Don't use a temporary name, as that dooms the object to transience.

    Comment


    • #3
      Thanks Nick Cox for this hint. However, when I try the code without setting temporary names it still does not return the scalar. Maybe there is something to add to the bootstrap command?
      Code:
      sysuse auto, clear
      
      replace price = . in 10 //to check whether correct number of obs is used in bootstrap
      
      capture program drop myboot
      program myboot, eclass
          regress price mpg headroom i.rep78
          matrix b = e(b)
          lincom mpg-headroom
          ereturn scalar newscal = r(se)
          ereturn post b
      end
      
      bootstrap, rep(1000) seed(342): myboot

      Comment


      • #4
        Have you tried doing the ereturn post before the ereturn scalar? That seems to work for me. From help ereturn: ereturn post clears all existing e-class results and stores the coefficient [...]
        Code:
        sysuse auto, clear
        
        replace price = . in 10 //to check whether correct number of obs is used in bootstrap
        
        capture program drop myboot
        program myboot, eclass
            regress price mpg headroom i.rep78
            matrix b = e(b)
            lincom mpg-headroom
            ereturn post b
            ereturn scalar newscal = r(se)
        end
        
        myboot
        ereturn list
        Code:
        . ereturn list
        
        scalars:
                    e(newscal) =  410.6772154933516
        
        macros:
                 e(properties) : "b"
        
        matrices:
                          e(b) :  1 x 8
        Last edited by Jesse Wursten; 29 Sep 2016, 05:39.

        Comment


        • #5
          Jesse Wursten Great idea, this works. Do you have an explanation for this behavior? Also, is there a way to validate that the results saved in newscal is the average of all the single estimations and not just the results of the estimation with the last bootstrap sample? I am asking, because when I look into the stored results in the statafile tmpfile I cannot see the single results.

          Code:
          sysuse auto, clear
          
          replace price = . in 10 //to check whether correct number of obs is used in bootstrap
          
          capture program drop myboot
          program myboot, eclass
              regress price mpg headroom
              matrix b = e(b)
              scalar obs = e(N)
              lincom mpg-headroom
              ereturn post b
              ereturn scalar newscal = r(se)
              ereturn scalar obs = obs
          end
          
          bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
          Here is what I see in tmpfile:
          Code:
            |    _b_mpg   _b_head~m    _b_cons |
            |----------------------------------|
            | -500.1654   -486.6479   18439.41 |
            | -273.0997    -704.849   13719.69 |
            | -162.5183    354.7137   8494.055 |
            | -296.1175   -382.2571   13435.47 |
            |  -287.231   -113.3691    12877.4 |
            |----------------------------------|
            | -383.2015   -792.4132   17233.15 |
            | -218.9065    27.17953   10390.05 |
            | -142.5916   -90.89295   9484.384 |
            | -251.7405    58.67559   11886.79 |
            | -284.7032   -322.3931   13167.93 |

          Comment


          • #6
            ereturn post clears everything in e(), so the scalar would be removed if you do the post part after.

            As for the bootstrap, I have to admit I have no idea as I never work with the command. You could specify the "noisily" option and check manually with just a handful of reps?

            Comment


            • #7
              Thanks for clarification, Jesse. After some attempts I finally found something (maybe ugly) that is working. I simply add my other estimation results to the coefficient vector. The drawback of my approach is that I have to drop missing observations before the bootstrap starts, something I was hoping Stata takes automatically care of (because it is an eclass command). But I can live with that for the moment. If someone has a nicer approach I would be happy to hear.

              Code:
              sysuse auto, clear
              
              replace price = . in 10 //to check whether correct number of obs is used in bootstrap
              
              egen rowmiss = rowmiss(price mpg headroom)
              keep if rowmiss == 0
              
              capture program drop myboot
              program myboot, eclass
                  regress price mpg headroom
                  matrix b = e(b)
                  scalar obs = e(N)
                  lincom mpg-headroom
                  matrix addmat = [obs, r(se)]
                  matrix b = [b, addmat]
                  ereturn post b
              end
              
              bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot

              Comment


              • #8
                You can manually specify what the bootstrap command should work on.
                Code:
                bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot
                Code:
                      command:  myboot
                          mpg:  _b[mpg]
                     headroom:  _b[headroom]
                    lincom_se:  e(newscal)
                
                ------------------------------------------------------------------------------
                             |   Observed   Bootstrap                         Normal-based
                             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         mpg |  -260.1958   72.37625    -3.60   0.000    -402.0506   -118.3409
                    headroom |  -312.3023   332.5286    -0.94   0.348    -964.0464    339.4418
                   lincom_se |   380.0559   45.83123     8.29   0.000     290.2284    469.8835
                ------------------------------------------------------------------------------
                As for missing observations, note the warning bootstrap gives you. As in, it explicitly doesn't drop the observations containing some missings.
                Code:
                Warning:  Because myboot is not an estimation command or does not set e(sample), bootstrap has no way to determine
                          which observations are used in calculating the statistics and so assumes that all observations are used.
                          This means that no observations will be excluded from the resampling because of missing values or other
                          reasons.
                This makes sense, as the bootstrap doesn't know which variables are relevant. Or in some cases, different variables are relevant for different parts of the command. It is straightforward to drop them yourself though.

                Code:
                drop if missing(price, mpg, headroom)

                Comment


                • #9
                  Thanks Jesse Wursten. The only problem I see with this approach is that the scalars will not be saved when using the saving() option.

                  Comment


                  • #10
                    Originally posted by Roberto Liebscher View Post
                    Thanks Jesse Wursten. The only problem I see with this approach is that the scalars will not be saved when using the saving() option.
                    I just tried and they are saved?

                    Comment


                    • #11
                      Hmm, strange. When I type:
                      Code:
                      sysuse auto, clear
                      
                      replace price = . in 10 //to check whether correct number of obs is used in bootstrap
                      
                      capture program drop myboot
                      program myboot, eclass
                          regress price mpg headroom
                          matrix b = e(b)
                          scalar obs = e(N)
                          lincom mpg-headroom
                          ereturn post b
                          ereturn scalar newscal = r(se)
                          ereturn scalar obs = obs
                      end
                      
                      bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
                      bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot
                      
                      use tmpfile.dta, clear
                      list
                      I see the following::

                      Code:
                           +----------------------------------+
                            |    _b_mpg   _b_head~m    _b_cons |
                            |----------------------------------|
                         1. | -500.1654   -486.6479   18439.41 |
                         2. | -273.0997    -704.849   13719.69 |
                         3. | -162.5183    354.7137   8494.055 |
                         4. | -296.1175   -382.2571   13435.47 |
                         5. |  -287.231   -113.3691    12877.4 |
                            |----------------------------------|
                         6. | -383.2015   -792.4132   17233.15 |
                         7. | -218.9065    27.17953   10390.05 |
                         8. | -142.5916   -90.89295   9484.384 |
                         9. | -251.7405    58.67559   11886.79 |
                        10. | -284.7032   -322.3931   13167.93 |
                            |----------------------------------|
                        11. | -212.3769   -162.9946   10863.32 |
                        12. | -206.7537   -453.1308    11803.1 |
                        13. |  -296.789    160.0279   12081.51 |
                        14. | -242.3181   -597.7308      13065 |
                        15. | -244.5355   -581.6088   12611.82 |
                      I would have expected no column for the constant but instead a column for newscal.

                      Comment


                      • #12
                        Code:
                        bootstrap, rep(1000) seed(342) saving(tmpfile.dta, replace): myboot
                        bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), noisily rep(5) seed(342): myboot
                        The code you're using to save doesn't define the coefficients to bootstrap.

                        Code:
                         bootstrap mpg = _b[mpg] headroom = _b[headroom] lincom_se = e(newscal), saving(tmpfile.dta) noisily rep(5) seed(342): myboot

                        Comment


                        • #13
                          Thanks for clarifying. Must have been to blind to see that I had the bootstrap command twice. Thanks.

                          Comment

                          Working...
                          X