
  • How do I break out from -simulate-, without losing the data from the particular simulation round, to inspect what went wrong?

    Good afternoon,

    I run a simulation through -simulate-. Pseudo code looks something like this:

    prog define myprog
    generate data
    estimation command 1
    return sca B1=_b[]
    estimation command 2
    return scalar B2=_b[]
    end

    simulate B1 B2: myprog

    I do not have any -capture- statements in my code. Some estimation rounds go wrong (ill-conditioned data; it is not a problem with the program, the program is fine), and I see red Xs among the dots. I then see missing values for some of the estimates that I am collecting through -simulate-. The simulation runs right through the errors, which is probably a good default behaviour.

    How can I break the simulation when something is wrong, so that I can inspect manually what went wrong? When I press the BREAK button, the simulation stops and I can see the data generated and used for the particular simulation round. How can I do the same with a line of code?

    I tried inserting below each estimation command in my simulation program statements such as

    if _rc!=0 exit
    if _rc!=0 break
    e.g.,

    prog define myprog
    generate data
    estimation command 1
    if _rc!=0 exit
    return sca B1=_b[]
    estimation command 2
    if _rc!=0 exit
    return scalar B2=_b[]
    end

    simulate B1 B2: myprog

    The simulation goes right through them; they do not seem to make any difference.

    Do you have any suggestions?

  • #2
    Look at help simulate, in particular the following option and its suboptions, such as every(#).
    Code:
       saving(filename [, suboptions]) creates a Stata data file (.dta file)
            consisting of (for each statistic in exp_list) a variable containing
            the replicates.
    
            See prefix_saving_option for details about suboptions.



    • #3
      Hi Stephen,

      The issue is different, and similar to what you have discussed here
      https://www.stata.com/statalist/arch.../msg01186.html

      I am wondering how I can replicate the behaviour of manually pressing the Break key from within my program, which simulate executes repeatedly, using a statement such as
      if _rc!=0 break
      (this did not work, and neither did
      if _rc!=0 exit
      ).
      While simulate is executing the program repeatedly, something goes wrong with the data, and the estimators in my program cannot produce valid results. When this happens, I want Stata to stop everything and return the system to me in whichever state it was in when the error occurred, so that I can examine manually what is wrong.



      • #4
        Here is a simple example of code that illustrates the situation. In round 3 the data are bad enough that the estimator B cannot be computed.

        I wonder how I can tell Stata to stop whatever it is doing when the estimator B cannot be computed, and to return the system to me in whichever state it was in when the error occurred, for inspection.

        Code:
        cap prog drop myprog
        set more off

        program define myprog, rclass
            drop _all
            set obs 4
            gen x = runiform()
            gen e = rnormal()
            gen lne = log(e)
            gen y = 1*x + lne
            regress y x
            // None of the next two statements seem to make any difference! Simulate goes right through them as if they were not there.
            if _rc!=0 exit
            if _rc!=0 break
            return sca B = _b[x]
        end

        . 
        . simulate B=r(B), reps(10) seed(11111): myprog
        
              command:  myprog
                    B:  r(B)
        
        Simulations (10)
        ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
        ..x.x.....
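
A likely reason the two `if _rc` lines above have no effect: `_rc` holds the return code of the most recent -capture-, so without -capture- a failing -regress- ends myprog with an error before those lines are reached, and -simulate- records the round as an x. The sketch below is an illustration, not the poster's code: it captures the estimation and returns the return code as a flag alongside the estimate. The names myprog2 and rc are hypothetical.

```stata
capture program drop myprog2
program define myprog2, rclass
    drop _all
    set obs 4
    generate x = runiform()
    generate e = rnormal()
    generate lne = log(e)          // missing whenever e is not positive
    generate y = 1*x + lne
    capture regress y x            // capture stores the return code in _rc
    local rc = _rc
    return scalar rc = `rc'        // flag: 0 means the estimation ran
    if `rc' == 0 {
        return scalar B = _b[x]
    }
    else {
        return scalar B = .        // keep r(B) defined in failed rounds
    }
end

simulate B=r(B) rc=r(rc), reps(10) seed(11111): myprog2
list if rc != 0                    // the rounds in which regress failed
```

Because the error is now captured, -simulate- should no longer see these rounds as failures; the rc variable carries that information instead.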



        • #5
          Code:
          . summ B
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
                     B |          8    3.371041    7.128188  -3.251266      15.06
          There were 10 simulation rounds but only 8 valid replicates of the B estimator, because 2 of the rounds failed.



          • #6
            Apologies for misunderstanding your request. (And you must excuse me for not remembering that 2006 exchange!)

            "The simulation completes right through the errors, which probably is a good default behaviour."
            I think I agree. In my simulation experience, I was using maximum likelihood estimators that were set to run with a very large maxit limit. "Not working" (an "x" in the dots output) meant non-convergence, and that was appropriate behaviour in the context of analysing the estimator.

            "How can I break the simulation when something is wrong, so that I can inspect manually what went wrong?"
            Do you really want to break as soon as you get your first "x"? One option might be to run your simulations as before, but also save the current seed along with the output for each replication, as well as a flag for "didn't work" (however you define that). Afterwards you could inspect the simulation output, re-run the non-working replications, and look at them in more detail. How to save the seed has come up in several Statalist threads, I think.



            • #7
              For my particular problem, a way to save the current/active seed conditional on some condition being satisfied would do the trick, I guess. That way we basically move the X to the first position, and we can execute the program just once, without simulate, at the seed at which the problem occurred, to investigate it. But I do not know how to do that.
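
One way to reach the failing round without saving seeds from inside -simulate- is to run the replications in an explicit loop and stop at the first error, which leaves that round's data in memory. The following is a minimal sketch under assumptions: the program name oneround, the locals failed_rep and failed_seed, and the scheme of one seed per round (11111 + round number) are all hypothetical, and the random draws will not match those of simulate with seed(11111).

```stata
capture program drop oneround
program define oneround, rclass
    drop _all
    set obs 4
    generate x = runiform()
    generate lne = log(rnormal())   // missing for non-positive draws
    generate y = 1*x + lne
    regress y x
    return scalar B = _b[x]
end

local failed_rep 0
forvalues r = 1/10 {
    local seed = 11111 + `r'        // hypothetical scheme: one known seed per round
    set seed `seed'
    capture oneround                // a failure sets _rc instead of stopping the do-file
    if _rc != 0 {
        local failed_rep `r'
        local failed_seed `seed'
        display as error "round `r' (seed `seed') failed with return code " _rc
        continue, break             // leave the loop at the first failure
    }
}

if `failed_rep' > 0 {
    list                            // data of the failing round are still in memory
}
```

Since the seed of the failing round is known, typing set seed with that value followed by oneround (without capture) should reproduce the same data and show the error message itself.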

              I think, and this goes back to your remarks in the old thread and the surrounding discussion, that we do not have good documentation of how to exit and break loops.

              I have been using Stata since 2000, and I learnt just a couple of days ago that I can exit loops with constructs such as

              forvalues i = 1/10000 {
                  do some stuff
                  if some_condition {
                      continue, break
                  }
              }

              And as far as I can see, there is no systematic discussion of breaking out of or exiting loops; -exit-, -break- and -continue- are all documented in different places.
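
For reference, a small sketch of how these commands behave, with illustrative names only (the local kept and the program stopearly are hypothetical): -continue- skips the rest of the current iteration, -continue, break- leaves the innermost loop, and -exit- inside a program ends that program and returns control to the caller.

```stata
local kept 0
forvalues i = 1/10 {
    if mod(`i', 2) == 0 {
        continue                  // skip the rest of this iteration
    }
    if `i' > 6 {
        continue, break           // leave the loop entirely
    }
    local kept = `kept' + 1
    display "processed i = `i'"
}
display "iterations processed: `kept'"

capture program drop stopearly
program define stopearly
    display "before exit"
    exit                          // ends the program, not Stata
    display "never shown"
end
stopearly
```

The loop processes i = 1, 3 and 5 and stops at i = 7, so the final count is 3.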

              For example when I type
              help exit
              something unrelated shows up. It is an explanation of how to exit Stata, not of how to exit loops.

              When I type
              help break
              an equally unrelated discussion appears, about how to suppress the Break key.



              • #8
                I can't help you on the stuff about breaking out of loops, and I suspect that Technical Support may be the place to write.

                However, for addressing the problem you have, I don't see why your proposed strategy is the best one. Why can't you adapt your myprog.ado so that at each replication it saves, in addition to the usual stuff, the current seed and the current state of whatever condition you want satisfied (what I referred to as a flag variable)? Once the simulation has finished and created a dataset, you can select the problematic observations with reference to the flag variable and re-run each problematic estimation once using the relevant seed value.

                On saving seeds in the data set, see posts like https://www.statalist.org/forums/for...utput-and-seed. Also look at https://robertgrantstats.wordpress.c...n-study-seeds/, though that refers to the older random number generators (things have changed in later Stata versions). Have a look also at https://www.stata.com/meeting/uk16/s...orris_uk16.pdf from around slide 30 onwards. Also the recent correspondence on Statalist at https://www.statalist.org/forums/for...r-and-programs. You want to get the seed and the current state of the condition saved along with the simulation results, so that'll likely require putting them into saved results macro as part of myprog.ado

                I've reached my knowledge limit on all this, so am bailing out now. Good luck.

