  • Unknown syntax error when using mfp within bsample bootstrap loop

    Hello Statalist,

    I am using Stata version 16.1 and am in the early stages of developing a predictive model. I use the "mfp" command with the "select" option to fit models inside a bootstrap loop built with the "bsample" command, saving model performance statistics for later use. For some reason I keep getting error messages about the estimation of the fractional polynomial at different iterations of my bootstrap. When I comment out "bsample", my code runs smoothly (albeit 20 times on the same data and model). To make sure my code was correct before applying it to my real data, I used auto.dta in my do-file and ran only 20 iterations (to save time while testing, as opposed to the 200 I plan for my actual work).

    The code I used for the bootstrap in Stata is as follows:

    Code:
    clear
    local boots = 20
    clear
    set obs `boots'
    quietly {
        forvalues i = 1(1)`boots' {
            if floor((`i'-1)/5) == (`i'-1)/5 {
                noisily display "working on `i' out of `boots' at $S_TIME"
            }  /*I used the above small loop and display command to know where in my bootstrap I am*/
            preserve
            sysuse auto.dta, clear
            bsample /*when I comment this out my code runs/outputs fine*/
            mfp, select(0.156, mpg:1): logistic foreign price length mpg rep78
            /*multiple rows of code follow where I produce and save my performance statistics*/
            restore
        }
    }
    The error sometimes occurs right away on the first iteration; on other attempts I have made it through 16 iterations before it appears. The error message from Stata is as follows:

    Code:
    failure encountered when estimating fractional polynomial model
    frac_154 logistic foreign mpg Irep7_1 Ipric_1 Ieng_1 if _000002 , degree(2)
    powers(-2,-1,-.5,0,.5,1,2,3) name(Impg_)
    invalid syntax
    I'm quite baffled as to why I am having this issue and why it is happening at different stages of my bootstraps. Any help would be greatly appreciated.

    Connor

  • #2
    s
    Last edited by Ben Young; 29 May 2020, 12:14.



    • #3
      This code reliably reproduces the error on my system.
      Code:
      about
      clear
      set seed 123456789
      local boots = 20
      clear
      set obs `boots'
      quietly {
          forvalues i = 1(1)`boots' {
              // if floor((`i'-1)/5) == (`i'-1)/5 {
              noisily display "working on `i' out of `boots' at $S_TIME"
              // }  /*I used the above small loop and display command to know where in my bootstrap I am*/
              preserve
              sysuse auto.dta, clear
              bsample /*when I comment this out my code runs/outputs fine*/
              mfp, select(0.156, mpg:1): logistic foreign price length mpg rep78
              /*multiple rows of code follow where I produce and save my performance statistics*/
              restore
          }
      }
      Code:
      . about
      
      Stata/SE 16.1 for Mac (64-bit Intel)
      Revision 28 Apr 2020
      ... 
                            
      . clear
      
      . set seed 123456789
      
      . local boots = 20
      
      . clear
      
      . set obs `boots'
      number of observations (_N) was 0, now 20
      
      . quietly {
      working on 1 out of 20 at 14:09:16
      working on 2 out of 20 at 14:09:25
      working on 3 out of 20 at 14:09:36
      working on 4 out of 20 at 14:09:47
      failure encountered when estimating fractional polynomial model
      frac_154 logistic foreign rep78   Ileng__1 Ileng__2 Impg__1 Ipric__1 Ipric__2 if __000002  , deg
      > ree(1) powers(-2,-1,-.5,0,.5,1,2,3) name(Irep7_) 
      invalid syntax
      r(198);



      • #4
        The code in post #2 can be reduced to the following reproducible example (please try it on your system). The four bsample commands are what it takes to reach the unlucky bootstrap sample that precipitates the error.

        At this point, I think you would be justified in submitting this question to Stata Technical Services. The error message is uninformative - it refers to a program called by the program you called, so the reported syntax error would be on a command written by StataCorp, not on the mfp command you submitted, which runs correctly with other bootstrap samples of auto.dta.
        Code:
        about
        set seed 123456789
        clear all
        sysuse auto.dta
        bsample
        bsample
        bsample
        bsample
        mfp: logistic foreign price length mpg rep78
        Code:
        . about
        
        Stata/SE 16.1 for Mac (64-bit Intel)
        Revision 28 Apr 2020
        ...                      
        
        . set seed 123456789
        
        . clear all
        
        . sysuse auto.dta
        (1978 Automobile Data)
        
        . bsample
        
        . bsample
        
        . bsample
        
        . bsample
        
        . mfp: logistic foreign price length mpg rep78
        
        Deviance for model with all terms untransformed =    35.036, 69 observations
        
        Variable     Model (vs.)   Deviance  Dev diff.   P      Powers   (vs.)
        ----------------------------------------------------------------------
        length       lin.   FP2      35.036    35.036  0.000+   1         -2 -2
                     FP1             30.343    30.343  0.000+   3        
                     Final            0.000                     -2 -2
        
        price        lin.   FP2       0.000     0.000  1.000    1         -2 -2
                     Final            0.000                     1
        
        failure encountered when estimating fractional polynomial model
        frac_154 logistic foreign mpg   Ileng__1 Ileng__2 Ipric__1 rep78 if __000002  , degree(2) powers
        > (-2,-1,-.5,0,.5,1,2,3) name(Impg_)
        invalid syntax
        r(198);
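
        A sketch of how one might look into where the failure arises, assuming frac_154 is an ado-file shipped with Stata (as its appearance in the error message suggests). The which command reports where a command is defined, and set trace shows the lines being executed when the error occurs. The tracedepth value of 3 is an arbitrary choice for illustration.

        ```stata
        * Report where frac_154 is defined; capture so a "not found" result does not stop the do-file
        capture noisily which frac_154

        * Recreate the bootstrap sample from the reproducible example above
        set seed 123456789
        clear all
        sysuse auto.dta
        bsample
        bsample
        bsample
        bsample

        * Trace execution a few levels deep (depth 3 is an arbitrary, illustrative choice)
        set tracedepth 3
        set trace on
        capture noisily mfp: logistic foreign price length mpg rep78
        set trace off
        ```

        The trace output can be long, so it may be easier to read from a log file.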
        Last edited by William Lisowski; 29 May 2020, 13:02.



        • #5
          Hello William Lisowski,

          I really appreciate the clear and thoughtful answer and your taking the time to work through the issue. I will look into raising this issue with Stata Technical Services on Monday. I'd be more than happy to keep you informed as to how this plays out, and I will also try to find a potential workaround and share it here. Thanks again for the help.

          Connor



          • #6
            Hi all,

            I have been back and forth with Stata Technical Services and had the issue resolved today. I received the following e-mail from my contact at Stata:

            "Dear Connor,

            We've looked into the issue you have reported. The error was
            produced because some of the combinations of variables were
            producing perfect predictions for -logit-, and -mfp- wasn't
            catching that. The fix will be available in a future update,
            with -mfp- providing a proper error message."


            Furthermore, while continuing to work on my code I found that the code (and the subsequent bootstrap loop) runs if I use either:
            1. Fewer variables in the mfp: logistic command line for developing the model
            2. A sample dataset with more observations (with auto.dta the model is based on only 69 observations)

            This was not an issue for me, as I was using the Stata system dataset auto.dta only to make sure my code did what I expected before starting analysis on my actual study data, so reducing the number of variables in the model or increasing the sample size worked. However, this could remain an issue for anyone using this approach with quite small datasets (or with many candidate variables for model development), but I believe that after the update the error message from Stata should be more informative.
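
            One way to keep a bootstrap loop running when an individual replicate fails is to wrap the mfp call in capture and check _rc. The following is a minimal sketch of that idea on auto.dta, not something Stata Technical Services suggested. The local macro failed is a hypothetical counter introduced for illustration, and the placeholder comment stands in for the performance-statistics code.

            ```stata
            set seed 123456789
            local boots = 20
            local failed = 0    // hypothetical counter of replicates where mfp errored

            forvalues i = 1/`boots' {
                preserve
                sysuse auto.dta, clear
                bsample

                * capture suppresses the error and stores the return code in _rc
                capture mfp, select(0.156, mpg:1): logistic foreign price length mpg rep78
                if _rc != 0 {
                    local ++failed
                    display as text "replicate `i' skipped: mfp returned r(" _rc ")"
                    restore     // restore before skipping so the next preserve succeeds
                    continue
                }

                /* performance statistics would be computed and saved here */
                restore
            }

            display as text "`failed' of `boots' replicates were skipped"
            ```

            Skipped replicates are not a random subset of the bootstrap samples, so the number of failures is worth reporting alongside any results.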

            Hope this is helpful. I would be happy to share my (albeit limited) experience with mfp/bootstrapping loops with anyone interested. Thanks again to William Lisowski for highlighting the error.

            Connor



            • #7
              Dear Connor,

              Thank you again for alerting us to this problem. To follow up on your correspondence with my colleague, I wanted to let you know that the bugfix she mentioned will be included in the next update. After the update, when mfp fits a logistic model that predicts the outcome perfectly, Stata will issue an informative error message and display the results of the failed model to help you identify problems.
