  • 🐛 Suspected bug in frames handling of data

    Dear All,

    I think I have encountered a bug in Stata where using a temporary frame may result in data loss in the main data frame.

    The following is the minimal code that reproduces the problem:
    Code:
    clear all
    version 16.0
    sysuse nlsw88
    tempname tmp
    frame create `tmp'
    frame `tmp' : {
      generate int a=.
      generate double b=.
    }
    frame change `tmp'
    describe
    frame default: describe
    // end of file
    Note that when the last line executes (while the temporary frame still exists), the default data frame is still intact.
    After the do-file terminates, the temporary data frame is disposed of, and the frame named default becomes the active frame (as discussed earlier with William Lisowski here).

    But here is where the bug comes in: the default frame will retain only as many variables as were present in the temporary frame at the time it was disposed of (only 2 in this example). You can convince yourself of this by running another describe command from the command line after the do-file completes.

    This means that if you were unfortunate enough to run code like the above, and then continued modifying your dataset without inspecting it and saved the result, you would overwrite your source file and lose data (unless you had a backup of the source). The fact that some variables are retained aggravates the issue: at a glance the data is still there, and you have to check the inventory of variables to find out what is going on.

    I have further found that:
    1) the flag indicating that the dataset has changed is not set in this case: display c(changed)
    2) workaround: manually switching the current frame to the default makes sure that the variables in it are not affected: frame change default
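
A minimal sketch of workaround 2), reusing the example above. The only additions are the final frame change default and a display of c(k) (the number of variables in the current frame). Whether this fully avoids the problem in every Stata 16 build is an assumption based on the observations reported in this thread.

```stata
clear all
version 16.0
sysuse nlsw88

tempname tmp
frame create `tmp'
frame `tmp' {
    generate int a = .
    generate double b = .
}

frame change `tmp'
describe

// Workaround: make default the current frame again before the do-file ends,
// so that the temporary frame is not the active one when it is disposed of.
frame change default

// Number of variables in the default frame at this point.
display "variables in default frame: " c(k)
// end of file
```

After the do-file finishes, typing describe, short in the Command window should report the same number of variables as the display line printed.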

    I have not checked what will happen if the temp frame contained more variables than the default frame.

    This has been observed in
    • Stata MP 16.0(797) on Windows (64-bit x86-64), and further re-confirmed in
    • Stata MP 16.1(839) on Windows (64-bit x86-64).
    If the use of temporary data frames is discouraged or not intended for any reason, please do let me know. For the moment I am assuming that the above syntax remains legitimate. If there is another explanation for the observed behavior, I would appreciate it, or a pointer to the manual.

    Thank you, and best regards,
    Sergiy

  • #2
    When run, this example demonstrates that if a do-file exits while the current frame is not the default frame, the number of variables retained in the default frame is limited to the number in the frame from which the do-file exited. While this can be avoided by always returning to the default frame before exiting the do-file, that is not practical advice, since exiting is involuntary in the case of an error.
    Code:
    sysuse auto, clear
    tempname tmp
    frame create `tmp'
    frame change `tmp'
    generate int a=.
    generate double b=.
    save "nonexistent/newdata", replace
    The same code copied and pasted into Stata's Command window does not have any effect on the number of variables retained in either frame.
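
One way to return to the default frame even when a command fails is to wrap the risky commands in capture noisily and re-raise the error afterwards. This is only a sketch under the assumption that switching back to default before the do-file exits is enough to avoid the problem, as observed in this thread. The local macro name rc is an illustrative choice.

```stata
sysuse auto, clear
tempname tmp
frame create `tmp'
frame change `tmp'

// Run the commands that may fail; capture traps the error,
// noisily still shows the output and the error message.
capture noisily {
    generate int a = .
    generate double b = .
    save "nonexistent/newdata", replace
}
local rc = _rc

// Always return to the default frame before the do-file ends.
frame change default

// Re-raise the original error, if any, so the do-file still stops with it.
if `rc' exit `rc'
```

The cost of this pattern is that every block that changes frames needs its own capture wrapper, which supports the point above that it is not practical as general advice.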

    I think you'd be justified in pointing Stata Technical Services to this discussion as evidence of an unintended outcome.



    • #3
      This and Sergiy's previous thread lead me to ask: is the frame prefix method (that is, -frame:- or -frame {}-) intended to be a perfect alternative to manually changing between frames? Either way, I think the intention needs to be made clear in the documentation. The macro expansion issue raised previously leads me to believe the two are subtly different.



      • #4
        Dear William,

        thank you very much for confirming the issue and exploring the behavior under an alternative code layout (although I am not entirely sure what the Command window variant tests: the bug manifests itself at the disposal of the temporary frame, so if the tempname is allocated from the command line, the frame is not disposed of at all).
        In any case, I agree that ensuring the proper return frame would not be trivial in a real project with multiple temporary frames and multiple exit points from a procedure, and it would be a non-obvious requirement for an unsuspecting user.

        I have simultaneously notified Stata Support regarding the issue.

        Sincerely, Sergiy Radyakin



        • #5
          Leonardo Guizzetti,

          on your question, I think the frame NAME {} prefix syntax is perfectly fine, since it is actually frame change NAME that causes the problem when the active frame is a temporary one and is disposed of.
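
A short sketch of the prefix-only pattern described here: the temporary frame is filled and inspected through the frame prefix, and frame change is never issued, so default stays the current frame throughout. That this sidesteps the problem is an assumption resting on the observation above, not a documented guarantee.

```stata
sysuse nlsw88, clear
tempname tmp
frame create `tmp'

// All work on the temporary frame goes through the prefix;
// the current frame is never switched away from default.
frame `tmp' {
    generate int a = .
    generate double b = .
    describe
}

// Still in the default frame here.
describe, short
```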

          I did further searches on macro expansion in the context of the frame modifier and found an explanation by Alan Riley (StataCorp) here: "frame post not working after last update", which gives the same explanation as in my "not obvious behavior" post.

          That said, despite some minor hiccups, frames in Stata do rock! And Python rocks too!

          Best, Sergiy Radyakin





          • #6
            Thanks, Sergiy. On that note, I wonder if the problem isn't perhaps -frame- syntax at all, but rather how Stata handles deletion of a temporary frame. If I append -frame change default- to the end of your test script, then the temporary frame is properly disposed of and the data restored. In either case, I agree that frames are awesome, and that this is a bug that needs to be addressed.



            • #7
              Originally posted by Sergiy Radyakin
              Dear All,

              I think I have encountered a bug in Stata where using a temporary frame may result in data loss in the main data frame.

              ...
              Good timing! This was first reported to us at the end of last week, and we discussed it this morning in the development meeting that kicks off our week. This does appear to be a bug, and we will address it in an upcoming update.



              • #8
                Dear Alan,

                thank you very much for confirming the bug status and incorporating the fix into the update schedule.

                Sincerely, Sergiy Radyakin
