  • Seed value across versions

    I recently updated from Stata 13 to Stata 15. Some of the do-files I produced earlier using Stata 13 set seed values before randomly sorting observations. However, when I run the same do-files using Stata 15, I can no longer reproduce the work. Is there any way to convert seed values across versions 13 and 15?

  • #2
    Type

    Code:
    help set seed
    for various ways to do that. The simplest way might be to set

    Code:
    version 13
    ...
    which you should have done in your old do-files, anyway.
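
    For illustration, a minimal sketch of such a do-file, assuming the random sort is done by sorting on a uniform draw; the seed 12345 and the variable name u are made up for the example and are not taken from the original do-files:

    Code:
    version 13
    sysuse auto, clear
    set seed 12345
    generate double u = runiform()
    sort u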

    Best
    Daniel

    • #3
      Your problem likely arises from changes made to Stata's random number generator in Stata 14. This is described in the output of
      Code:
      help rng
      which suggests you can revert the random number generator to the previous version with
      Code:
      set rng kiss32
      prior to your
      Code:
      set seed
      Added in edit: The output of
      Code:
      help version
      tells us that a simple
      Code:
      version 13
      will not reset the random number generator; that approach would require
      Code:
      version 13, user
      which resets everything that is not reset by the command lacking the user option.

      My preference is to just set the rng using the first technique.
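
      A minimal sketch of that first technique, with a made-up seed and variable name; displaying c(rng_current) before drawing is just a quick way to check which generator is in effect:

      Code:
      sysuse auto, clear
      set rng kiss32
      set seed 12345
      display c(rng_current)
      generate double u = runiform()
      sort u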
      Last edited by William Lisowski; 20 Feb 2019, 12:46.

      • #4
        Thank you. set rng did the trick!

        • #5
          Originally posted by William Lisowski
          Added in edit: The output of
          Code:
          help version
          tells us that a simple
          Code:
          version 13
          will not reset the random number generator; that approach would require
          Code:
          version 13, user
          which resets everything that is not reset by the command lacking the user option.
          Not so. This is what the help file says:

          If you type version # interactively or in your do-files, Stata not only understands old syntax, it backdates (removes) improvements made after #, such as those to the RNGs. You do not have to specify the user option. The modern version of Stata will still produce the same results as it produced in the past.
          (emphasis added)

          William's comment is still valuable, though. The help goes on:

          Programmers: When you type version # in your programs and ado-files, Stata does not backdate the other improvements. If, for some reason, you want to force the other improvements to be backdated, specify the user option. Option user is seldom used except by those (say, developers at StataCorp) needing to test that Stata works properly.
          Anyway, the help for set seed, which I have pointed to in #2, lists both discussed solutions. Personal tastes differ, but I still like setting version (much) better than setting rng if only for simplicity, since you should set version anyway, which makes setting rng redundant. There is a third solution, which seems a bit cumbersome to type but which I like because of its explicit form: add _kiss32 to your commands. For example:

          Code:
          set seed_kiss32 42
          display runiform_kiss32()
          Best
          Daniel
          Last edited by daniel klein; 20 Feb 2019, 23:11.

          • #6
            Originally posted by Saad Imtiaz
            Some of the do-files I produced earlier using [Stata] 13 set seed values before randomly sorting observations. However, when I try running the same do-files using [Stata] 15, I can no longer reproduce the work.
            Aside from the technical aspects that we have discussed above, results that depend on the sort order of observations make me suspicious. There are certainly situations where this is expected, but there are also situations where there is a conceptual flaw in the approach or code. Make sure that in your situation the sort order is supposed to matter for the results before proceeding with a purely technical fix.
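
            If the random order is intended, it may still be worth making sure that the order of the data before the draw is itself reproducible, because runiform() fills observations in their current order. A sketch, assuming some variable (here make from the auto data) uniquely identifies observations; the seed and the variable name u are illustrative:

            Code:
            sysuse auto, clear
            * assumption: make uniquely identifies observations; isid stops with an error if not
            isid make
            * put the data in a well-defined order before drawing
            sort make
            set seed 12345
            generate double u = runiform()
            * make breaks any ties in u
            sort u make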

            Best
            Daniel
            Last edited by daniel klein; 21 Feb 2019, 00:00.

            • #7
              Daniel is correct in post #5; I had misremembered how version control interacts with rng selection and fell for confirmation bias when I misread the help file. In general, simple version control will do the job in do-files and from the command window.

              I don't typically use version control, but it seems to me that if one were testing a do-file that had been revised to use features specific to versions later than 13, then it would not be possible to use simple version control, and either of the other two solutions would be required to control the rng without ruling out the use of new syntax.

              • #8
                Originally posted by William Lisowski
                it seems to me if one were testing a do-file that had been revised to use features specific to versions later than 13 then it would not be possible to use simple version control, and either of the other two solutions would be required to control the rng without ruling out the use of new syntax.
                I do not believe so; but perhaps I am misinterpreting your statement. Version control ensures backwards compatibility; it does not disable features that might be added in the future. You can type

                Code:
                version 1
                cls
                to clear the results window in Stata 15 although cls was clearly not around in version 1. What version control does and does not do still seems to be a constant source of misunderstanding.

                Best
                Daniel

                • #9
                  What version control does and does not do still seems to be a constant source of misunderstanding.
                  I appreciate this discussion. Version control is clearly deeper than I understand and I will steer clear of referring to it in future answers. I note the apparently exhaustive list of what it does in the bottom 95% of the output of help version but that's not an easy way of summarizing what it's likely to do as a general rule.

                  The use case I had in mind in post #7 was the use of an option added in a later version. From the list in the output of help version to which I previously referred (emphasis added):
                  Code:
                      If you set version to less than 15.1
                          ...
                          2.  estat esize has the following changes:
                          
                              a. Epsilon-squared is labeled using "Omega-Squared" and, when the omega option is
                                 specified, is reported with confidence limits.
                          
                              b. The epsilon option is not allowed.
                  And this is demonstrated by the following example.
                  Code:
                  . sysuse auto, clear
                  (1978 Automobile Data)
                  
                  . regress price foreign
                  
                        Source |       SS           df       MS      Number of obs   =        74
                  -------------+----------------------------------   F(1, 72)        =      0.17
                         Model |  1507382.66         1  1507382.66   Prob > F        =    0.6802
                      Residual |   633558013        72  8799416.85   R-squared       =    0.0024
                  -------------+----------------------------------   Adj R-squared   =   -0.0115
                         Total |   635065396        73  8699525.97   Root MSE        =    2966.4
                  
                  ------------------------------------------------------------------------------
                         price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                       foreign |   312.2587   754.4488     0.41   0.680    -1191.708    1816.225
                         _cons |   6072.423    411.363    14.76   0.000     5252.386     6892.46
                  ------------------------------------------------------------------------------
                  
                  . version
                  version 15.1
                  
                  . estat esize, epsilon
                  
                  Effect sizes for linear models
                  
                  ---------------------------------------------
                               Source | Epsilon-Squared      df
                  --------------------+------------------------
                                Model |    -.0114823          1
                                      |
                              foreign |    -.0114823          1
                  ---------------------------------------------
                  
                  . version 13
                  
                  . estat esize, epsilon
                  option epsilon not allowed
                  r(198);

                  • #10
                    Originally posted by William Lisowski
                    I appreciate this discussion. Version control is clearly deeper than I understand and I will steer clear of referring to it in future answers. I note the apparently exhaustive list of what it does in the bottom 95% of the output of help version but that's not an easy way of summarizing what it's likely to do as a general rule.
                    Thanks for getting back and providing an example. I believe that my previous statement

                    Version control ensures backwards compatibility; it does not disable features that might be added in the future
                    remains correct but is (highly) misleading.

                    Obviously, I cannot know what the rules are inside StataCorp, but I believe it is safe to say that the general rule for version control is this: code that you write under version control (set to the most recent version of Stata that you are using) continues to work in future versions of Stata. The exception is that bugs, that is, wrong results, will not be reinstated under version control. Further, version control does not (and cannot) imply anything about future changes made to Stata. In other words, disabling new features is not the intended behavior; it might, however, happen occasionally if new features could break old code.

                    William's example in #9 is puzzling in this respect. I have no idea how an option that has not been previously available could break old code. I have no clue why this was put under version control.

                    Best
                    Daniel
                    Last edited by daniel klein; 21 Feb 2019, 23:12.
