
  • #16
    Originally posted by FernandoRios
    I think attaching stubs to locals is a problem because the resulting variables no longer behave as temporary variables.
    I encountered this problem when I wanted to create multiple tempvars without declaring them explicitly.
    For example:
    Code:
    tempvar newvar
    gen `newvar' =0
    gen `newvar'1 =1
    gen `newvar'2 =2
    If you run this within a program or subprogram, the tempvar "newvar" will not stay in your data. However, `newvar'1 and `newvar'2 will remain there. It is then up to the programmer to figure out how to do the clean-up.
    That is a problem I also encountered recently. Consider, for example, the situation where you want to create multiple scores at once with the predict postestimation command, which requires you to specify a stub. What I did is the following (highly simplified example), where I create temporary variable names with a temporary name stub:
    Code:
    program myscores
        tempname score
        tempvar `score'1 `score'2
        predict `score'*, scores
        sum `score'*
    end
    
    webuse hsng2
    ivregress 2sls rent pcturban (hsngval = faminc i.region)
    myscores
    The temporary score variables are not kept in the data set once the myscores program is exited. Is there anything problematic with this approach?
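For comparison, a more conservative variant avoids the stub altogether by passing explicitly declared temporary variables to predict. This is only a sketch: the program name myscores_explicit is hypothetical, and it assumes the fitted model produces exactly two score variables, as in the ivregress example above with one endogenous regressor.

```stata
capture program drop myscores_explicit
program myscores_explicit
    // declare every temporary variable explicitly
    // (assumption: the model yields exactly two scores)
    tempvar s1 s2
    // pass a list of new variable names in place of stub*
    predict double `s1' `s2', scores
    summarize `s1' `s2'
end

webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
myscores_explicit
```

The trade-off is that the number of scores has to be known in advance, which is exactly what the stub approach tries to avoid.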
    https://twitter.com/Kripfganz



    • #17
      #16 Originally variable names could only be 8 characters long. When temporary names were added Stata also used names 8 characters long and this is still true: It starts at __000000 (I guess it checks whether you are using that yourself, but I did not try that.)

      Code:
      . tempname foo
      
      . di "`foo'"
      __000000
      So with Stata's own rule that temporary variable names start with a double underscore and are 8 characters long, there is scope for at least (10 + 26 + 26 + 1)^6 different temporary names, using 0-9 a-z A-Z _ as possible characters. That seems more than enough for the foreseeable future.
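The count quoted above can be checked directly in Stata. This is a small sketch of the arithmetic only; the local macro name nchars is introduced here for illustration.

```stata
// characters allowed after the leading double underscore: 0-9, a-z, A-Z, _
local nchars = 10 + 26 + 26 + 1
// six free positions in an 8-character name that starts with __
display %15.0fc `nchars'^6
```

That is 63^6 = 62,523,502,209, or roughly 62.5 billion possible names.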

      So, my guess is that Stata will not change those rules. But even if say the company decided to make temporary names 16 characters long, you could still add little subscripts.

      My wild guess is that enough user-programmers have discovered this trick that you're safe. The worst I can imagine is that, under version control, Stata would allow the long-continued practice indefinitely.



      • #18
        I have no idea what Stata is doing in the code Sebastian Kripfganz shows. If you run

        Code:
        webuse hsng2
        ivregress 2sls rent (pcturban hsngval = faminc i.region)
        myscores
        there are 3 scores, and yet Stata cleans up everything, although Sebastian has declared only two tempvars.

        Similarly if you run
        Code:
        webuse hsng2
        ivregress 2sls rent (pcturban hsngval popgrow = popden faminc i.region)
        myscores
        so that there are 4 scores, again everything is fine and Stata cleans up everything, although only two of the tempvars are declared.

        However, if you change Sebastian's code to read

        Code:
        cap prog drop myscores
        program myscores
            tempname score
            tempvar `score'1 
            predict `score'*, scores
            sum `score'*
        end
        
        webuse hsng2
        ivregress 2sls rent  popden faminc i.region
        myscores
        Stata does not clean up the space even though we have exactly 1 score, and we have defined exactly one tempvar.
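One way to see what is left behind after a run is to list any variables whose names begin with a double underscore. This is a minimal sketch to be run right after myscores; the local macro name leftover is introduced here for illustration.

```stata
// unab expands the pattern and fails if no variable matches it
capture unab leftover : __*
if _rc == 0 {
    display as text "still in the data: `leftover'"
}
else {
    display as text "no variables matching __* remain"
}
```

Running this after each of the three variants above makes the difference in clean-up behaviour easy to compare.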



        • #19
          Thank you for this very useful information and for giving us confidence in attaching stubs, Nick.

          However I would like to take a step back to the major problem that became apparent here, and to hear your opinion on it.

          The problem is that the manual says that Stata checks whether the tempvars and tempnames exist already, but in fact Stata does not check for these things: see Daniel Klein #9, my example in #11, and the whole thread by Dirk Enzmann https://www.statalist.org/forums/for...ngerous-advice.

          Stata always starts deterministically from __000000, and the only thing that Stata checks is that there are no clashes among the tempvars and tempnames it assigns itself.

          So basically the safety of the whole temp facility rests on the hope that no normal person would ever choose names such as __000000 and __000001 for their own variables and scalars. It does not matter how many possibilities two underscores plus 6 characters offer in total if Stata always starts deterministically from __000000. All it takes is a user who also starts deterministically from __000000 when assigning names, and there will be plenty of clashes.
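A programmer who wants to be defensive about such clashes can test a temporary name before using it. The following is a sketch only: it deliberately creates a user variable called __000000, makes no claim about which name tempvar will hand out, and simply checks with confirm new variable whether that name is free.

```stata
clear
set obs 1
// a user-chosen name in the style Stata uses for temporary names
generate byte __000000 = 1

tempvar t
display "tempvar resolves to: `t'"

// confirm new variable fails if the name is already taken
capture confirm new variable `t'
if _rc {
    display as error "`t' is already in use; pick another name"
}
else {
    generate byte `t' = 2
}
```

The check costs one line and turns a confusing "already defined" failure deep inside a program into an explicit message.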


          In reply to #17, originally posted by Nick Cox.
          Last edited by Joro Kolev; 23 Mar 2021, 06:56.



          • #20
            It's the nature of any language that many things not forbidden are a bad idea and can bite you. Back at [U] 12.3 there is clear advice:

            The first character of a name must be a letter or an underscore (macro names are an exception;
            they may also begin with a digit). We recommend, however, that you not begin your variable names
            with an underscore. All of Stata’s built-in variables begin with an underscore, and we reserve the
            right to incorporate new variables freely.
            That does not mention temporary names -- this is at an early stage of the documentation -- but whoever wrote that was cleverly giving advice that was in readers' best interests there too.

            It's clear that the company's intent was to use very unlikely temporary names, but very unlikely names are not impossible. Also, and this advice is easy to give but harder to follow, programmers should read enough of the documentation (or, equivalently, enough Stata code) to be fairly sure of what they are doing.

            As I said, I've not checked what Stata checks. Abstractly, you've identified a problem. In practice, this is on the level of "don't play with fire or sharp knives unless you're sure what you're doing".

