  • Question on the possibility of testing impacts of "set varabbrev off, permanently"

    Dear Stata Forum users,

    I recently ran into a problem in my Stata code: variable abbreviation caused a bug that depends on the order of renames:
    Code:
    capture rename X Y
    capture rename X1 Y1
    In the above instance if I do not have the variable X, but I do have X1, X1 gets renamed to "Y" and the second line fails.

    The solution was to use "set varabbrev off". However, the entire Stata script (a series of .do files linked using "include/do") is thousands and thousands of lines of code. One concern is that setting abbreviations off may solve this particular issue but cause issues elsewhere in the code. I was hoping to run the code with abbreviations on and off for comparison and testing (i.e., do one run each and compare outcomes). However, due to the nature of the code, it is quite difficult to properly test whether setting abbreviations off will impact it: we have many "capture" lines because any specific execution uses only some of the lines (think of a complicated model with constantly changing inputs).

    One solution is to use "set varabbrev off" then "set varabbrev on" after the rename, but I was hoping to use "set varabbrev off, perm" in the starting lines to entirely do away with abbreviations which might hurt the code executions later on down the line.

    My question is: Does Stata keep a count of how many abbreviations are used, or does it record anything when an abbreviation is used?

    If not, would anyone have any good strategies to figure out how changing this setting might impact the Stata code?

    Thank you for your time and patience,
    Henry

  • #2
    For specific command lines, you can wrap them in a novarabbrev block:

    Code:
    novarabbrev {
        capture rename X Y
        capture rename X1 Y1
    }


    • #3
      Thank you Andrew! We have implemented something similar by "set varabbrev off" before the rename and "set varabbrev on" after. However, I was hoping to "set varabbrev off, perm" at the start of the script and leave it there. I can foresee further issues with abbreviations as we develop this model further, so I wanted to "nip the issue in the bud", so to speak. Right now, the issue is that there is already a ton of code to review and go through to assess whether turning abbreviations off would change anything, and it can be quite painful (and time consuming!) to do that. What I was curious about was whether there is an easier way to see how setting abbreviations off would impact the code. For example, if Stata recorded how many abbreviations were used in an object, I could simply retrieve that object for different sections of code and compare. In other words, "is there a clever and time-efficient way to figure out if setting abbreviations off will mess up my code?". Thanks again!


      • #4
        There are no such tools like you’re imagining. Variable abbreviations are decoded/expanded at run-time based on the existing variables at that time. It seems like you might be able to avoid the issues of abbreviations entirely by using some combination of writing full variable names, expanding variable abbreviations in advance (help unab), or defensive programming.
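
        A minimal sketch of the defensive approach for the renames in #1, assuming a toy dataset in which only X1 exists (the data here are hypothetical). The exact option of confirm variable should make the check ignore abbreviations, so X no longer matches X1 even with varabbrev on:

        Code:
        * hypothetical data: X1 exists, X does not
        clear
        set obs 1
        generate X1 = 1

        * rename X only if a variable named exactly X exists
        capture confirm variable X, exact
        if _rc == 0 {
            rename X Y
        }

        * rename X1 only if a variable named exactly X1 exists
        capture confirm variable X1, exact
        if _rc == 0 {
            rename X1 Y1
        }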

        In your example in #1, couldn’t you avoid the whole problem by reversing the order of your two rename statements? Could you work with different variable names otherwise?
        Last edited by Leonardo Guizzetti; 18 Jun 2022, 10:46.


        • #5
          If you permanently turn off variable abbreviation, any potential issue will manifest itself as an error. You will not notice this if you suppress the error messages using capture. Otherwise, I do not foresee any hidden effects, but others may have different ideas.
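
          A small illustration of this, assuming Stata's shipped auto dataset and using mp as a deliberately abbreviated name for mpg. capture noisily is used here only so that the error message is shown without stopping the do-file:

          Code:
          sysuse auto, clear
          local old = c(varabbrev)        // remember the current setting
          set varabbrev off
          capture noisily summarize mp    // mp abbreviates mpg; expected to fail while varabbrev is off
          display "return code: " _rc     // nonzero if the abbreviation was rejected
          set varabbrev `old'             // restore whatever was set before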


          • #6
            What Andrew Musau says in #5 is true as far as it goes. But it obscures a larger point. If you -set varabbrev off, permanently-, and if your code calls programs that were not written with that in mind, those programs may fail. Yes, if not -capture-d you will learn about it through aborted execution and error messages. But you may then have to trace deep into the call chain to figure out how to fix that problem.

            FWIW, I have always thought that variable abbreviation (except explicit abbreviation with wildcards to refer to groups of variables) is a bad idea--a setup for obscure bugs. I never rely on it in my own code. But others love to save keystrokes and use it liberally, and it is sometimes used in programs written by others that I use. So I have learned, reluctantly, to live with it.

            What I think is a bigger problem in what you show in #1 is an unsafe use of -capture-. The purpose of -capture- is to allow Stata to skip over some code if it is expected, and is not a problem, that some condition(s) for the successful execution of the -capture-d code will not be met. But -capture- is a very blunt instrument: it allows Stata to ignore any problem that arises in that code, even ones that are not anticipated and are not OK. For that reason, in most situations, the use of -capture- should be followed by testing the captured return code to verify that the reason the command was skipped is the one that was anticipated and acceptable, and not some other problem. Thus the structure would be:

            Code:
            capture command or block
            if !inlist(c(rc), 0, comma-separated list of return codes associated with allowable exceptions) {
                display as error "Unanticipated error in command."
                exit c(rc)
            }
            (There are variations on this theme. Sometimes the originally -capture-d command was itself a -confirm- or -assert- meant to determine if subsequent commands are to be executed, in which case a separate branch for c(rc) == 0 to execute those subsequent commands would appear in the code.)

            Anyway, the use of -capture- without some kind of additional structure is unsafe in most circumstances.
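
            A concrete instance of this pattern for the rename in #1, as a sketch only: it assumes that a missing variable is the one acceptable failure and that rename reports it with return code 111 (variable not found). It does not catch the abbreviation problem by itself, because with varabbrev on the rename of X1 succeeds with return code 0.

            Code:
            capture rename X Y
            local rc = _rc                  // save the return code before anything else runs
            * 0 = success, 111 = variable not found (assumed to be the only acceptable failure)
            if !inlist(`rc', 0, 111) {
                display as error "Unanticipated error in rename X Y (rc = `rc')."
                exit `rc'
            }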


            • #7
              I could not agree more with what Clyde says about the proper use of capture. I do like variable abbreviations for interactive/command-line use, but I came to prefer no variable abbreviations in do-files (or programs). In your situation, I would set varabbrev off in the master do-file and review all occurrences of capture in the code. It will be painful and time-consuming but it will save you time and trouble in the future when you build upon your existing script.

              I have two more things to say about the variable abbreviations setting. First,

              Code:
              set varabbrev off, permanently
              will change the system settings of Stata and affect all do-files and all programs run in the future. Do not do that unless you are the only one running your code and you run your code exclusively on your own machine/copy of Stata. If you want to disable variable abbreviations for your do-file(s)/programs/application, that is fine but let others make their own choices. Option permanently should probably not be used here.

              Second, coding

              Code:
              set varabbrev off
              ...
              set varabbrev on
              is a bad idea. The code will conclude allowing variable abbreviations irrespective of whether variable abbreviations were allowed before. Consider

              Code:
              set varabbrev off // <- clever
              ... // <- many lines of code; perhaps calls to other do-files
              set varabbrev off
              ...
              set varabbrev on // <- bad idea, reverses the initial setting
              If you want to enable or disable variable abbreviations for certain lines of code, use novarabbrev or varabbrev as suggested in #2.
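
              If set varabbrev is used anyway, a sketch of a variant that restores the earlier setting instead of forcing it on, assuming c(varabbrev) holds the current setting as on or off (the local name old_varabbrev is only illustrative). If an error aborts execution in between, the restoring line is never reached:

              Code:
              local old_varabbrev = c(varabbrev)   // remember the setting in force
              set varabbrev off
              capture rename X Y
              capture rename X1 Y1
              set varabbrev `old_varabbrev'        // put back whatever was set before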
              Last edited by daniel klein; 19 Jun 2022, 00:55.
