
  • workflow for debugging/writing/re-running foreach

    Hi all,

    Until I recently started with Stata, I had programmed almost entirely in object-oriented languages, and some of my workflow habits are not serving me well. I'm looking for advice on how to go about writing foreach loops, which will also apply when I want to write Stata programs.

    I'm writing a foreach that creates some count variables. In the "figuring it out" phase, the foreach won't always finish due to some syntax error or similar. In R, I would trace the error, make any necessary changes, and re-run the loop. This works in R because its default behaviour is to overwrite the contents of an object when the object is reassigned.

    In Stata, if I generate some vars in a foreach that doesn't finish, I need to drop all the vars that were created to that point in the foreach before I can re-run it after making changes. If I forget to drop one, it throws an error because "generate" can't be used once a particular var is already defined, and the cycle continues. How do more experienced Stata programmers work with this aspect of the language?

    Kind regards,

    Matt





  • #2
    You haven't provided a working example, but I guess Stata's capture command will help solve your problem.
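
    A minimal sketch of how capture might be used here, assuming the loop creates one count variable per input variable. The auto dataset and the n_ prefix are illustrative assumptions, not anything from the original question:

    ```stata
    sysuse auto, clear

    foreach v of varlist price mpg rep78 {
        // drop any leftover from an earlier, failed run;
        // capture suppresses the error if the variable does not exist yet
        capture drop n_`v'

        // number of nonmissing values of `v' within each level of foreign
        bysort foreign: egen n_`v' = count(`v')
    }
    ```

    With this pattern the loop can be re-run after every fix without dropping anything by hand. One caveat: drop accepts abbreviated variable names by default, so the names used after capture drop should not be abbreviations of variables you want to keep.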



    • #3
      Regarding your object-oriented programming background, you might be interested in Mata, which is probably closer to what you have done. That is not to say that you want to program in Mata exclusively or even that you want to start with Mata.

      As for Stata, you will find that the manuals are very well written. Start skimming through [U] 18 Programming Stata and also the Stata Programming Reference Manual.

      Please provide a minimal example for more specific advice.


      Edit: Some advice based on personal experience: Given your background, you might be tempted to jump into Stata's classes. Don't. If you want class-style programming, it is better to look into Mata.
      Last edited by daniel klein; 26 Apr 2024, 01:57.



      • #4
        You don't mention -replace-, but I assume you are aware of it. The distinction between -generate- and -replace- is a misfeature that goes back to the origins of Stata, and isn't likely to be fixed now. It seems less of a burden in .do files, but is still a minor annoyance. Stata does have one feature to help with debugging: the -trace- setting of the -set- command (-set trace on-). It will help you determine which line in a -foreach- loop is causing a problem.
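
        A short sketch of tracing a loop, assuming the auto dataset and hypothetical log_ variable names chosen only for illustration:

        ```stata
        sysuse auto, clear

        set tracedepth 1    // keep the trace from descending far into called programs
        set trace on        // echo each line as it is executed, with macros expanded

        foreach v of varlist price mpg {
            capture drop log_`v'
            generate log_`v' = ln(`v')
        }

        set trace off
        ```

        If the loop stops with an error, the final line is never reached, so -set trace off- has to be typed by hand before carrying on.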



        • #5
          The distinction between -generate- and -replace- is a misfeature that goes back to the origins of Stata, and isn't likely to be fixed now.
          That's an opinion, or if you like a considered personal judgment, but it is not a factual statement.

          I can't see that it was impossible or beyond the skill of the original programmers of Stata to have one single command that creates or modifies variables. After all, there have been community-contributed wrapper commands that do exactly that, although oddly none seems to have become popular.

          The asymmetry was deliberate: to make it hard to overwrite (often highly valuable) data, noting that replace can't be abbreviated. New variables that are useless can always be dropped or ignored. Messed up data usually have to be read in again.

          Sure, experienced users as well as new users can bump into the difficulty that you can't generate a variable that already exists, but I think that's saved users from many disasters.



          • #6
            It's true that generate and replace go back to Stata's beginnings. What I am discussing is the argument that the distinction is a misfeature.



            • #7
              As usual, I agree with what Nick says in #5 and #6. I'll just add that if you need to iteratively refresh variables inside a loop, and -drop-ping them all at or before the end of the loop is undesirable for whatever reason, there is an alternative. -gen-erate them all as missing values before entering the loop (or, if missing is itself informative in what you are doing, as some other value that is easily distinguished from a valid value), and then use just -replace- to update their values inside the loop.
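
              A minimal sketch of that pattern, assuming the auto dataset and a hypothetical variable named mean_price. The -capture drop- line is an addition for re-running the whole block and is not part of the suggestion above:

              ```stata
              sysuse auto, clear

              // create the variable once, as missing, before the loop
              capture drop mean_price
              generate mean_price = .

              // inside the loop only -replace- is used, so no iteration
              // can fail because the variable already exists
              foreach g of numlist 0 1 {
                  summarize price if foreign == `g', meanonly
                  replace mean_price = r(mean) if foreign == `g'
              }
              ```

              Any observation still missing after the loop was not touched by any iteration, which makes gaps in the loop's logic easy to spot.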
