  • Is there a way to show local/global values in your logs?

    For example, in the following code fragment ...

    Code:
    sysuse auto, clear
    local x "turn"
    sum price `x'
    ... is there a way to produce a log file that expands the x into "turn" and thus shows ...
    Code:
    sum price turn
    at some point?

    The reason I ask is that I need to find the differences between my code and a replication package which uses so many locals and globals that even just figuring out in which dofile a particular data file is produced is nigh impossible. The idea I had was to run the entire package with a log open and tracing on and then search the log rather than the code, but it doesn't show the macro values.

    PS: I could of course add display sum price `x' style lines before each local usage, but the replication package is too large for that to be feasible unless I automatically generate the code.

  • #2
    I fear the short answer is No. You could set trace on, but then your logs would typically explode in size.

    The question raises three points indirectly.

    1. Programming style is notoriously personal, but one style for local and global macros is to gather definitions early and together and presumably those who like this style could underline that they did this, so you have one obvious place to look. In contrast, my own preference is to define local macros just before they are needed. So look upwards in the log.

    2. I see many examples of use of local macros that are not needed at all. Rather than cite anybody's code, here is a parody

    Code:
    local possiblepets cat dog dinosaur 
    
    foreach pet of local possiblepets { 
         di "one possible pet is `pet'"
    }
    whereas if that is it -- and you need to use that local macro just once -- then you don't need it at all.

    Code:
    foreach pet in cat dog dinosaur { 
         di "one possible pet is `pet'"
    }
    To the objection that that style wires certain choices into the code, I clearly agree, but it does so more concisely than the alternative.

    Again, I've often called up this homely analogy

    I have a pen.

    I put it in a box

    I need my pen.

    I take my pen out of the box.

    I now have my pen again.

    The box is the local macro. In coding you need a reason to put stuff in a box. If the reason is that defining once allows many uses, that's good.

    A variant on the same thing is people picking up say r class and e class results, then putting them in a local macro, and then displaying them. Where did that come from? (There could even be a small loss of precision doing that.)

    I call the sometimes needed surgery cutting out the middle macro.

    3. Many active people here are negative about global macros, and me too, although Clyde Schechter is #1. A global macro by definition could be defined in any space that is visible to Stata, so a log could easily not contain a definition at all. This to me is a standard but compelling objection to global macros.



    • #3
      Originally posted by Jesse Wursten
      PS: I could of course add display sum price `x' style lines before each local usage, but the replication package is too large for that to be feasible unless I automatically generate the code.
      Going on this, you could identify locals by the presence of the grave accent. Here is one technique:

      Code:
      *DO FILE EXAMPLE
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str24 v1
      "local x turn"            
      "local z disp"            
      "local b mpg"             
      "sysuse auto, clear"      
      "sum price `x'"           
      "regress price mpg weight"
      "margins, dydx(`b')"      
      "graph bar, over(rep78)"  
      end
      
      *USE IMPORT DELIMITED TO IMPORT IT AS A TEXT FILE
      *import delimited "myfile.do", clear
      g v0=""
      replace v0= "display "+`" ""' + v1 + `" ""' if  ustrregexm(v1, "`")
      g obsno=_n
      reshape long v, i(obsno) j(which)
      drop if missing(v)
      sort obsno which
      keep v
      Res.:

      Code:
      . l, sep(0)
      
           +--------------------------------+
           |                              v |
           |--------------------------------|
        1. |                   local x turn |
        2. |                   local z disp |
        3. |                    local b mpg |
        4. |             sysuse auto, clear |
        5. |      display  "sum price `x' " |
        6. |                  sum price `x' |
        7. |       regress price mpg weight |
        8. | display  "margins, dydx(`b') " |
        9. |             margins, dydx(`b') |
       10. |         graph bar, over(rep78) |
           +--------------------------------+



      • #4
        There is an interesting widespread convention that -- in my experience -- most Stata programmers write local and not loc or loca.

        Indeed, programmers who long since learned about local macros may not have realised even that the command could be abbreviated, and (me too) I had to look up just now precisely what is allowed.

        I commend that use of the full name -- with the side-effect that search for local is almost always precisely what is needed.

        local is not too long to type and evocative in a way that loc and loca fail to be, or so I suggest.

        (Another broad consensus is that gen is good for generate but g is too cryptic. I have one friend who writes gene but he was a proper biologist in earlier life.)
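A minimal sketch of that search, in Python to match the script posted later in this thread. It lists explicit macro definitions with their line numbers. The abbreviations loc and loca come from the paragraph above; accepting global abbreviated down to gl is my assumption, and the function name is hypothetical. Macros created by other commands (foreach, tempvar and so on) are not caught.

```python
import re

# "local", "loca", "loc" as named in the post above; treating "global" as
# abbreviable down to "gl" is an assumption of this sketch.
DEFINITION = re.compile(r'^\s*(loc(?:al?)?|gl(?:o(?:b(?:al?)?)?)?)\s+(\w+)')

def find_macro_definitions(lines):
    """Return (line_number, kind, name) for each explicit macro definition."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = DEFINITION.match(line)
        if match:
            kind = 'local' if match.group(1).startswith('loc') else 'global'
            hits.append((number, kind, match.group(2)))
    return hits
```

Feeding it the lines of each do-file in a package gives one place to look up where a given macro name is first set.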



        • #5
          I used to write gen, but I have picked up a bad habit after seeing g floating around!



          • #6
            One might expand on #3 by adding some mark to the "display" records, like:
            Code:
            replace v0= "display "+`" ""' + v1 + `" //expanded//""' if  ustrregexm(v1, "`")
            and use this for some further processing, probably not like the following terrible hack which might not scale:
            Code:
            clear
            input str24 v1
            "local x turn"            
            "local z disp"            
            "local b mpg"             
            "sysuse auto, clear"      
            "sum price `x'"           
            "regress price mpg weight"
            "margins, dydx(`b')"      
            end    
            
            tempfile text
            
            outfile using "`text'", replace noquote 
                
            import delimited using "`text'", clear delim("§§§")
            generate v0=""
            
            replace v0= "display "+`" ""' + v1 + `" //expanded//""' if  ustrregexm(v1, "`")
            
            generate obsno=_n
            reshape long v, i(obsno) j(which)
            drop if missing(v)
            sort obsno which
            keep v
            
            ****************************************************************************************
            
            outfile using "`text'", replace noquote  
            
            quietly {
            log using test , replace text
            noisily do  "`text'" // exclude 
            log close
            }
            
            mata : 
            
            f = cat("C:\Users\baa\stata\test.log")  
            f = select(f, regexm(f, "^\.|//expanded//"))
            f = select(f, !regexm(f, "// exclude")) 
            f = select(f, !regexm(f, "display.*//expanded//")) 
            f = select(f, !ustrregexm(f,"^\s*\.\s*$")) 
              
            end 
            
            clear
            getmata f 
            drop if strpos(f[_n-1],"//expanded//")
            replace f = subinstr(f,"//expanded//","",1)
            
            tempfile test
            outfile using "`test'", replace noquote  
            
            type "`test'" //
            Code:
            . local x turn
            . local z disp
            . local b mpg
            . sysuse auto, clear
            sum price turn
            . regress price mpg weight
            margins, dydx(mpg)
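For anyone who would rather post-process the log outside Stata, here is a rough Python sketch of the same filtering. It assumes the log was produced with the //expanded// marker shown above, that echoed commands start with "." and that each expanded line is followed by the original command. The function name is hypothetical.

```python
MARKER = '//expanded//'

def expand_log(log_lines):
    """Keep command lines, swapping each macro-using command for its expansion."""
    kept = []
    skip_next_command = False
    for line in log_lines:
        text = line.rstrip('\n')
        is_command = text.startswith('.')
        if MARKER in text:
            if is_command:
                continue  # the echoed display command itself
            # Output of display: the command with macros expanded
            kept.append(text.replace(MARKER, '').rstrip())
            skip_next_command = True
        elif is_command:
            if text.strip() == '.':
                continue  # empty prompt line
            if skip_next_command:
                skip_next_command = False  # original command, already shown expanded
                continue
            kept.append(text)
    return kept
```

As in the Mata version, ordinary command output is dropped, so the result is a list of commands only.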



            • #7
              What I do is put every local macro I define at the top of my do-file. That way, everything that follows is a byproduct of the code, and the macros are easy to follow.

              The only time I use globals is when I define a master pathway for a project (i.e., a replication package), but other than that I only use locals.



              • #8
                I'm not fluent enough with regular expressions to do what follows myself, but going back to Andrew Musau's idea at #3, couldn't one read in the whole do-file as a strL, make an appropriate call to -ustrregexra()- (or such) to change it, and then write out the strL to a file?
                Code:
                clear
                set obs 1
                gen strL s = fileread("olddofile.do")  // up to the 2e9 byte limit of a strL
                replace s = ustrregexra(s, ......)
                gen b = filewrite("newdofile.do", s)
                Perhaps creating the right regex would be too difficult this way, but if not, it would demonstrate my view that -fileread()- and -filewrite()- deserve more use and respect. <grin>



                • #9
                  The approach offered by Mike Lacy in #8 is quite workable if you know ahead of time what to search for and what to replace it with. I used it myself once in a rather grand fashion when a change in the architecture of a network I was using required me to change certain path names in the hundreds of do-files on a particular server. In conjunction with Robert Picard's -filelist- (available on SSC), it was just a few short lines of code and it executed in the blink of an eye.

                  But I don't have the sense that this is applicable to O.P.'s question here, because I think the problem is that he doesn't even know precisely what text he's looking for. At best he thinks it will mostly require searching for mentions of -local-. But he is not certain even of that because there are other ways that local macros can be created. And as Nick pointed out, globals might even be defined in other files altogether!
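The bulk path change described in the first paragraph was done in Stata with -filelist-, -fileread()- and -filewrite()-. Purely as an illustration of the same idea, here is a Python sketch that walks a folder tree and swaps one known string for another in every do-file. The function name, the UTF-8 encoding and the .do extension filter are assumptions of this sketch, and it overwrites files in place, so work on a copy.

```python
import os

def rewrite_paths(root, old, new, extension='.do'):
    """Replace `old` with `new` in every file under `root` ending in `extension`.

    Returns the paths of the files that were changed.
    """
    changed = []
    for folder, _subfolders, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extension):
                continue
            path = os.path.join(folder, filename)
            with open(path, 'r', encoding='utf-8') as handle:
                text = handle.read()
            if old in text:
                # Only rewrite files that actually contain the old string
                with open(path, 'w', encoding='utf-8') as handle:
                    handle.write(text.replace(old, new))
                changed.append(path)
    return changed
```

This only works because the search string is known in advance, which is exactly the condition that may not hold for the original question.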



                  • #10
                    Thank you all for your input. I ended up writing a small Python script that does more or less the same as #6, but for an entire folder of do-files, and it catches globals as well. It also handles switches between #delimit cr and #delimit ; blocks. I've pasted it below for anyone interested. It seems to work for my use case, but there might be some edge cases that trip it up (e.g. /// splits).

                    As for the use of locals and globals, my personal opinion is that they are useful when they clarify intent (e.g. if it's not obvious that you are looping over potential pets) or refer to often used constants (e.g. a list of all US states or root directories). In any case, they should have obvious names and preferably no abbreviations. But when your key regression looks like this ${rtype`s'} `Y'countpc ${treatbefore} one if year>=`b' & cleansample==1 ${weight`Y'`b'} , a(${a`s'} ${control} ${controlafter} ${controlf} ${window} ) cluster(statenum)"' where even the command is a macro, something has gone well and truly wrong ...

                    I have to admit I'm partial to using globals in personal work because they play nicely with line-by-line execution, which is always a bit of a pain with locals.

                    Code:
                    # Technicalities
                    #- Imports
                    import os
                    
                    #- Constants
                    MACROINDICATORS = ['$', '`']
                    
                    #- Folder containing dofiles
                    folder = r'path\to\folder' # Don't include a final slash
                    newFolder = os.path.dirname(folder) + os.sep + os.path.basename(folder) + '_new'
                    os.makedirs(newFolder, exist_ok=True)
                    
                    # Code
                    for filename in os.listdir(folder):
                        # DEBUG: filename = 'Appendix_Table_A1_A2.do'
                        pathIn = os.path.join(folder, filename)
                        pathOut = os.path.join(newFolder, filename)
                    
                        #- Load dofile
                        with open(pathIn, 'r') as file:
                            lines = file.readlines()
                    
                        #- Parse alternative delimiters
                        semiStart, semiEnd = None, None
                        filteredLines = []
                        for idx, line in enumerate(lines):
                            if '#delimit ;' in line:
                                semiStart = idx
                            elif '#delimit cr' in line:
                                semiEnd = idx
                                if semiStart is not None:
                                    # Join the ;-delimited block, then split it into one statement per line
                                    linesToUse = lines[semiStart+1:semiEnd]
                                    linesToUse = [line.replace('\n', '') for line in linesToUse]
                                    combinedLines = ' '.join(linesToUse).split(';')
                                    filteredLines.extend(stmt.strip() + '\n' for stmt in combinedLines if stmt.strip())
                    
                                    semiStart, semiEnd = None, None
                            elif semiStart is not None:
                                pass  # inside a ;-delimited block, handled when '#delimit cr' is reached
                            else:
                                filteredLines.append(line)
                        
                        #- Add display lines
                        newFileLines = []
                        for line in filteredLines:
                            if any(macroIndicator in line for macroIndicator in MACROINDICATORS):
                                filteredLine = line.replace('\n', '')
                                newFileLines.append(f'display `"{filteredLine}"\'\n')
                            newFileLines.append(line)
                    
                        #- Save file
                        with open(pathOut, 'w') as file:
                            file.writelines(newFileLines)
                    Last edited by Jesse Wursten; 13 Dec 2022, 02:28.
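One possible way to handle the /// splits mentioned above is to join continuation lines before the display lines are added. This is only a sketch: the function name is hypothetical, and it assumes /// never appears inside a string literal, which it would misread as a continuation.

```python
def join_continuations(lines):
    """Merge Stata lines that end in a /// continuation into single lines."""
    joined, pending = [], ''
    for line in lines:
        text = line.rstrip('\n')
        # Everything after /// on a line is a comment and is discarded
        head, marker, _comment = text.partition('///')
        if marker:
            piece = head.strip() if pending else head.rstrip()
            pending += piece + ' '
        else:
            joined.append(pending + (text.lstrip() if pending else text) + '\n')
            pending = ''
    if pending:
        # File ended in the middle of a continued command
        joined.append(pending.rstrip() + '\n')
    return joined
```

In the script above, this could be applied right after the file is read, e.g. `lines = join_continuations(lines)`, so that each display line shows the complete command.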



                    • #11
                      ref #8:
                      Code:
                      clear
                      input str24 v1
                      "local x turn"                    
                      "local b mpg"            
                      "sysuse auto, clear"      
                      "sum price `x'"          
                      "regress price mpg weight"
                      "margins, dydx(`b')"      
                      end    
                      
                      tempfile text1 text2
                      outfile using "`text1'", replace noquote  
                       
                      display filewrite("`text2'", ustrregexra(fileread("`text1'"), ///
                         "(?m)^(.*\u0060[_\p{letter}0-9]+'.*)$",  /// flag + pattern
                          "display " + char(34) + "$0 //expanded//" + char(34) + char(10) + "$0" ) ) // replacement
                          
                      type "`text2'"
                      
                      sysuse auto , clear // test
                      do "`text2'"
                      Code:
                      type "`text2'"
                      
                      local x turn
                      local b mpg
                      sysuse auto, clear
                      display "sum price `x' //expanded//"
                      sum price `x'
                      regress price mpg weight
                      display "margins, dydx(`b') //expanded//"
                      margins, dydx(`b')
                      Matching globals:
                      Code:
                      clear
                      input str24 v1
                      "global x turn"                    
                      "global b mpg"            
                      "sysuse auto, clear"      
                      "sum price $x"          
                      "regress price mpg weight"
                      "margins, dydx(${abc}`b')"      
                      end    
                      
                      tempfile text1 text2
                      outfile using "`text1'", replace noquote  
                       
                      * 1. define pattern to match at least one global ($name) in line
                       
                      #delim ;
                      
                      local re // to be used in ustrregexra(S1, re, S2)
                      
                      (?mx)  (?# flag m multiline ^$ matches start/stop line
                                 flag x x-mode: allow white space and comments )
                                      
                          ^                               (?# start of string)    
                          .*                              (?# any char zero or more times)
                          \u0024                          (?# Dollar Sign)
                          [\p{letter}\u0060{]             (?# set of first char)
                          [\p{letter}0-9_{}'\u0060]{0,32} (?# set of valid-name-chars and {}`' 0-32 times)
                          .*                              (?# any char zero or more times)
                          $                               (?# end of string)  
                          
                      ; /* end assignment to local re */ #delim cr
                      
                      * 2. search/replace using ustrregexra(S1, re, S2)
                      
                      di  filewrite( ///
                              "`text2'", ///
                              ustrregexra( ///
                                  fileread("`text1'"), /// s1: text
                                  "`re'",              /// re: flag + pattern to match
                                  "display "           /// s2: replacement
                                  + char(34)           /// s2: replacement
                                  + "$0 //expanded//"  /// s2: replacement
                                  + char(34)           /// s2: replacement
                                  + char(10)           /// s2: replacement
                                  + "$0"               /// s2: replacement
                              ) ///
                          )  
                          
                      type "`text2'"
                      Code:
                      global x turn
                      global b mpg
                      sysuse auto, clear
                      display "sum price $x //expanded//"
                      sum price $x
                      regress price mpg weight
                      display "margins, dydx(${abc}`b') //expanded//"
                      margins, dydx(${abc}`b')
                      Last edited by sladmin; 14 Dec 2022, 07:23. Reason: updated per user request
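For comparison, a rough Python counterpart that covers locals and globals with a single pattern. The pattern is a simplification of the two regexes above, not a translation of them: it flags any line containing `...' or a $ followed by a name character, an opening brace or a grave accent. The function name is hypothetical, and the compound double quotes follow the script in #10 so that commands containing ordinary double quotes still display.

```python
import re

# A line qualifies if it contains `...' (local) or $name / ${name} (global)
MACRO_LINE = re.compile(r"^.*(?:`[^'\n]*'|\$\{?[A-Za-z_`]).*$", re.MULTILINE)

def add_display_lines(source):
    """Insert a display command before every line that uses a macro."""
    def with_display(match):
        line = match.group(0)
        # Builds: display `"<line> //expanded//"' followed by the original line
        return 'display `"' + line + ' //expanded//"' + "'" + '\n' + line
    return MACRO_LINE.sub(with_display, source)
```

Reading a do-file into a string, passing it through `add_display_lines` and writing the result to a new file mirrors the fileread/ustrregexra/filewrite chain above.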
