  • Use a substring of a variable name to refer to likewise named variables

    Dear forum,
    I have a number of datasets, each consisting of various years. On each file I run the same DiD regressions. The files have identical structure and only differ in their interaction variables' naming. For example, a data set for the years 2004-2009 with DiD from 2007 has variables named ocpagr2007, ocpind2007 etc.

    I want Stata to infer the year of the interaction from the variable name. The following has failed:
    Code:
    ds ocpagr2*
    local interaction_year substr(r(varlist), 7, 8)
    display `interaction_year'
    
    /*Some other code
    ..
    ..*/
    
    reg lrealhrlywage age agesqr schooly schoolysqr married male muslim ocpagr* ocpcns* ocpind* i.yearsur [pweight=weight] if occup1d=="9" & outlier==0
    
     test ocpagr = ocpagr`interaction_year'
    I have received the following error:
    Code:
    unknown function ocpagrsubstr()
    r(133);
    Despite the fact that the "display `interaction_year'" above does print "2007". Also, when I manually write the bottom line as " test ocpagr = ocpagr2007", it works.
    I suspect that the macro doesn't store the literal value of "r(varlist)", and so the value is changed/"emptied" after I go on with running some other code.

    I did a fair share of googling but couldn't figure out how to overcome this issue. Thanks a lot!
    Best,
    Yuval

  • #2
    You have a function within the local. Consider this

    Code:
    local myval substr("val1936", 4, 4)
    di `myval'
    di `"`myval'"'
    which gives you

    Code:
    . di `myval'
    1936
    
    . di `"`myval'"'
    substr("val1936", 4, 4)
    The following forces the function to be evaluated:

    Code:
     di `=`myval''
    1936
    So you want

    Code:
    test ocpagr = ocpagr`=`interaction_year''


    • #3
      Dear Andrew,
      Thanks for your help. Writing "`=`interaction_year''" gave rise to a new "type mismatch" error:
      Code:
      type mismatch
      
       ( 1) = 0
             Constraint 1 dropped
      
             F(  0,  5606) =       .
                  Prob > F =         .
      I tried to find the type of my local macro through variations of display "`: type interaction_year'", but that didn't work. Applying tostring/destring to the macro in the following manner also failed:
      Code:
      local interaction_year destring/tostring substr(r(varlist), 7, 8), replace
      I would be grateful for more words of advice
      Thanks


      • #4
        Sorry, but I don't think #3 makes anything very clear. Apart from a specific point below, the advice it allows is just generic. I guess you're bringing to Stata a background in programming other languages, which is often helpful and often a source of distraction.

        But a local macro is just a string with a name. The interpretation of, e.g., a local macro containing, say, "42" as numeric is in the mind of the programmer and not part of the definition. You need to evaluate such a macro before its contents can be treated as numeric.
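
        A minimal sketch of this point (the macro names num, txt and val are only illustrative): a macro defined without "=" stores text as typed, while defining it with "=" evaluates the expression first and stores the result.

```stata
local num 42              // stores the two characters 42
local txt `num' + 1       // plain copy: stores the text "42 + 1"
local val = `num' + 1     // "=" evaluates the expression: stores 43
macro list _num _txt _val // shows the stored contents without evaluating them
```

        Here macro list is used rather than display, because display would evaluate `txt' as an expression and hide the difference.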

        What you put in your local macro is not something I would ever expect to see experienced Stata programmers putting in a local macro.

        I want to be more constructive but I think you need to show much more of your code. A very wild guess is that you are trying to program what Stata will do for you.


        • #5
          Thank you Nick, sorry I wasn't clear enough. My question does feel silly as I haven't found any other threads with a similar problem...
          This is the code, I painted the relevant parts in green:
          Code:
          //fetching interaction year:
          ds ocpagr2*
          local interaction_year substr(r(varlist), 7, 8)
          display `interaction_year'
          
          local slist1 "agr" //first we only include the agriculture sector
          local slist2 "`slist1' ind cns" //including the industrial and construction sectors
          local snumber=2 //number of sector lists
          local regvars "lrealhrlywage age agesqr schooly schoolysqr married male muslim `slist`i'reg' i.yearsur [pweight=weight]" //all the regression
          
          forvalues i = 1/`snumber' { //range contains the above sector variables (currently, slist1 & slist2)
           local slist`i'reg `slist`i'reg'
           foreach j of local slist`i' {
            local slist`i'reg "`slist`i'reg' ocp`j'*" //the * sign refers both to the sector variable and its interaction variable
           }
           
           display upper("`i'. regressions controling for the occupations: `slist`i''")
           
           //regressions
           //whenever using the test command, not writing the regression explicitly will produce an error. Thus I can't use the macro `regvars'
           display "`i'.1. Non-professionals"
           reg lrealhrlywage age agesqr schooly schoolysqr married male muslim `slist`i'reg' i.yearsur [pweight=weight] if occup1d=="9" & outlier==0
           
           test ocpagr = ocpagr`=`interaction_year''
           if `i'>1 {
            test ocpind = ocpind`=`interaction_year''
            test ocpcns = ocpcns`=`interaction_year''
            estimates store m1 //results are stored for display through the "estout" command
           }
           
           display "`i'.2. Professionals ALL"
           reg lrealhrlywage age agesqr schooly schoolysqr married male muslim `slist`i'reg' i.yearsur [pweight=weight] if (occup1d=="5"|occup1d=="6") & outlier==0
          test ocpagr = ocpagr`=`interaction_year''
           if `i'>1 {
            test ocpind = ocpind`=`interaction_year''
            test ocpcns = ocpcns`=`interaction_year''
            estimates store m2
           }
           
           if `i'>1 {
            esttab m1 m2, r2 nonumbers mtitles("Non-skilled" "Skilled") //without stating which models to display, write before "reg" command eststo: to store
           }
          (You're welcome to ignore the comments inside the code, I wrote them for future me..)
          The test checks whether the coefficient on a variable (ocpagr) differs significantly from the coefficient on its interaction variable (ocpagr`interaction_year', in our case ocpagr2006: working in agriculture from 2006 onwards).

          I hope I haven't missed any other relevant stuff.
          Thanks a lot


          • #6
            Consider first the following examples.
            Code:
            . sysuse auto, clear
            (1978 Automobile Data)
            
            . ds rep*
            rep78
            
            . local y substr(r(varlist), 4, 2)
            
            . macro list _y
            _y:             substr(r(varlist), 4, 2)
            
            . local y = substr(r(varlist), 4, 2)
            
            . macro list _y
            _y:             78
            The first example mimics the start of the code in post #5. I use macro list to show the contents of the local macro y because the display command tries too hard to completely evaluate what it is displaying. We see that the contents of the local macro y are, as Andrew pointed out in post #2, a piece of code that needs to be evaluated, not the actual year in question.

            The second example shows what happens when you use an "=" in setting a local macro. It then expects an expression to the right of the "=" and it evaluates that expression and places the result into the local macro y. This is what you want to do.

            Now consider the following example.
            Code:
            . sysuse auto, clear
            (1978 Automobile Data)
            
            . ds rep*
            rep78
            
            . return list
            
            macros:
                        r(varlist) : "rep78"
            
            . quietly regress price length
            
            . return list
            
            scalars:
                          r(level) =  95
            
            matrices:
                          r(table) :  9 x 2
            We see that the returned macro r(varlist) created by ds is lost when the regress command creates its own set of return values. That is what happened in your code. The general principle is that you want to evaluate references to return results as close to the command that created them as possible, before they get overwritten by other commands that return results.
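
            As a small sketch of that principle, again with the auto data (the macro name saved is only illustrative): copying r(varlist) into a local macro immediately after ds keeps the value available after later commands have replaced the r() results.

```stata
sysuse auto, clear
ds rep*
local saved `r(varlist)'        // copy the result straight after ds
quietly regress price length    // replaces the r() results left by ds
display "`saved'"               // the copy is unaffected by regress
display "`r(varlist)'"          // r(varlist) is no longer defined, so nothing is shown
```

            The local macro holds the literal text that r(varlist) had at the moment of copying, so it does not matter what runs afterwards.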

            Your code in post #5 needs to be something like the following.
            Code:
            //fetching interaction year:
            ds ocpagr2*
            local interaction_year = substr(r(varlist), 7, 8)
            display `interaction_year'
            
            ...
            
            test ocpagr = ocpagr`interaction_year'
             if `i'>1 {
              test ocpind = ocpind`interaction_year'
              test ocpcns = ocpcns`interaction_year'
              estimates store m1 //results are stored for display through the "estout" command
             }
             
             display "`i'.2. Professionals ALL"
             reg lrealhrlywage age agesqr schooly schoolysqr married male muslim `slist`i'reg' i.yearsur [pweight=weight] if (occup1d=="5"|occup1d=="6") & outlier==0
            test ocpagr = ocpagr`interaction_year'
             if `i'>1 {
              test ocpind = ocpind`interaction_year'
              test ocpcns = ocpcns`interaction_year'
              estimates store m2
             }
            
            ...
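
            A hedged variant of the first lines, assuming every interaction variable is named with a six-character prefix followed by a four-digit year. The toy dataset and the macro name firstvar are hypothetical and only there to make the sketch runnable. Taking the first match and exactly four characters keeps the macro to a single year even if ds returns more than one variable.

```stata
clear
set obs 1
generate ocpagr = 0                                   // toy stand-ins for the real variables
generate ocpagr2007 = 0

ds ocpagr2*
local firstvar : word 1 of `r(varlist)'               // first matching name
local interaction_year = substr("`firstvar'", 7, 4)   // characters 7 to 10 of that name
confirm variable ocpagr`interaction_year'             // stops with an error if no such variable exists
display "`interaction_year'"
```

            With real data only the last four lines would be needed, placed before any command that overwrites the r() results.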


            • #7
              Thank you for your fantastic explanation, William. It works great. Take care
