  • Keeping numbers in local macro and using them in variable names

    Colleagues,

    I'm working with a messy data set that broadly corresponds to the extract below:
    Code:
    // Data set
    clear
    set obs 100
    local i = 1
    while `i' <= 5 {
          gen messy_var_`i' = runiform()
          label variable messy_var_`i' ///
            `"Some nonsense and figure that I need `=floor((2012-2001+1)*runiform() + 2001)'"'
        local i = `i' + 1
     }
    Using the code below, I would like to keep only the numbers in my local macro. Unfortunately, the years are coded as 2013.14, 2014, or 20132014. Consequently, I would like to extract the first four digits from the macro below so I can use them in a variable name.

    Code:
     * Rename the variables in the varlist with last set of figures
    foreach ivar of varlist `r(varlist)' {
        * Put variable label to macro and keep year
        local varlbl : variable label `ivar'
        * It has to be like that as years have different lengths and formats.
        local vyear = substr("`varlbl'",-7,.)
        * ---- How to take first four figures only
        di "`vyear'"
    }
    Kind regards,
    Konrad
    Version: Stata/IC 13.1

  • #2
    You could probably just extract the last word from the label, then use its first four characters. You could also use regular expressions, though I am anything but an expert on those.

    Code:
    // Data set
    clear
    set obs 100
    local i = 1
    while `i' <= 5 {
          gen messy_var_`i' = runiform()
          label variable messy_var_`i' ///
            `"Some nonsense and figure that I need `=floor((2012-2001+1)*runiform() + 2001)'"'
        local i = `i' + 1
     }
     
     la var messy_var_2 "Some nonsense and figure that I need 2005.14"
     la var messy_var_3 "Some nonsense and figure that I need 20072014"
     
    * Rename the variables in the varlist with last set of figures
    foreach ivar of varlist * {
        local varlbl : variable label `ivar'
        
        // string function based
        loc foo = word("`varlbl'", -1)
        loc foo = substr("`foo'", 1, 4)
        
        // regular expression based 
        loc bar = regexm("`varlbl'", "(.)([0-9][0-9][0-9][0-9])")
        loc bar = regexs(2)
        
        di "`foo'"
        di "`bar'" _n
    }
    Best
    Daniel


    • #3
      Daniel,

      Thank you very much for the valuable comment.
      Kind regards,
      Konrad
      Version: Stata/IC 13.1


      • #4
        Daniel,

        Thank you very much for the valuable comment. An additional problem has emerged. Along the lines of this old discussion on the Stata listserver, I'm trying to work with strings longer than 80 characters. I'm guessing that the "local macname = expression" syntax is the problem here. How can I work around it?
        (...) local w = "pop young old male MSA countyhospice pctblack pctrural
        pctold charscore ipdays opdays tot_hos oppaid totalpaid"

        tells Stata to evaluate right-hand side and store it in w. Stata's
        expression parser however is limited to handling strings of 80 characters
        in Intercooled Stata. This is why your string was truncated. You should use
        instead

        . local w "pop young old male MSA countyhospice pctblack pctrural
        pctold charscore ipdays opdays tot_hos oppaid totalpaid" (..)
        Kind regards,
        Konrad
        Version: Stata/IC 13.1


        • #5
          What version are you using that this bites? We are now on Stata 14 and versions in which this bites are receding into history. The FAQ Advice (soon to be updated) says to tell us the version you are using if it is not the current version.

          But it's multiply documented. Indeed the thread you cite says don't evaluate, just copy. Otherwise you are reporting a problem without showing us the exact code you used. After 411 posts you should know not to do that!
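
The difference between copying and evaluating can be sketched as follows. This is only an illustration: the macro names w_eval, len_copy and len_eval are made up here, and the variable list is the one from the quoted listserver message. In old versions the evaluated form is the one that could be truncated; in Stata 13 and later both forms should give the same result.

```stata
* Copy: everything after the macro name is stored as typed, no evaluation
local w "pop young old male MSA countyhospice pctblack pctrural pctold charscore ipdays opdays tot_hos oppaid totalpaid"

* Evaluate: the right-hand side goes through the expression parser first
local w_eval = "pop young old male MSA countyhospice pctblack pctrural pctold charscore ipdays opdays tot_hos oppaid totalpaid"

* Compare lengths with an extended macro function (no string expression involved)
local len_copy : length local w
local len_eval : length local w_eval
display "copied: `len_copy' characters, evaluated: `len_eval' characters"
```

If the two lengths differ, the evaluated copy was truncated, and the copying form without the equals sign is the one to use.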


          • #6
            I think you [that is: Konrad] are confusing things. Variable labels are (and continue to be) limited to 80 characters. This poses no problems for string expressions in any version of Stata. The limit for strings (string expressions) used to be 244 before Stata 13. It is now something around 2 billion.

            Can you be more specific about the problems you are facing? Can you even give an example that we can replicate? In general, there are two potential workarounds for problems with string functions:

            1. if an extended function exists, use the extended function instead
            2. use Mata
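
A minimal sketch of the first workaround, assuming the year is always the last word of the label: the extended macro functions "word count" and "word # of" pick out the last word without evaluating the whole label as a string expression. The label text and the macro names n, last and year are only illustrative.

```stata
* Hypothetical label, used only for illustration
local varlbl "Some nonsense and figure that I need 20072014"

* Extended macro functions: count the words, then pick the last one
local n : word count `varlbl'
local last : word `n' of `varlbl'

* The last word is short, so substr() on it is safe in any version
local year = substr("`last'", 1, 4)
display "`year'"
```

Only the short last word ever reaches the expression parser, so the length of the full label should not matter.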

            Best
            Daniel


            • #7
              I'm on Stata 13. With respect to the problem, if I run the code to get the years:
              Code:
              foreach ivar of varlist `r(varlist)' {
                  * Put variable label to macro and keep year
                  local varlbl : variable label `ivar'
                  * Regular expression based, from the Stata forum
                  loc bar = regexm("`varlbl'", "(.)([0-9][0-9][0-9][0-9])")
                  loc bar = regexs(2)
                  * Show the results.
                  di "`varlbl'"
                  di "`bar'"
                  rename `ivar' `ivar'_`bar'
              }
              I get the following error for some strings. I'm guessing that the problem is associated with long variable labels:
              Code:
              General Fund Net Revenue Expenditure, Roads and transport (£000s)_2013
              2013
              General Fund Net Revenue Expenditure, Environmental services (£000s)_2013
              2013
              invalid number, outside of allowed range
              General Fund Net Revenue Expenditure, Planning and economic development (£000s)_

              Edit:
              OK, I didn't notice that I cut some labels in a previous do-file. So it's just a matter of solving that.
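
For labels that contain no year at all (for example because they were cut earlier), one possible guard is to rename only when regexm() reports a match. This is a sketch on made-up data: the two variables and their labels are hypothetical, and the pattern simply takes the first run of four digits.

```stata
clear
set obs 1
gen messy_var_1 = 1
gen messy_var_2 = 2
label variable messy_var_1 "Some nonsense and figure that I need 2013.14"
label variable messy_var_2 "A label that was cut before the year"

foreach ivar of varlist messy_var_1 messy_var_2 {
    local varlbl : variable label `ivar'
    * regexm() returns 1 on a match; only then is regexs() called
    if regexm(`"`varlbl'"', "([0-9][0-9][0-9][0-9])") {
        local year = regexs(1)
        rename `ivar' `ivar'_`year'
    }
    else {
        display as error "no four-digit year in label of `ivar'; not renamed"
    }
}
```

The loop then carries on past a label without a year instead of stopping with an error, and the message lists the variables that still need attention.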
              Last edited by Konrad Zdeb; 09 Apr 2015, 04:09. Reason: Comments
              Kind regards,
              Konrad
              Version: Stata/IC 13.1


              • #8
                OK, I have one last question concerning this subject. I can access the strings I need via the char command. How can I pass the text output from char to a macro?
                Code:
                . char list Capital_Income3[original_text]
                  Capital_In~3[original_text]:
                                              Some text that can be later used when renaming_2013
                . local test : char list Capital_Income3[original_text]
                variable list not found
                r(111);
                In effect, if this works, the rest will work fine. It won't matter what is in the variable label.
                The problem is solved via:
                Code:
                local varlbl  "``ivar'[original_text]'"
                Apologies for the rather chaotic character of my last posts.
                Last edited by Konrad Zdeb; 09 Apr 2015, 04:38. Reason: Solution.
                Kind regards,
                Konrad
                Version: Stata/IC 13.1


                • #9
                  Note that

                  Code:
                  local varlbl : char `ivar'[original_text]
                  will also work.
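
A small round trip showing both forms side by side. The data set is made up for illustration, and the characteristic is defined here only so that the example is self-contained; the macro names via_ext and via_sub are hypothetical.

```stata
clear
set obs 1
gen Capital_Income3 = 1

* Hypothetical characteristic holding the untruncated text
char define Capital_Income3[original_text] "Some text that can be later used when renaming_2013"

local ivar Capital_Income3

* Extended macro function
local via_ext : char `ivar'[original_text]

* Macro substitution of the characteristic
local via_sub "``ivar'[original_text]'"

display "`via_ext'"
display "`via_sub'"
```

Both macros should hold the same text, so either form can feed the year extraction.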

                  Best
                  Daniel
