  • Labelling subset of variables based on prefix using varname / varlabel variables within the dataset - RUN WON'T COMPLETE

    Hi, I am trying to label a large number of variables using variable names and labels which are present as variables within the dataset (merged from a different dta). I am labelling different subsets of the dataset at a time based on prefixes contained within the variable names. However, when I run the code, it does not complete - there is no error, it just never stops showing the red cross in the command screen. I have tried leaving it for a long time.

    I was hoping someone could help me in either identifying the issue in my code, or suggesting a less computationally intensive way of doing this. The code ran and labelled variables before I added the code relating to filtering by prefix.

    EXAMPLE DATASET
    Code:
    clear
    input e_var1 e_var2 e_var3 c_var1 c_var2 c_var4 e_var5 str10 varname_EVAR str10 varlabel_EVAR str10 varname_CVAR str10 varlabel_CVAR
    1 1 1 1 1 1 1 var1 evar1lab var1 cvar1lab
    1 1 1 1 1 1 1 var2 evar2lab var2 cvar2lab
    1 1 1 1 1 1 1 var3 evar3lab var4 cvar4lab
    1 1 1 1 1 1 1 var5 evar5lab
    1 1 1 1 1 1 1
    end
    MY CODE:
    Code:
    foreach study in "e" "c" {
        local studyprefix = "`study'"
         local studyid = cond("`studyprefix'" == "e", "EVAR", cond("`studyprefix'" == "c", "CVAR", "inhouse")) // studyid local to link to study prefix (ignore inhouse, relevant for real dataset)
    
        ** label var
        local i 1 // i is obs number
        ds `study'*
        local varlist_`studyid' `r(varlist)'
        while !missing("=varname_`studyid'[`i']") {    
            foreach var of varlist `varlist_`studyid'' {
                local variablename = "`var'"
                local variablename_nopre = regexr("`variablename'",".*_","")
                if varname_`studyid'[`i'] == "`variablename_nopre'" {
                    lab var `var' "=varlabel_`studyid'[`i']"
                }
            }
            local ++i
        }
    }
    I would be extremely grateful for your help! Thanks.
    Last edited by Jack Treliving; 10 Feb 2025, 05:28.

  • #2
    I can't easily follow what you are trying to do here.

    Code:
    local studyprefix = "`study'"
    local studyid = cond("`studyprefix'" == "e", "EVAR", cond("`studyprefix'" == "c", "CVAR", "inhouse"))
    seems to boil down to

    Code:
    local studyid = upper("`study'") + "VAR"
    More crucially, your references to
    Code:
      
     "=varlabel_`studyid'[`i']"
    should more likely be to
    Code:
    "`=varlabel_`studyid'[`i']'"
    The logic of the while loop escapes me, however. varlabel_`studyid'[`i'] is or should be the value of a variable in the current observation. If it's missing when the loop is entered, when would that change inside the loop? Perhaps that is just another way to state the very problem you're reporting.



    • #3
      I suggest backing up, showing us through an example like #1, what you have, what you want, and the rules connecting the two.

      The flavour that I am getting is that

      1. You have done a fair amount of programming in other languages, not so much in Stata. It can be good, also distracting, to know about programming in general but not so much about programming in Stata. Often simple tasks in Stata boil down to a few basic commands, not so much loops written ab initio.

      2. This may be something like a slightly non-standard reshape. Metadata and data together can be confusing.
      Last edited by Nick Cox; 10 Feb 2025, 07:37.



      • #4
        I think your problem is incorrect code for macro expansion, as suggested in #2. Just add the enclosing macro quotes (` and ') in two places in your code:
        Code:
            while !missing("`=varname_`studyid'[`i']'") {
        and
        Code:
                        lab var `var' "`=varlabel_`studyid'[`i']'"
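
Why the missing macro quotes keep the loop from ending can be checked directly. The following is a minimal sketch, assuming a toy dataset with a single string variable called varname and two observations (illustrative data, not the dataset from #1). Without the macro quotes, missing() is handed a literal piece of text, which is never empty, so the while condition can never become false.

```stata
* Toy data for illustration only: two observations of a string variable
clear
input str10 varname
"var1"
"var2"
end

local i 1

* No macro quotes: the argument is the literal text =varname[1], never empty
local literal = missing("=varname[`i']")

* With macro quotes: varname[1] is evaluated first and its value substituted
local inrange = missing("`=varname[`i']'")

* Past the last observation the subscripted value is empty, so the test is true
local pastend = missing("`=varname[3]'")

display "literal: `literal'  in range: `inrange'  past end: `pastend'"
```

The display line should show 0, 0 and 1 respectively: only the quoted form ever returns 1, which is what lets the loop stop once i runs past the filled observations.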



        • #5
          Hi Nick, thanks so much for your reply.

          In response to your suggestions:

          1. "VAR" suffix

          Code:
           
           local studyid = upper("`study'") + "VAR"
          The "VAR" suffix was actually a simplification of the real suffixes, which are different for each group - an unhelpful oversimplification on my part! I appreciate your attempts to make the code more concise though!

          2. Missing `'

          Code:
           
           "`=varlabel_`studyid'[`i']'"
          Thank you for spotting that mistake! I had included them in a previous version of the code, which also faced the same issue of endless running, unfortunately.

          3. while loop
          With the while loop, I am attempting to cycle through each value of the variable varname_`studyid' and check it against the name (prefix removed) of each variable within the study, as filtered using the study prefix (from the study-specific varlists). If they match, I then label that variable with the corresponding variable label, taken from the same observation of the variable varlabel_`studyid'.

          Below is a working example where study specificity is removed (i.e. one set of varnames/varlabels to label the entire dataset). It's worth noting, I need the study specificity in my labelling as different studies use the same variable names for different definitions, so to use one universal data dictionary (or varname/varlabel list) to label would lead to mislabelling:

          Code:
          clear
          input e_var1 e_var2 e_var3 e_var4 e_var5 str10 varname str10 varlabel
          1 1 1 1 1 var2 var2lab
          1 1 1 1 1 var1 var1lab
          1 1 1 1 1 var3 var3lab
          1 1 1 1 1 var5 var5lab
          1 1 1 1 1 var4 var4lab
          end
          
          local i 1 // i is obs number
          while !missing("`=varname[`i']'") {
              foreach var of varlist * {
                  local variablename = "`var'"
                  local variablename_nopre = regexr("`variablename'",".*_","")
                  if varname[`i'] == "`variablename_nopre'" ///
                      lab var `variablename' "`=varlabel[`i']'"
              }
              local ++i
          }
          Thank you again for your help!



          • #6
            Thank you Hemanshu, I have fixed the macro expansion but the issue still remains. I appreciate your help though, thank you!



            • #7
              That's strange. After the edits in #4, your code runs fine on my machine. What issue are you facing now?

              Code:
              clear
              input e_var1 e_var2 e_var3 c_var1 c_var2 c_var4 e_var5 str10 varname_EVAR str10 varlabel_EVAR str10 varname_CVAR str10 varlabel_CVAR
              1 1 1 1 1 1 1 var1 evar1lab var1 cvar1lab
              1 1 1 1 1 1 1 var2 evar2lab var2 cvar2lab
              1 1 1 1 1 1 1 var3 evar3lab var4 cvar4lab
              1 1 1 1 1 1 1 var5 evar5lab
              1 1 1 1 1 1 1
              end
              
              foreach study in "e" "c" {
                  local studyprefix = "`study'"
                   local studyid = cond("`studyprefix'" == "e", "EVAR", cond("`studyprefix'" == "c", "CVAR", "inhouse")) // studyid local to link to study prefix (ignore inhouse, relevant for real dataset)
              
                  ** label var
                  local i 1 // i is obs number
                  ds `study'*
                  local varlist_`studyid' `r(varlist)'
                  while !missing("`=varname_`studyid'[`i']'") {    
                      foreach var of varlist `varlist_`studyid'' {
                          local variablename = "`var'"
                          local variablename_nopre = regexr("`variablename'",".*_","")
                          if varname_`studyid'[`i'] == "`variablename_nopre'" {
                              lab var `var' "`=varlabel_`studyid'[`i']'"
                          }
                      }
                      local ++i
                  }
              }
              
              . d
              
              Contains data
               Observations:             5                  
                  Variables:            11                  
              ----------------------------------------------------------------------------------------------------------------------------------
              Variable      Storage   Display    Value
                  name         type    format    label      Variable label
              ----------------------------------------------------------------------------------------------------------------------------------
              e_var1          float   %9.0g                 evar1lab
              e_var2          float   %9.0g                 evar2lab
              e_var3          float   %9.0g                 evar3lab
              c_var1          float   %9.0g                 cvar1lab
              c_var2          float   %9.0g                 cvar2lab
              c_var4          float   %9.0g                 cvar4lab
              e_var5          float   %9.0g                 evar5lab
              varname_EVAR    str10   %10s                  
              varlabel_EVAR   str10   %10s                  
              varname_CVAR    str10   %10s                  
              varlabel_CVAR   str10   %10s                  
              ----------------------------------------------------------------------------------------------------------------------------------
              Sorted by: 
                   Note: Dataset has changed since last saved.
              Last edited by Hemanshu Kumar; 10 Feb 2025, 11:15.



              • #8
                Oh yes, you are correct - I have tried again and it's working now! I must have been mistaken. Thank you so much for all your help, I'm glad it was a simple fix! Thank you both
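
On the request in #1 for a less computationally intensive approach, one possibility is to skip the inner loop over variables and build each target variable name directly from the prefix and the stored name. This is only a sketch under stated assumptions: it uses a cut-down version of the example data, it borrows the upper() shortcut from #2 for studyid (which #5 says does not hold for the real suffixes, so the cond() mapping from #1 would replace it there), and it assumes every variable is named as prefix, underscore, stored name.

```stata
* Cut-down illustrative data in the same layout as the example in #1
clear
input e_var1 e_var2 c_var1 str10 varname_EVAR str10 varlabel_EVAR str10 varname_CVAR str10 varlabel_CVAR
1 1 1 "var1" "evar1lab" "var1" "cvar1lab"
1 1 1 "var2" "evar2lab" ""     ""
end

foreach study in e c {
    * Assumption: studyid is the upper-case prefix plus VAR, as in the toy example
    local studyid = upper("`study'") + "VAR"

    * One pass over the observations, no inner loop over variables
    forvalues i = 1/`=_N' {
        local name = varname_`studyid'[`i']
        * Skip observations that hold no variable name
        if "`name'" == "" continue
        * Label only if the prefixed variable actually exists
        capture confirm variable `study'_`name', exact
        if _rc == 0 {
            label variable `study'_`name' "`=varlabel_`studyid'[`i']'"
        }
    }
}

describe
```

Unlike the while loop, this version does not stop at the first empty varname, so gaps in the list of names are skipped rather than ending the labelling early.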
