  • loop in parallel with tokenize, comma as delimiter

    I'm currently using Stata for data management. Inspired by Nick Cox's (2021) paper "Speaking Stata: Loops in parallel", I'm experimenting with -tokenize-.

    The first chunk of code below seems to work as expected. But I thought I would be able to ask -tokenize- to separate "tokens" by comma as a delimiter.
    (The reason for not simply using single spaces between each word/token is that I want the words in the local macros to line up, as shown below, making it easy to check that my code is correct. Also, a clarification: I intentionally avoid retaining the original variable names, as I never modify the original data. So I use -rename- rather than -clonevariable-.)


    Code:
    // Define local macros of item names, one for Norwegian and one for English
    local nor_items "alene, gange, hjelp_1, hjelp_2, hjelp_3, hjelp_4, hjelp_5"
    local eng_items "alone, dist,  Nclean,  Ngroc,   Nrep,    Ndigi,   Nbed"
    
    foreach parent in mo fa {
        // Loop through the items
        while "`nor_items'" != "" & "`eng_items'" != "" {
            // Get the current token from each macro
            gettoken nor_item nor_items: nor_items
            gettoken eng_item eng_items: eng_items
    
            // Remove the comma from the current tokens
            local nor_item = subinstr("`nor_item'", ",", "", 1)
            local eng_item = subinstr("`eng_item'", ",", "", 1)
    
            // Rename variables
            fre `parent'r`nor_item'
            rename `parent'r`nor_item' `parent'_`eng_item' 
            
            // Recode missing values
            recode `parent'_`eng_item' (9996/9999 = .)
        }
    }
    I'm looking at the manual entry for -gettoken- (https://www.stata.com/manuals/pgettoken.pdf), and I do find the parse() option.

    But this code does not work:

    Code:
    // Define local macros of item names, one for Norwegian and one for English
    local nor_items "alene, gange, hjelp_1, hjelp_2, hjelp_3, hjelp_4, hjelp_5"
    local eng_items "alone, dist,  Nclean,  Ngroc,   Nrep,    Ndigi,   Nbed"
    
    foreach parent in mo fa {
        // Loop through the items
        // TRYING TO AVOID THIS PART while "`nor_items'" != "" & "`eng_items'" != "" {
            // Get the current token from each macro
            gettoken nor_item nor_items: nor_items, parse(",")
            gettoken eng_item eng_items: eng_items, parse(",")
    
            // TRYING TO AVOID THIS PART Remove the comma from the current tokens
            // local nor_item = subinstr("`nor_item'", ",", "", 1)
            // local eng_item = subinstr("`eng_item'", ",", "", 1)
    
            // Rename variables
            fre `parent'r`nor_item'
            rename `parent'r`nor_item' `parent'_`eng_item' 
            
            // Recode missing values
            recode `parent'_`eng_item' (9996/9999 = .)
        }
    }
    Stata responds:

    Code:
    variable far not found
    I'd prefer Stata for more complex data management, given the simplicity of its code for this task. But my knowledge of Stata code is quite limited.
    Yes, the first code works. But can the code be simplified, in the interest of readers when the code is published? Any help would be appreciated.

    Even though Stata's error message does refer to "far", I suspect the challenge lies in my use of either one OR several spaces after the comma to separate words in the local macro.
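
A minimal sketch of what -gettoken- hands back when parse(",") is specified may help in reading the error above. As the manual entry describes it, a parse character is returned as a token in its own right, so a second call on the same list yields the comma rather than the next item. The macro names items, first, and second are made up for this illustration.

```stata
// Illustration only: items, first, and second are hypothetical names
local items "alene, gange, hjelp_1"

// First call: everything up to the first comma
gettoken first items : items, parse(",")

// Second call: the parse character itself comes back as the token
gettoken second items : items, parse(",")

display "first:  [`first']"
display "second: [`second']"
```

If that reading is right, then once the -while- line is commented out, the second pass through -foreach- picks up the comma, and `parent'r`nor_item' would expand to far followed by a comma. That is consistent with the error message naming far rather than a full variable name.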

  • #2
    I’m not sure if this is the solution but I don’t think that you need to use the quotes around the contents of the local when you originally define them. That might allow you to have the spacing that you want without having to worry about the spacing as the contents of the local are used.



    • #3
      One error seems to be that I need to reset a local macro after a loop iteration. Like this:

      Code:
      local new_names `eng_items'



      • #4
        Thanks, Lance Erickson. Indeed, a local macro does not require quotes.
        But I believe I need commas (or some other character) to separate tokens. Unfortunately, running the code without quotes and without commas does not work since tokenize reads a space as a delimiter and two spaces as two delimiters.



        • #5
          Second time around the loop -- for the second variable -- don't you need to reset the two macros with lists of items?
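
The point about resetting can be sketched as follows: -gettoken- writes the remainder of the list back into the macro it reads from, so a list consumed in a -while- loop is empty afterwards and has to be copied afresh from a master list before each pass. The names items and work are placeholders for this illustration, and the list is shortened.

```stata
// Illustration only: items is the master list, work is the copy that gets consumed
local items "alene gange hjelp_1"
local work `items'

while "`work'" != "" {
    // Each call strips one token off the front of work
    gettoken item work : work
}

display "left after the loop: [`work']"
display "master copy:         [`items']"
```

Copying the master list into the working macro inside the outer loop, before the -while-, gives each parent prefix the full list again.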



          • #6
            Nick Cox Indeed, see #3.

            But that didn't help; Stata still stopped after the first item/token. I've received some help and now the code seems to work, looking like this:

            Code:
            local old_names "alene, gange, hjelp_1, hjelp_2, hjelp_3, hjelp_4, hjelp_5"
            local new_names "alone, dist,  Nclean,  Ngroc,   Nrep,    Ndigi,   Nbed"
            
            foreach parent in mo fa {
                local nor_items `old_names'
                local eng_items `new_names'
                
                // Loop through the items
                while "`nor_items'" != "" {
                    gettoken nor_item nor_items: nor_items, parse(",")
                    gettoken eng_item eng_items: eng_items, parse(",")
            
                    // Rename variables
                    local old_varname = "`parent'r`nor_item'"
                    local new_varname = "`parent'_`eng_item'"
                    
                    capture confirm variable `old_varname'
                    if _rc == 0 {
                        fre `old_varname'
                        rename `old_varname' `new_varname'
                        
                        // Recode missing values
                        recode `new_varname' (9996/9999 = .)
                        fre `new_varname'
                    }
                }
            }
            Not exactly like I planned, but much more intuitive than the original working code.
            Thanks for responses!



            • #7
              Could you give a data example please?



              • #8
                Of course, Nick! [I'm not aware of any Stata command that resembles R's dput(), but I now realised that subsetting is easy with _n.]
                Attached are data for the following variables:
                moralene morgange morhjelp_1 morhjelp_2 faralene fargange farhjelp_1 farhjelp_2 (8 variables, 10 observations).
                Attached Files



                • #9
                  Please use dataex.



                  • #10
                    Data set generated by dataex.
                    Attached Files



                    • #11
                      Thanks for posting data, but some misunderstanding here. dataex yields input code you can copy and paste into this software, as below. See FAQ Advice #12.

                      On your main problem, I note that the items are all to be suffixes in variable names, so they cannot contain spaces; comma separation is therefore not needed and is indeed a distraction. There is also some indirection in using more local macros than are needed. The code below seems to work, but using gettoken and/or tokenize could work too.

                      Using fre (which must be installed from SSC) is also, I guess, a side issue here for anyone interested in running the code, so here I use tabulate.

                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input int(moralene morgange morhjelp_1 morhjelp_2 faralene fargange farhjelp_1 farhjelp_2)
                      1 9996 9996 9996 1 9996 9996 9996
                      1    1 9996 9996 1    1    1    3
                      3    5    1    1 3    5    1    1
                      2    2    1    1 2    2    5    5
                      2    2    1    1 2    2    1    1
                      2    2    1    1 2    2    1    1
                      3    3    1    1 .    .    .    .
                      1    2    1    1 1    2    2    1
                      2    4    1    1 2    4    1    1
                      2    4    1    1 2    4    1    1
                      end
                      label values moralene labels4
                      label def labels4 1 "Alene", modify
                      label def labels4 2 "Sammen med min far", modify
                      label def labels4 3 "Sammen med en annen ektefelle/samboer enn min far", modify
                      label values morgange labels6
                      label def labels6 1 "Gangavstand", modify
                      label def labels6 2 "Opptil 1 time med kjøring", modify
                      label def labels6 3 "Fra 1-2 timer med kjøring", modify
                      label def labels6 4 "Fra 2-4 timer med kjøring", modify
                      label def labels6 5 "Over 4 timer med kjøring", modify
                      label def labels6 9996 "na", modify
                      label values morhjelp_1 labels7
                      label def labels7 1 "Trenger ikke hjelp", modify
                      label def labels7 9996 "na", modify
                      label values morhjelp_2 labels8
                      label def labels8 1 "Trenger ikke hjelp", modify
                      label def labels8 9996 "na", modify
                      label values faralene labels113
                      label def labels113 1 "Alene", modify
                      label def labels113 2 "Sammen med min mor", modify
                      label def labels113 3 "Sammen med en annen ektefelle/samboer enn min mor", modify
                      label values fargange labels115
                      label def labels115 1 "Gangavstand", modify
                      label def labels115 2 "Opptil 1 time med kjøring", modify
                      label def labels115 4 "Fra 2-4 timer med kjøring", modify
                      label def labels115 5 "Over 4 timer med kjøring", modify
                      label def labels115 9996 "na", modify
                      label values farhjelp_1 labels116
                      label def labels116 1 "Trenger ikke hjelp", modify
                      label def labels116 2 "2", modify
                      label def labels116 5 "Trenger svært mye hjelp", modify
                      label def labels116 9996 "na", modify
                      label values farhjelp_2 labels117
                      label def labels117 1 "Trenger ikke hjelp", modify
                      label def labels117 3 "3", modify
                      label def labels117 5 "Trenger svært mye hjelp", modify
                      label def labels117 9996 "na", modify
                      
                      local old_names "alene, gange, hjelp_1, hjelp_2, hjelp_3, hjelp_4, hjelp_5"
                      local new_names "alone, dist,  Nclean,  Ngroc,   Nrep,    Ndigi,   Nbed"
                      
                      * start here 
                      local nor_items : subinstr local old_names "," " ", all 
                      local eng_items : subinstr local new_names "," " ", all
                      
                      foreach parent in mo fa {
                          
                          local j = 1 
                          foreach nor of local nor_items { 
                              
                              // Rename variables
                              capture confirm variable `parent'r`nor' 
                              if _rc == 0 {
                                   
                                  local eng : word `j' of `eng_items'  
                                  rename `parent'r`nor' `parent'r`eng'
                                  
                                  // Recode missing values
                                  recode `parent'r`eng' (9996/9999 = .)
                                  tab `parent'r`eng'
                                  
                              }
                              
                              local ++j 
                          }
                      }
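
As a variation on the code above, here is a minimal sketch that drops the commas altogether and still keeps the two lists visually aligned. It relies on the word-based macro functions treating a run of blanks as a single separator, so padding with extra spaces should be harmless. The three-variable toy data set is invented for this illustration, and the new names follow the `parent'_`eng' pattern from post #1 rather than the pattern used above.

```stata
// Illustration only: a toy data set with invented values
clear
input int(moralene morgange morhjelp_1 faralene fargange farhjelp_1)
1 9996 9996 1 9996 9996
2    2    1 2    2    5
end

// Aligned lists, padded with blanks instead of commas
local nor_items "alene  gange  hjelp_1"
local eng_items "alone  dist   Nclean"

local n : word count `nor_items'

foreach parent in mo fa {
    forvalues j = 1/`n' {
        // Pick the j-th word from each list in parallel
        local nor : word `j' of `nor_items'
        local eng : word `j' of `eng_items'

        rename `parent'r`nor' `parent'_`eng'
        recode `parent'_`eng' (9996/9999 = .)
    }
}

describe
```

Because the lists are only read by position and never consumed, nothing needs resetting between the two passes of the outer loop.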



                      • #12
                        Thanks a lot!

