Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Adding double quotation marks to some of the variables

    Hi, this sounds like an intuitively simple task, while I got stuck.

    The goal is to add double quotation marks to some of the variables in the dataset with an excluding list.

    My code:

    Code:
    // Retrieve all variable list
    unab allvars : _all
    
    // Turn all variables into string
    foreach var of local allvars {
        capture confirm variable `var', string
        if _rc {
            tostring `var', replace
        }
    }
    
    // Trim all string variables
    foreach var of local allvars {
        replace `var' = trim(`var')
    }
    
    replace dob = "0" + dob if length(dob) == 7
    replace commencementdate = "0" + commencementdate if length(commencementdate) == 7
    replace cessationdate = "0" + cessationdate if length(cessationdate) == 7
    
    
    // Exclude the following variables (no quotation marks to be added)
    local exclude "episodeid dob dobstatus commencementdate cessationdate"
    
    
    foreach var of local allvars {
      if "`exclude'" != "`varname'" {
        gen temp_var = `"""' + `var' + `"""'
        replace `var' = temp_var if temp_var != "" 
        drop temp_var
      }
    }
    However, all variables had quotation marks added. It seems the "local exclude..." doesn't work.

    Sample data:

    forum test.dta

    Thank you.
    Attached Files
    Last edited by Shen YANG; 30 Oct 2023, 00:57.

  • #2
    I also tried to change:
    if "`exclude'" != "`varname'"
    to
    if "`exclude'" !: "`varname'"
    but got "type mismatch" error.

    Comment


    • #3
      Do not include attachments. Always use dataex to present data examples (see FAQ Advice #12). You cannot add double quotes to numerical variables - and that most likely explains the "type mismatch" error. One way to ignore numerical variables without explicitly spelling them out is to use capture. See

      Code:
      help capture
      But ds can select only string variables:

      Code:
      ds, has(type string)
      display "`r(varlist)'"
      Having said that, here is some technique:

      Code:
      sysuse auto, clear
      keep make mpg gear rep78 length
      foreach var in mpg gear{
          tostring `var', replace force
      }
      
      local toexclude mpg rep78
      
      foreach var of varlist*{
          if !ustrregexm("`toexclude'", "(\b`var'\b)"){
              cap replace `var'= `"""'+ `var'+`"""'
          }
      }
      Res.:

      Code:
       l in 1/20, sep(0)
      
           +------------------------------------------------------------+
           | make                  mpg   rep78   length      gear_ratio |
           |------------------------------------------------------------|
        1. | "AMC Concord"          22       3      186   "3.579999924" |
        2. | "AMC Pacer"            17       3      173   "2.529999971" |
        3. | "AMC Spirit"           22       .      168   "3.079999924" |
        4. | "Buick Century"        20       3      196   "2.930000067" |
        5. | "Buick Electra"        15       4      222   "2.410000086" |
        6. | "Buick LeSabre"        18       3      218   "2.730000019" |
        7. | "Buick Opel"           26       .      170   "2.869999886" |
        8. | "Buick Regal"          20       3      200   "2.930000067" |
        9. | "Buick Riviera"        16       3      207   "2.930000067" |
       10. | "Buick Skylark"        19       3      200   "3.079999924" |
       11. | "Cad. Deville"         14       3      221   "2.279999971" |
       12. | "Cad. Eldorado"        14       2      204   "2.190000057" |
       13. | "Cad. Seville"         21       3      204    "2.24000001" |
       14. | "Chev. Chevette"       29       3      163   "2.930000067" |
       15. | "Chev. Impala"         16       4      212   "2.559999943" |
       16. | "Chev. Malibu"         22       3      193   "2.730000019" |
       17. | "Chev. Monte Carlo"    22       2      200   "2.730000019" |
       18. | "Chev. Monza"          24       2      179   "2.730000019" |
       19. | "Chev. Nova"           19       3      197   "2.559999943" |
       20. | "Dodge Colt"           30       5      163   "3.539999962" |
           +------------------------------------------------------------+

      Comment


      • #4
        Originally posted by Shen YANG View Post
        The goal is to add double quotation marks to some of the variables in the dataset with an excluding list.
        . . . However, all variables had quotation marks added. It seems the "local exclude..." doesn't work.
        Use the macrolist in syntax.
        Code:
        help macrolist
        for more information.

        I illustrate its use using your attached dataset (downloaded and placed in the working directory), and I include all of your other data management tasks on your dataset that you show in your example code.
        Code:
        version 18.0
        
        clear *
        
        use "forum test"
        
        *
        * Begin here
        *
        quietly tostring _all, replace // automatically skips strings
        
        foreach var of varlist dob commencementdate cessationdate {
            quietly replace `var' = "0" if ustrlen(`var') == 7
        }
        
        local exclude episodeid dob dobstatus commencementdate cessationdate
        
        foreach var of varlist _all {
        
            quietly replace `var' = ustrtrim(`var')
        
            local skip : list var in exclude
            if `skip' continue
        
            quietly replace `var' = uchar(34) + `var' + uchar(34) if !mi(`var')
        }
        
        exit
        Notes:

        1. You don't need to get a list of string variables; tostring automatically skips converting the variable if it's already a string.

        2. You don't need to enclose your exclude list in double quotation marks.

        3. If you're working with Unicode strings in your real dataset, then you'll be better off using the Unicode versions of the string functions where possible.

        Comment


        • #5
          Ah, it would've been better to have put the trimming step up front so that the zero-replacement step would act on trimmed dates of birth etc.
          Code:
          version 18.0
          
          clear *
          
          use "forum test"
          
          *
          * Begin here
          *
          quietly tostring _all, replace // automatically skips strings
          
          // First
          foreach var of varlist _all {
              quietly replace `var' = ustrtrim(`var')
          }
          
          // And then
          foreach var of varlist dob commencementdate cessationdate {
              quietly replace `var' = "0" if ustrlen(`var') == 7
          }
          
          local exclude episodeid dob dobstatus commencementdate cessationdate
          
          foreach var of varlist _all {
          
              // quietly replace `var' = ustrtrim(`var')
          
              local skip : list var in exclude
              if `skip' continue
          
              quietly replace `var' = uchar(34) + `var' + uchar(34) if !mi(`var')
          }
          
          exit

          Comment


          • #6
            Much appreciated the insightful replies by Andrew Musau and Joseph Coveney. Cheers!

            Comment

            Working...
            X