Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Appending Files in a Folder for Multiple Folders

    Hello All,

    I have multiple folders, each of which I would like to separately loop over and append all files in a given folder. I am trying to make a for loop to loop over all the folders, with a nested for loop that does the appending. Note that the data I am using is confidential so I have just put placeholders for variable and file names. I would like to use the actual folder names to save the appended files.

    Code:
    global input "F:\input\"
    global output "F:\output\"
    global temp "F:\temp\"
    
    local folder_names `" "Folder_1" "Folder_2" "'
    
    foreach folder of local folder_names{
        tempfile building
        di "$input""`folder'"
        cd "$input""`folder'"
        local filelist: dir "." files "*.csv"
        foreach file of local filelist{
            append using `building'
            save "$temp"`"`building'"',replace
        }
    use "$temp"`building', clear
    save "$output""`folder'"
    }
    I am sure there are other errors in this code that I haven't reached yet, but I am having the simplest issue that I do not know how to resolve. My display command in the second line of my first for loop displays the correct path, but when I use
    Code:
    cd "$input""`folder'"
    I get an "invalid syntax" error, and I am not entirely sure why.

    Also, just as a reference, I have used these other two posts to construct this code:

    https://www.statalist.org/forums/for...umber-of-files

    https://www.statalist.org/forums/for...path-file-name







  • #2
    Code:
    global input "F:\input" //Removed backslash, which escapes the local macro reference in the cd command
    global output "F:\output" //Removed backslash, which escapes the local macro reference in the cd command
    
    local folder_names `""Folder_1" "Folder_2""'
    
    foreach folder of local folder_names{
        clear
        cd "${input}/`folder'" //Added a forward slash
        local filelist: dir "." files "*.csv"
        save "${output}/`folder'", replace emptyok
        foreach file of local filelist{
            import delimited "`file'",clear
            append using "${output}/`folder'"
            save "${output}/`folder'", replace
        }
    }
    You may need to add options the import command to fit your data, i.e. dealing with variable names, row numbers, etc.

    Also see: https://www.stata.com/support/faqs/p...es-and-macros/ for more information on backslashes/forward slashes and escaping macro references.
    Last edited by Ali Atia; 08 Jul 2021, 15:15.

    Comment


    • #3
      Well, I'm not exactly certain what the source of your difficulty is, but I think that you are getting tripped up by the \` escape sequence that results when the \ at the end of your paths get juxtaposed with the ` at the beginning of your file local macro. To simplify matters and avoid unnecessarily cumbersome and complicated syntax, I suggest several changes.

      1. Use local macros not globals. Global macros are unsafe and should never be used except when there is no alternative. In over 25 years of Stata programming, I have only been forced to use a global macro once. They are rarely needed and should be avoided like the plague.

      2. Use /, not \ as your path separator. Even though Windows uses \ as the separator, Stata will automatically translate / back to \ for you when it passes file handling requests to Windows. Not only does this avoid the problem with the \` sequence that arises when Windows paths and local macro filenames are combined, it also makes your code portable to other OS's without modification of paths.

      Code:
      local input "F:/input/"
      local output "F:/output/"
      
      local folder_names `" "Folder_1" "Folder_2" "'
      
      foreach folder of local folder_names{
          tempfile building
          clear
          save `building', emptyok
          cd `input'`folder'
          local filelist: dir "." files "*.csv"
          foreach file of local filelist{
              import delimited using `file', clear
              append using `building'
              save `"`building'"', replace
          }
      save `output'`folder'"
      erase `building'
      }
      Note: not tested. Beware of typos, etc.

      Finally, I can't let this one go without a general warning about bulk appending. It is pretty unusual for a large number of files to be fully compatible in the way they use data storage types, variable names, and value labeling. Even data sets from high-quality data vendors who carefully curate their data often contain inconsistencies--and when you append inconsistent data sets together, at best, you get frustrated with tons of error messages, and, at worst you lose information, and sometimes end up with garbled information. I strongly recommend that you separately import each .csv file to .dta first, and then go back and harmonize those .dta sets before trying to put them together. Mark Chatfield's -precombine- program, available from Stata Journal (net describe dm0081, from(http://www.stata-journal.com/software/sj15-3)) is a very helpful tool to screen the data sets for compatibility before combining them. It will alert you to all of the common types of inconsistencies that arise in bulk data, and you can then fix those ahead of combining them.

      Added: Crossed with #2

      Comment

      Working...
      X