Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Import Excel files with different names in a loop across different subfolders

    Hello everyone,

    The problem I have is quite common. I would like to import a number of Excel files from different subfolders of a specific directory, but only those that pertain to a specific city. The issue is that the naming of the files is not consistent and I am unable to edit them. For example, in one folder, a city may be named "1234NewYork.xlsx" and in another "3112NY.xlsx" or "New_York.xlsx."

    I am unable to find a systematic way to fix this without editing the files, and since I would like to loop my code over different cities, I believe I will have to use macros. I am wondering how I can define a set of possible keywords for each city in my folder and group them together in a way that Stata understands that a certain subset of these keywords belong to one city, so the files containing them should be appended.

    Something like this:

    local ny1 "New York"
    local ny2 "New_York"
    local ny3 "NY"
    local ny4 "N_Y"
    ...
    local la1 "Los Angeles"
    local la2 "LA"
    ...

    My goal is to have a number of appended files that corresponds to the number of cities. But I'm not really sure how I can make this work for each "group" of keywords. Please note that each city does not necessarily have the same number of files to be appended

    Can someone help me? Any advice would be greatly appreciated!

  • #2
    Show us your looping code when the names are consistent so that we can work out how to modify it to meet your needs.

    Comment


    • #3
      fs from SSC is a wrapper for the corresponding extended macro function, which sounds trivial -- but it will support multiple arguments. So it could help here. As always, avoiding false omissions and inclusions can still be tricky.


      Code:
      -----------------------------------------------------------------------------------------
      help for fs
      -----------------------------------------------------------------------------------------
      
      Show names of files in compact form
      
              fs [filespec [filespec [ ... ]]]
      
      
      Description
      
          fs lists the names of files in compact form.  Note that files do not include
          folders or directories.
      
          With no arguments, fs lists names of files in the current folder or directory.
          Otherwise, it follows file specifications.  A file specification filespec is a
          filename or (more commonly) a set of filenames indicated by the use of wildcards *
          or ?. A folder or directory prefix should be given for files outside the current
          directory.
      
          See help on dir or ls for listing the names of files with more detailed
          information. Unlike either of those commands, fs may be given more than one file
          specification. Also unlike either of those commands, fs shows any hidden files.
      
          The list of filenames is returned in r(files), so long as that is not empty.
      
      
      Remarks 
      
          It is recommended that even if you use Windows, you use forward slashes / rather
          than backward slashes \.  Stata will understand and there will then be no clash
          with other meanings for the backward slash.
      
      
      Examples
      
          . fs
      
          . fs *.dta
      
          . fs *.ado *.hlp
      
          . fs /ado/plus/f/*.ado
      
          . fs *.gph
          . foreach f in `r(files)' {
          .         graph use `f'
          .         more
          . }

      Comment


      • #4
        Thank you both for the answers. Unfortunately, I could not achieve what I wanted even with the -fs- command.

        Ideally, if the names were consistent, I would have a code like this:

        Code:
        local citieslist city_1 city_2 city_3 city_4
        local v1 "`path'/wave_1"
        local v2 "`path'/wave_2"
        local v3 "`path'/wave_3"
        
        forvalues x = 1/3 {
        foreach city of local citieslist {
        cd "`v`x''"
        fs *`city'*.xlsx
        foreach f in `r(files)' {
        import excel `f', clear allstring
        tempfile `city'_`x'
        save ``city'_`x''
        }
        }
        }
        The code has not been tested, this is just out of my imagination as I can't have my laptop currently. But the problem is that I can't create the local citieslist because the files in my different wave_x folders aren't organized in a consistent way: There could be CITY_1 in wave_1, 1234City_1 in wave_2, etc.

        Edit : My intuition tells me I could work with nested locals? But I'm not really sure I understand how this works. For instance, create a local called city_1 with all the possible stubs of city_1 that could be in my 3 different wave folders (e.g., for New York, it could be NY, New_York, New York...), and then add this local within the citieslist local. Or maybe I could enter the names of all the possible file names within a local?

        Last edited by Julia Simon; 13 Jan 2023, 02:20.

        Comment

        Working...
        X