Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Getting a substring from a string in a list of file names

    Hello,

    After browsing the forum, Stata help and the internet I still haven't found how to solve my problem, if anyone has an idea of what can be wrong in my code I will be extremely grateful and willing to thank with beers the one that will save my mental health . So my problem is the following, I have a folder with a number of files with a name pattern as DB_?.xls, I populate a local macro (called dtfl) with these names and then I loop over the files to do different operations on data. The character in "?" is the name of the data set and is also present in the ID of each observation so I need to get that character from the file name. I have tried to extract from the local the character in "?" at each loop with substr and I either get an 198 error (operator invalid) of if I try putting double quotes I get a 101 error, it seems that what works for single strings does not for lists of strings?. Here is the code:

    Code:
    local dtfl : dir . files "DB_?.xls"
        foreach d in `dtfl' {
            clear
            cd "C:/TempStata"
            di "`d'"
            local dtst = substr(`d',4,1)
            di "`dtst'"
    And this is the output I get:

    C:\TempStata
    db_a.xls
    db_a: operator invalid

    (I would like to have `dtst'="a")

    many many many thanks in advance
    Pedro
    Last edited by Pedro GarciaDeLeon; 20 Jan 2017, 09:39.

  • #2
    Code:
    local files "DB_A.xls DB_D.xls DB_K.xls DB_X.xls DB_Y.xls"
    foreach w in `files' {
      display substr(`"`w'"',4,1)
    }
    Produces:
    Code:
    A
    D
    K
    X
    Y

    Comment


    • #3
      You left out the manuals. The recommended search order is always

      help
      manuals
      StataCorp FAQs
      Statalist forum
      ...
      general Google

      This may help. I can't see what the present working directory has to do with your code as the list of filenames is in the macro first created.

      Code:
      local dtfl : dir . files "DB_?.xls"
      cd "C:/TempStata"
      
      foreach d in `dtfl' {
           clear
           di "`d'"
           local work : subinstr local d ".xls" ""
           local work : subinstr local work "DB_" ""
           di "`work'"
      }
      EDIT:

      Sergiy's code is better. He exploits the fact that only a single letter is needed.

      As for what's wrong in the original, it's a lack of " " to indicate a literal string.

      Last edited by Nick Cox; 20 Jan 2017, 09:27.

      Comment


      • #4
        Wow, thanks so much for the hyper fast replies. I did browse manuals but with no luck or enough clarity. I just tried the quoting as follows but I still get the 101 error (varlist not allowed). Could it be because the content in `dtfl' dynamic?

        Code:
            
        di `dtfl'
        foreach d in `dtfl' {
                clear
                cd "C:/TempStata"
                di "`d'"
                display substr(`"`dtst'"',4,1)
        And the output is:

        db_a.xlsdb_b.xlsdb_c.xlsdb_d.xls
        C:\TempStata
        db_a.xls
        varlist not allowed
        r(101);

        EDIT:

        Sorry, my mistake. It works up to the last display but then when I try to use `dtst' I have the varlist error again. This is the whole loop:

        Code:
            foreach d in `dtfl' {
                clear
                cd "C:/TempStata"
                di "`d'"
                *local nmd subinstr(`d',".xls","")
                local dtst = substr(`"`d'"',4,1)
                *display substr(`"`d'"',4,1)
                
                display "`dtst'"
                local Afilenames : dir "." files  $pub$edyear"_"`"`dtst'"'*.dta
                display `Afilenames'
        
                gen blankvar=.
                save C:/TempStata/`d'_SW, replace
        
                * for each data in .dta files import it to the blank dataset
                foreach data in `Afilenames' {
                    display "`data'"
                    use C:/TempStata/`d'_SW, clear
                    append using `data', force
                    save C:/TempStata/`d'_SW, replace
                    }
                drop blankvar
                }
        Last edited by Pedro GarciaDeLeon; 20 Jan 2017, 10:58.

        Comment


        • #5
          Well, no one recommended that code, or you are only citing part of what you ran, or both. I note that nowhere in your own code is the local dtst defined. Even before I would not expect that

          Code:
          di `dtfl'
          would work at all, as you need surrounding " " to stop Stata trying to interpret your input as variable or scalar names.

          Comment


          • #6
            Sure fixed that, thank you.

            So, it worked up to

            Code:
             
             display "`dtst'"
            When I try to look for other files using that letter in the next line:
            Code:
              
             local Afilenames : dir "." files  $pub$edyear"_"`"`dtst'"'*.dta
            it gives me the 101 error, I reproduced the triple quoting suggested by Sergiy but I must be missing something...

            Comment


            • #7
              Sorry, all fine now. And thanks a lot again. I fixed my last problematic line doing proper quoting:

              Code:
               
               local Afilenames : dir "." files "$pub$edyear_`dtst'*.dta"

              Comment

              Working...
              X