Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    Moses: Please show the code you actually used!

    Comment


    • #17
      Sure, current version of the code is:
      Code:
      local files : dir . files "*"
      display `"`files'"'
      
      foreach f of local files {
          import excel using `"`f'"', sheet("Sheet1") firstrow
          save "`"`f'"'.dta", replace
          clear
      }
      And I am currently working on a way to deal with those quotation marks in the local list items. Going to try editing them out using -substr- or similar, or perhaps the extended functions within -display- . I would welcome any suggestions!

      Comment


      • #18
        Sorry I didn't notice the error in your save command.

        Code:
        // NEED TO REMOVE .xlsx SUFFIX FROM `f' BEFORE SAVING
        local save_name: subinstr local f ".xlsx" ""
        save `"`save_name'"'
        Notes:
        You don't need to specify a .dta suffix, and given that there are quotation marks around the filenames, when you do specify it you get things like "blahblah".dta as a filename, which confuses the parser.
        If .dta files with these name already exist, you need to add the -replace- option to the -save- command.

        Comment


        • #19
          Thank you very much, -local subinstr- worked as advertised. Program is running smoothly now! For reference, final code is:
          Code:
          local files : dir . files "*"
          display `"`files'"'
          
          foreach f of local files {
              import excel using `"`f'"', sheet("Sheet1") firstrow
              local save_name : subinstr local f ".xlsx" ""
              save `"`save_name'"', replace
              clear
          }
          Now, on to actually working with these files! Thanks again.

          Comment


          • #20
            Dear Clyde,

            I'm using your rational to loop over different files, selecting only some variables. The problem is that my final result - after the append - is not correct (i.e attributes region Kabul to US). Here is my code:


            Code:
            local files : dir . files "*.dta"
            
            local dir1 "C:\Users\Cliente\Downloads\Josselin\DHS2"  
            
            local vars hv024 hv000 hv007
            tempfile building
            save `building', emptyok
            
            foreach f of local files {
                     use `"`f'"', clear
                     sort `vars'
                     keep `vars'
                     append using `building'
                     save `"`building'"', replace
            }
            *
            label data "jj"
            save `"`dir1'/jj.dta"', replace
            Thanks in advance.

            Comment


            • #21
              There is only one thing that strikes me about the code that might produce this problem.

              The logic of the loop depends on there being no data set in active memory at the start when you -save `building', emptyok-. If you already have a data set in memory when you issue that command, that data will be, incorrectly, carried forward through the loop. So make sure you -clear- memory first.

              Other than that, the code looks correct. If that is not the problem, I suggest you post back with excerpts from your data files, posted using the -dataex- command, so that somebody can experiment with your data and try to troubleshoot it, as it seems more likely that the problem is with the data than with the code.

              Also you will need to clarify what you mean by "attributes region Kabul to US." There is nothing in the code that mentions, Kabul, the US, or the word region, so it is completely unclear what you are talking about there.

              Comment


              • #22
                Clyde Schechter I'm using a version of your code above and I get a r(601) error with "file alwal_oct22_feb23.xlsx not found" at the very first file in the Raw Data folder. I'd appreciate your help in understanding what I'm doing wrong here

                The file names are: alwal_oct22_feb23, gmrv_oct22_feb23, kar_oct22_feb23, kvc_oct22_feb23, mtc_oct22_feb23....


                Code:
                clear
                tempfile building
                save `building', emptyok
                
                local filenames: dir ""C:\Users\katja\Dropbox\Raw Data"  files "*.xlsx*"
                
                
                foreach f of local filenames {
                import excel using `"`f'"', sheet("Sheet1") firstrow
                gen source = `"`f'"'
                display `"Appending `f'"'
                append using "`building'"
                save `"`building'"', replace
                }
                
                export excel using 1_large_excel_file.xlsx, replace

                Comment


                • #23
                  It is always difficult to troubleshoot "file not found" problems remotely due to lack of access to your actual disk drive and very limited knowledge of how the files and directories are organized. But most likely the problem in this case is this. The contents of local macro filenames contain only the filenames, without the path. The fact that you specified a path in the command that created it suggests to me that the path with those files is not the current working directory. But when you then get to your -import excel- command, you are working in that directory. This means that -import excel- is looking in the wrong directory, so the file is actually not there.

                  So first figure out what your current working directory is. The command -pwd- will show it to you. Assuming the response is not "C:\Users\katja\Dropbox\Raw Data", then you have two choices. You can switch to that directory using the -cd- command (-help cd- if you are not familiar with it). Or, you can stay where you are and modify the -import excel- command to -import excel using `"C:\Users\katja\Dropbox\Raw Data/`f'"', sheet("Sheet1") firstrow-.

                  By the way, I am quite puzzled that the code you show runs at all. -local filenames: dir ""C:\Users\katja\Dropbox\Raw Data" files "*.xlsx*"- contains a surplus " after -dir-. This should cause Stata to throw some kind of error (probably invalid syntax) and stop there.

                  Comment


                  • #24
                    Scott Rick is cross-posting here without telling us about it, which is not a good idea.

                    See https://www.statalist.org/forums/for...es-error-r-601

                    There is already some duplication of effort here -- note that I already raised the question of whether Scott is working in the same directory as the dta files -- and the answer was evidently Yes,

                    Comment


                    • #25
                      Clyde Schechter and Nick Cox: I apologize for cross-posting. I had modelled my code along similar lines from another thread that I lost. When I came across this, I thought it might be better to post here as well. Clearly I was wrong. The issue did not appear to be the working directory and I'm still not sure what the problem was. I modified the code to the below which did work, while the directory remains the same. However, the file appears to be saved as a .tmp file in a different directory that I haven't mapped. And when I export to excel, I get a r(102) - too few variables specified - error. I'd greatly appreciate any guidance here.

                      Code:
                      
                      clear
                      tempfile building
                      save `building', emptyok
                      
                      cd "C:\Users\katja\Dropbox\Raw Data"
                      
                      local filenames :dir . files "*"
                      
                      
                      foreach f of local filenames {
                      import excel using `"`f'"', sheet("Sheet1") firstrow
                      gen source = `"`f'"'
                      display `"Appending `f'"'
                      tostring age, replace 
                      append using "`building'"
                      save `"`building'"', replace
                      clear
                      }
                      
                      Appending alwal_oct22_feb23.xlsx
                      age was byte now str2
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending gmrv_oct22_feb23.xlsx
                      age already string; no replace
                      (note: variable Consultant was str6, now str8 to accommodate using data's values)
                      (note: variable source was str21, now str22 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending kar_oct22_feb23.xlsx
                      age already string; no replace
                      (note: variable Hospital was str11, now str13 to accommodate using data's values)
                      (note: variable source was str20, now str22 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending kvc_oct22_feb23.xlsx
                      age already string; no replace
                      (note: variable Hospital was str10, now str13 to accommodate using data's values)
                      (note: variable source was str20, now str22 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending madhapur_oct22_feb23.xlsx
                      age was byte now str2
                      (note: variable Hospital was str8, now str13 to accommodate using data's values)
                      (note: variable age was str2, now str8 to accommodate using data's values)
                      (note: variable Visit was str3, now str6 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending mtc_oct22_feb23.xlsx
                      age already string; no replace
                      (note: variable Hospital was str12, now str13 to accommodate using data's values)
                      (note: variable source was str20, now str25 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      Appending set_oct22_feb23.xlsx
                      age already string; no replace
                      (note: variable Hospital was str9, now str13 to accommodate using data's values)
                      (note: variable age was str6, now str8 to accommodate using data's values)
                      (note: variable Consultant was str7, now str8 to accommodate using data's values)
                      (note: variable source was str20, now str25 to accommodate using data's values)
                      file C:\Users\katja\AppData\Local\Temp\ST_8bc_000001.tmp saved
                      
                      . 
                      . export excel using 1_large_excel_file.xlsx, firstrow(variables) replace
                      too few variables specified
                      r(102);

                      Comment


                      • #26
                        Thanks for #25. Although your earlier problems still appear mysterious, your present puzzlement can be explained from your code.

                        1.
                        However, the file appears to be saved as a .tmp file in a different directory that I haven't mapped
                        Indeed. Stata is doing what you asked. tmpfile doesn''t create a file but it creates a name that is of a file that may be created and that will be in a directory or folder for such files.

                        If I go

                        Code:
                        tempfile foo 
                        di "`foo'"
                        I see a similar name to yours in such a directory on my machine. The .tmp extension doesn't override the file being equivalent to a .dta file as produced by save and as modified by append.

                        2.
                        r(102) - too few variables specified - error
                        Your loop issues a clear at the end of each iteration, so when the loop is over you have no data in memory. You need (a) to read the dataset back in again, or otherwise (b) not to clear on the final iteration, which is harder to program. So

                        Code:
                        use "`building'"
                        before you try the export excel.



                        .

                        Comment


                        • #27
                          Thanks Nick Cox. That worked perfectly!

                          Comment

                          Working...
                          X