Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Looping all files in the folder

    I have all .xlsx files in a folder.
    I am trying to clean the data with simple commands.

    I could successfully do it one file by one.
    My command is below:

    //
    import delimited "C:\Users\phh84\Google Drive\Data\barley_2016.csv", clear
    keep year period state agdistrict county commodity dataitem value

    export excel using "C:\Users\phh84\Google Drive\Data\cleaned data\barley_2016.xlsx", sheet("forgams") sheetreplace firstrow(variables)
    //

    So I'm trying to do all the files in this folder. I tried below command:

    //
    local files : dir "C:\Users\phh84\Google Drive\Data\cleaned data" files "*.xlsx"

    cd "C:\Users\phh84\Google Drive\Data\cleaned data"

    foreach file in `files' {
    import delimited `file', clear
    qui keep year period state agdistrict county commodity dataitem value
    qui export excel using 'file', sheet("forgams") sheetreplace firstrow(variables) replace
    }
    //
    I get below error message.
    I don't know why I get this error.

    Note: 431 binary zeros were ignored in the source file. The first instance occurred on line 1. Binary zeros are not valid in text data.
    Inspect your data carefully.
    (27 vars, 442 obs)
    variable year not found
    r(111);

    Thanks.

  • #2
    Here's the problem. Your macro `files' is a list of .xlsx files, but you are trying to read them with -import delimited-. According to what you said, you actually want to start with .csv files. So it should be:

    Code:
    local files : dir "C:\Users\phh84\Google Drive\Data\cleaned data" files "*.csv"
    
    cd "C:\Users\phh84\Google Drive\Data\cleaned data"
    
    foreach file in `files' {
        import delimited `file', clear
        qui keep year period state agdistrict county commodity dataitem value
        local excel_name: subinstr local file ".csv" ".xlsx"
        qui export excel using 'excel_name', sheet("forgams") sheetreplace firstrow(variables) replace
    }
    Note that if your filenames contain embedded blanks, then you may need to bind all references to `file' and `excel_name' in quotes.

    Comment


    • #3
      Just a style tweak: if the cd is done first, then you don't need to specify the directory (except as current).

      Comment


      • #4
        Just wondering why you would use Stata to export to Excel again? If you plan on doing any serious analysis keep the data in Stata. For related problem also see xls2dta (SSC).

        Best
        Daniel

        Comment


        • #5
          Clyde Schechter Thanks a lot. I got hint form your precious comment. Now everything is squared.

          Comment

          Working...
          X