Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • xls2dta update available from SSC

    Thanks as always to Kit Baum, an updated version of xls2dta is now available from the SSC. To install the update type in Stata

    Code:
    ssc install xls2dta
    or use the adoupdate command.

    xls2dta is a wrapper for import excel and converts multiple Excel files and/or worksheets to Stata datasets. The update features a completely revised syntax, conceptualizing xls2dta as a prefix command, and mainly addresses the combination of these Excel files, a problem repeatedly discussed on Statalist (cf. this recent example). The solution is easy enough and essentially the same loop has been posted by different people in different places.

    Drawing on the example given above, if we want to convert all worksheets in an Excel file, say myxlsfile.xls, to Stata datasets, then append them into one file, we could type something like

    Code:
    xls2dta ,save(c:/mydtafiles) allsheets : import excel myxlsfile.xls
    xls2dta ,clear : append
    The first line of code saves all worksheets in myxlsfile.xls as Stata datasets in the directory c:/mydtafiles. The names will be myxlsfile_1.dta, myxlsfile_2.dta, …. The second line clears the data in memory, appends all files converted before and loads the combined dataset.

    We could also have restricted the subset of worksheets to be converted (as requested here). Say we would like to append the first ten worksheets only. We would then type

    Code:
    xls2dta ,clear sheets(1/10) : append using myxlsfile.xls
    By specifying a filename with the append command, we do not save the individual worksheets as Stata datasets, but create and load the appended file on the fly. I should mention that xls2dta supports merge and joinby, too.

    If, instead of worksheets, we wanted to convert multiple Excel files (as requested here), we could do so specifying a directory name with import excel

    Code:
    xls2dta ,save(c:/mydtafiles) : import excel c:/myxlsfiles/
    This line converts all .xls and .xlsx files in c:/myxlsfiles to Stata datasets.

    Another new feature of xls2dta relies on Robert Picard's filelist (SSC) and searches directories as well as their subdirectories (as requested here). Extending the above example to include all .xls and .xlsx files in any subdirectory found in c:/myxlsfiles, we would add the new recursive option

    Code:
    xls2dta ,save(c:/mydtafiles) recursive : import excel c:/myxlsfiles/
    After converting the files, we might want to do some additional data management tasks. For example we might want to convert string variable A to numeric variable numeric_A in each dataset. Since xls2dta saves filenames in s(), we could type

    Code:
    local N = s(n_dta)
    forvalues n = 1/`N' {
        use "`s(dta_`n')'" ,clear
        destring A ,generate(numeric_A)
        save “`s(dta_`n')'” ,replace
    }
    This basic loop is implemented in xls2dta and we can get the same result typing the one line

    Code:
    xks2dta : xeq destring A ,generate(numeric_A)
    I hope some of you will find the command useful.

    Best
    Daniel
    Last edited by daniel klein; 07 Jun 2015, 11:11. Reason: Post got messed up due to failing internet connection

  • #2
    Thanks Daniel. The merge and append options are extremely useful additions.

    Kind regards,
    Imed.

    Comment

    Working...
    X