Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • importing data from excel sheets

    Hello dear Stata users,
    I want to import data from excel to Stata,
    My data is in one excel file but in different sheets (sheet1, sheet2, sheet3.........................sheet50)
    If i do it one by one it will take lots of time.
    can you help me how to import using loop or any other short way.

    Thank you : )

  • #2
    If the sheets have these names, then the basic loop is

    Code:
    forvalues j = 1/50 {
        import excel using my_excel_file ////
        , sheet("Sheet`j'") clear
        save my_dta_file_`j'
    }
    This is implemented in xls2dta (SSC), which also makes it comfortable to combine the imported files and more.

    Code:
    ssc install xls2dta
    cd wherever_the_excel_file_is
    xls2dta , allsheets : import excel using my_excel_file
    Best
    Daniel

    Comment


    • #3
      thank you daniel klein , you are so kind.

      these commands save every sheet's data in a separate Stata file.
      i want a way to import data from all sheets in one Stata file

      Last edited by Milad Ahmad; 25 Apr 2016, 06:03.

      Comment


      • #4
        i want a way to import data from all sheets in one Stata file
        That is not enough information. How do the files fit together? Look at

        Code:
        help append
        help merge
        and try to figure out which one you want, or if you want something else entirely. Sticking with xls2dta you can then

        Code:
        // import/convert the files
        xls2dta , allsheets : import excel using my_excel_file
        // combine the files into one
        xls2dta , clear : append
        where you, naturally, put merge instead of append, should this be what you want. You can, alternatively, skip the conversion of each individual file

        Code:
        xls2dta , clear allsheets : append using my_excel_file
        Best
        Daniel

        Comment


        • #5
          Thank you
          Last edited by Milad Ahmad; 25 Apr 2016, 21:42.

          Comment


          • #6
            Hi!

            I am trying to do a similar thing using xls2dta. I am trying to merge 4 different time points of data collection from excel for each participant but getting an error that "variable participant not found". Also, it saves all the sheets separately instead of as one dataset. When I open it, I notice that it took the excel columns A, B, C instead of firstrow as my variable names. However, I cannot seem to successfully insert "firstrow"

            This is the coding that I tried to use:

            xls2dta , allsheets : import excel using "C:\Projects\Combined in Excel 3-9-2017.xlsx"
            xls2dta , clear : merge 1:1 participant


            Comment


            • #7
              But how have you tried to insert firstrow? The import excel call really works just like it would without the xls2dta prefix

              Code:
              xls2dta , allsheets : import excel "C:\Projects\Combined in Excel 3-9-2017.xlsx" , firstrow
              If you want one file directly, then

              Code:
              xls2dta , clear allsheets importopts(firstrow) : merge 1:1 participant using "C:\Projects\Combined in Excel 3-9-2017.xlsx"
              By the way, with four time points for each participant, you are more likely to want append rather than merge - but this is a guess.

              Best
              Daniel
              Last edited by daniel klein; 10 Mar 2017, 06:32.

              Comment


              • #8
                Thank you for your help daniel klein! However, it says "participant not found" using your second code. My last iteration was: xls2dta , allsheets: import excel using "Combined in Excel 3-9-2017.xlsx", clear firstrow and it worked to get the correct variable names, but it was in separate .dta documents instead of combining. I think it needs to be wide as I want: Particpant 1 Score1T1 Score1T2 Score 1T3 age sex etc.
                Have a good weekend,
                Noemi

                Comment


                • #9
                  Originally posted by Noemi Kiss View Post
                  but it was in separate .dta documents instead of combining.
                  You can combine the separate datasets with

                  Code:
                  xls2dta , clear : merge 1:1 participant
                  right after importing the files.

                  I think it needs to be wide as I want: Particpant 1 Score1T1 Score1T2 Score 1T3 age sex etc.
                  I cannot comment on this as I do not see your data. But if the variable names in the Excel files are already in this form, i.e. Participant1, Scorte1T1, ... then Stata cannot find participant. Also note that Stata is case sensitive.

                  Perhaps you will need to loop over the separate imported files to change the variable names to what you want. Maybe appending then reshaping would be easier. It all depends on the details.

                  Best
                  Daniel

                  Comment


                  • #10
                    If there is no variable participant, Stata cannot merge using such a variable. You will have to explain how the participant can be identified. Is this info found in the sheet name, ordering of observations, or?

                    Comment


                    • #11
                      Originally posted by daniel klein View Post

                      You can combine the separate datasets with

                      Code:
                      xls2dta , clear : merge 1:1 participant
                      right after importing the files.
                      Dear Daniel,

                      Suppose that I have two excel files (In reality, many excel files), test1.xslx and test2.xslx in the directory E:\test. The first row of two files denotes the variables names. In terms of Stata format, they look like
                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input str5 country int year double var1 int var2
                      "China" 1993 5.76 764
                      "China" 1994 8.62 863
                      "China" 1995 8.35 958
                      "China" 1996 8.31 105
                      "China" 1997 8.29 115
                      "USA"   1993    1 875
                      "USA"   1994    1 910
                      "USA"   1995    1 936
                      "USA"   1996    1 971
                      "USA"   1997    1 102
                      end
                      and
                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input str5 country int year double score
                      "China" 1995 51.97
                      "China" 1996 51.32
                      "China" 1997 51.68
                      "China" 1998  53.1
                      "China" 1999 54.76
                      "USA"   1995 76.65
                      "USA"   1996 76.72
                      "USA"   1997 75.59
                      "USA"   1998 75.43
                      "USA"   1999 75.54
                      end
                      How can I merge these two files (1:1 country year) using your interesting command "xls2dta"? (starting from the excel format!)
                      Ho-Chuan (River) Huang
                      Stata 19.0, MP(4)

                      Comment


                      • #12
                        xls2dta is not really intended to work with a small number of selected files. This can easily be done with import excel. I will thus show how to combine all files in your folder, although I think I have already done this before. In a two-step process, where you first convert the individual files you type

                        Code:
                        xls2dta : import excel e:/test , firstrow
                        xls2dta , clear : merge 1:1 country year
                        Skipping the first step, i.e. not converting and saving the individual files

                        Code:
                        xls2dta , clear importopts(firstrow) : merge 1:1 country year using e:/test
                        Obviously, not all cases will be matched as some of the country year combinations do not exist in all files.

                        Best
                        Daniel

                        Comment


                        • #13
                          Originally posted by daniel klein View Post


                          Also note that Stata is case sensitive.

                          Silly me, it was "Participant" and not "participant". The original code works now :

                          Code:
                          xls2dta , clear allsheets importopts(firstrow): merge 1:1 Participant using "Z:\Projects\Combined in Excel 3-9-2017.xlsx"
                          It combines the excel sheets in the excel document using "Participant" and creates a combined Stata dataset.


                          Thank you Daniel!

                          Comment


                          • #14
                            Dear daniel, I didn't see this reply until now. Thanks a lot.
                            Ho-Chuan (River) Huang
                            Stata 19.0, MP(4)

                            Comment


                            • #15
                              Dear daniel, I feel that this -xls2dta- command is so helpful. Thus, in addition to its help file, do you have any plan to put more examples (say, with artificial excel files) in a website?

                              Ho-Chuan (River) Huang
                              Stata 19.0, MP(4)

                              Comment

                              Working...
                              X