Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to merge different Excel File.

    Hello Sir,

    I have 4 separate -excel files required to merge in one excel file. This is the longitudinal data sets but it has a different range of individuals. The individuals and duration vary each other. For example, One excel sheet has 38 individuals and 30 years but others have 50 individuals and 35 years. I need to sort common individuals and the number of years, that make the balanced data.

    Noted that my professor told me to use the following code to sort the data.
    NACE REV. 2 - ISIC REV. 4
    Source Target
    gen nace==. A A
    replace nace= 1 if isic== 1

    I don't have any idea how I can do it. Pls, thanks for your kind cooperation.

  • #2
    I am unable to follow your explanation of the problem. I don't understand what these spreadsheet files contain, nor what you want for the final result.

    Perhaps somebody else will understand and respond. But if you don't receive a helpful response from somebody else in the course of a day, here's what I recommend you do:

    You will need to import all of these into Stata as a first step no matter what. So use the -import excel- command to do that. Then use the -dataex- command to post examples of the data from each of those four files. Then finally, make up, by hand, a Stata data set that looks like the result you want to get, and use -dataex- to post an example of that. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      Dear Sir,

      Uploaded two sample sets. Variables are Export index and Impor Index contained data for manufacturing industries. I need to sort these data sets to make a common & compatible excel file for Stata operation.
      Attached Files

      Comment


      • #4
        Thank you, This is not quite how -dataex- should be used: the output of -dataex- should have been copied to your computer's clipboard and then pasted into the Forum editor from there; the screenshot is much harder to work with (and for more complicated situations it would be altogether useless). In the future, please bear that in mind.

        Fortunately, your screenshots are readable on my setup, and they happen to provide enough information for me to give you some advice about your original question.

        The first thing that strikes me is that you are not using the -import excel- command in the best way. In the screenshots it appears that the first row of the spreadsheets contain variable names rather than actual data. You can do better by specifying the -firstrow- option, and Stata will then give you the contents of the first row as variable names (where there is such content). The other thing that becomes clear to me is that the organization of your data calls for combining these files with the -append- command, not -merge-.

        I don't know what the names of the four files are. For the code below, I illustrate by assuming their names are one, two, three, and four, respectively (without the .xls or .xlsx extension).

        Code:
        clear*
        local filenames one two three four
        tempfile building
        save `building', emptyok
        
        foreach f of local filenames {
            import excel using `"`f'.xlsx"', firstrow case(lower) clear
            append using `building'
            save `"`building'"', replace
        }
        At the end of this code, you will have all four spreadsheets combined into a single data set in active memory, which you can then save as a data file.

        There are a few things you will need to do to make the data more usable. The dates in your data set are not workable as they are. You will need to fill in the missing values in the variable that denotes the year. Then you will want to combine the year and quarter into a real Stata internal format quarterly date. I think the year will still be called D and the quarter will be called E with the code above. And I believe that D will come in as a numeric variable rather than a string. If so, the code to do this will be:

        Code:
        replace D = D[_n-1] if missing(D)
        replace E = subinstr(E, " Quarter", "", .)
        destring E, replace
        gen int qdate = yq(D, E)
        assert missing(qdate) == missing(D, E)
        format qdate %tq
        order qdate, before(D)
        drop D E
        Finally, be aware that when you are putting together a series of files, even if they come from a very reliable source of well-curated data, it often turns out that there are little inconsistencies among them which make problems. Sometimes a variable that is numeric in one of the data sets is a string in the other, perhaps due to a stray character, or sometimes for good reasons. Sometimes the name of the variable is slightly different: difference in capitalization or spelling or some other minor issue. With only four data sets, there is a reasonable chance that you won't have any problems like this. But don't be shocked if you do: be prepared to make fixes.

        Comment


        • #5
          Thanks for your support

          Comment

          Working...
          X