  • Merging one variable with the same name from multiple files

    Hi,

    I have 8200 files, each with two variables: the date (timestamp) and the number of people holding the specific stock (users_holding).

    The problem is that when I merge two files, the variable "users_holding" has the same name in both, and I would like each variable to be named after the file it comes from.

    So, say I would like to merge Microsoft and Apple: what I would like to do, if possible, is rename the variable "users_holding" to Microsoft in one file and to Apple in the other.

    Since I am rather inexperienced with Stata, I would like to ask whether a foreach loop could merge the "users_holding" variables from all 8200 files while renaming each variable to the name of the file it is merged from.

    I apologize if I have violated any rules in creating this post.

    I would like to say thank you very much in advance,

    Mathias Sorensen

  • #2
    And are any of the file names longer than 32 characters? Do any of them have spaces, dashes, or other non-alphabetic characters in the name? Do any of them start with a digit rather than a letter?

    If so, they are not suitable as variable names.

    Without knowing what you are trying to accomplish, I will suggest that appending the 8200 datasets rather than merging them would be more likely to leave you with a dataset more usable in your subsequent analyses. What you are describing is a "wide" layout of your data, and what I am describing is a "long" layout. The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data.
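
    As a rough illustration of the two layouts, the sketch below builds a tiny long dataset and converts it to wide with reshape. The values are made up for the example, and the string variable stock holding a ticker is an assumption, not something present in the original files. It also assumes timestamp is unique within each stock.

    ```stata
    * Toy long-layout data: one row per stock per date (values are invented)
    clear
    input str4 stock str10 timestamp users_holding
    "MSFT" "2021-07-01" 100
    "MSFT" "2021-07-02" 110
    "AAPL" "2021-07-01" 200
    "AAPL" "2021-07-02" 190
    end

    * Long -> wide: one row per date, one users_holding* variable per stock
    reshape wide users_holding, i(timestamp) j(stock) string
    list
    ```

    Because the long layout can be converted to wide this way when a wide table is genuinely needed, appending first does not close off the layout described in the original question.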



    • #3
      Hi William

      Thank you very much for your answer/question.

      The file names are between 1 and 5 letters long, so they should be suitable as variable names.

      That is a very interesting point, and I will look into appending rather than merging, in case it makes my analyses much smoother.

      Thank you once again,

      Mathias Sorensen

      • #4
        May I make a suggestion?

        In figuring out how best to do what you need to do, start by making a directory containing copies of, say, 5 of your input files, and develop your code using just those files, be it by merging (you can just rename the variable in each of the 5 datasets manually) or by appending. You can use the results to begin developing your analysis, and it will be ever so much easier to learn how to use this data in a long layout if you aren't overwhelmed by the size of your data.

        Start small and figure out the analysis you need to do, then see if it scales up to a larger solution. You have to walk before you can run.

        • #5
          Hi William

          Thank you very much for your advice. I first tried to append 10 files, which went smoothly, but now that I want to append more files, I keep getting the error r(110):


          clear
          save appended, emptyok
          local filelist: dir . files "*.dta"
          foreach file of local filelist {
              use `"`file'"', clear
              generate stock = `"`file'"'
              append using appended.dta
              save appended.dta, replace
          }
          [Attached image: Screenshot 2021-07-05 140507.png]



          What I want is a variable that contains the name of the file, so I know which company the data belongs to.

          Can you spot what is wrong with my code?

          Mathias Sorensen

          • #6
            Your problem is that you have saved your appended data into the same directory as your unappended data, so the local filelist includes the output file as well as all the input files. Perhaps
            Code:
            erase appended.dta
            local filelist: dir . files "*.dta"
            clear
            save appended, emptyok
            ...
            will resolve your problem.

            • #7
              I just tried to run the code below

              Code:
              erase appended.dta
              local filelist: dir . files "*.dta"
              clear
                save C:\Users\mathi\Desktop\Bachelor\Data\appended.dta, emptyok
              local filelist: dir . files "*.dta"
              foreach file of local filelist {
                use `"`file'"', clear
                generate stock = `"`file'"'
                append using appended.dta
                save C:\Users\mathi\Desktop\Bachelor\Data\appended.dta, replace
              }
              and I am still getting error r(110).

              It appends about 200 files and then I get the error.

              • #8
                Remove the second local filelist command. You have recreated the list of files to include appended.dta. The point is to create the file list before the appended.dta file is created so that it doesn't appear in the list, and not replace the file list after appended.dta is created.
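
                Putting the advice in this thread together, a minimal sketch of the loop might look like the following. It assumes the working directory is the data folder and that no other non-input .dta files live there. The capture prefix on erase is an addition not in the original posts, so the script does not stop if appended.dta does not exist yet. The r(110) error ("already defined") presumably arises because generate stock fails once the loop opens appended.dta, which already contains stock.

                ```stata
                * Remove any output left over from an earlier run (ignore error if absent)
                capture erase appended.dta

                * Build the file list BEFORE the output file exists,
                * so appended.dta is never treated as an input
                local filelist : dir . files "*.dta"

                * Start with an empty accumulator dataset
                clear
                save appended, emptyok

                foreach file of local filelist {
                    use `"`file'"', clear
                    * Record which file (company) each observation came from
                    generate stock = `"`file'"'
                    append using appended
                    save appended, replace
                }
                ```

                The key point is the order: the file list is created once, before appended.dta is saved, and is not recreated afterwards.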

                • #9
                  Thank you very much, William, it worked like a charm!
