  • loop over several dta files

    Hi. I am new to looping over files, so my question may be trivial, but I have searched for answers and not found much help. I appreciate your help and feedback.

    I have 60 country .dta files that look like this:

    Angola_2006_2010_Panel.dta
    Colombia_2006_2010_Panel.dta
    India_2007_2012_Panel.dta
    .
    .

    I want to run the levpet command in Stata, and I know how to do it individually for each country. This is the code I use for that after cleaning the data, replacing zeros, creating logs, and so on.

    tsset id year
    keep id year lnr lnv lnm lnl lnk lni lne lna M R V L K I E A

    levpet lnv , free( lnl ) proxy( lne ) capital( lnk ) valueadded

    I am really lost on how to write simple code that reads all these files in a loop instead of handling each country individually. I have tried to append the files, but for some countries I get error messages. Is there a way to do this?

    Thanks for your help.

    -Sara


  • #2
    What error message do you get when you append files?



    • #3
      That a variable (a1) is str10 in the using data, or that a variable (d1a1x) is byte in the using data. It works fine for some files, but not for all.



      • #4
        You're going to need to do some work on your datasets to make sure that all the common variables are of the same type across all the datasets.
        That is, a1 needs to be string in all the datasets to be able to append them. See help destring and help tostring for more information on changing variable types.

        Alternatively, you may be able to get away with just keeping the variables you need for the analysis.

        That would look something like
        Code:
        use Angola_2006_2010_Panel.dta, clear
        keep id year lnr lnv lnm lnl lnk lni lne lna M R V L K I E A
        append using Colombia_2006_2010_Panel.dta
        keep id year lnr lnv lnm lnl lnk lni lne lna M R V L K I E A
        append using India_2007_2012_Panel.dta
        keep id year lnr lnv lnm lnl lnk lni lne lna M R V L K I E A
        As long as your main variables of interest are all of the same type you shouldn't have any problem. Of course if you decide to do additional analyses you may find there are other variables you needed.
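The same idea can be written as a loop. The sketch below is only an illustration: it assumes the files sit in the current working directory, uses the three file names from the first post as a shortened list, and relies on the `keep()` option of `append` so that each file contributes only the analysis variables. Note that if `id` values repeat across countries, a country identifier would still be needed before running `tsset` on the combined data.

```stata
* Sketch: append several country files, keeping only the analysis variables.
* The file list is shortened; extend it to all 60 countries as needed.
local vars  id year lnr lnv lnm lnl lnk lni lne lna M R V L K I E A
local files Angola_2006_2010_Panel Colombia_2006_2010_Panel India_2007_2012_Panel

clear
foreach f of local files {
    * keep() limits what is read from the using file, so variables such as
    * a1 that differ in type across files never enter the combined data
    append using "`f'.dta", keep(`vars')
}
```

This will still stop with an error if one of the listed variables is missing from a file or differs in type between files.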



        • #5
          Thanks. I will try to destring the main variables of interest and see if that works. However, is there a way to use a forvalues or foreach loop here? Otherwise, it will take me a long time to get results for 60 datasets.



          • #6
            You don't have to append the files to run the regressions individually by country, but you could with code similar to the below. Best, Sergiy Radyakin

            Code:
            clear all
            
            local folder "C:/temp/"
            local vars "price weight length"
            local files "`c(Mons)'" // lazy list
            
            // data preparation
            foreach f in `files' {
              sysuse auto, clear
              save "`folder'`f'.dta", replace
            }
            
            //simulate problem with different data types for irrelevant variables
            generate mstr=string(mpg)
            drop mpg
            rename mstr mpg
            save, replace
            // data preparation complete. TS should have written the code above.
            count  // 74
            
            local w1 `"`: word 1 of `files''"'
            use "`folder'`w1'.dta", clear
            foreach f in `files' {
              if (`"`f'"'==`"`w1'"') continue
              append using `"`folder'`f'"'
              keep `vars'
            }
            
            count  //74*12=888
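For the case where no appending is needed, a per-country loop might look like the sketch below. It is not part of the code above: it assumes the working directory holds the `*_Panel.dta` files and that each file already contains the cleaned variables from the first post. The `levpet` call is copied from that post.

```stata
* Sketch: run the same estimation on each country file in turn.
* Assumes every file already holds id, year, lnv, lnl, lne and lnk.
local files : dir "." files "*_Panel.dta"

foreach f of local files {
    * print the file name so each block of output can be traced to its country
    display as text _newline "==== `f' ===="
    use "`f'", clear
    tsset id year
    * capture noisily: show any error but carry on with the next file
    capture noisily levpet lnv, free(lnl) proxy(lne) capital(lnk) valueadded
}
```

Any data cleaning that has to happen per file would go between `use` and `tsset`.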



            • #7
              I agree with Sarah E. that it sounds like part of the problem is that the files aren't clean, e.g. in some files variables are strings while in others they aren't. Personally, I would try to get the 60 files cleaned up first. Then it would probably be easy to do what you want. Barring that, you could probably tweak Sergiy's code to fix things as needed, e.g. you could check to see if a variable is string and if so convert it to numeric.
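A minimal sketch of such a check, assuming the variable is `a1` (the name reported in #3) and using one of the file names from #1 purely as an example:

```stata
* Sketch: make a1 numeric if it is stored as a string in this file.
use "Colombia_2006_2010_Panel.dta", clear

capture confirm string variable a1
if _rc == 0 {
    * a1 is a string here; destring leaves it unchanged (with a note)
    * if it contains nonnumeric characters
    destring a1, replace
}
```

The same lines could be placed inside a loop over files, before any `append`.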
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://academicweb.nd.edu/~rwilliam/



              • #8
                Thanks. I will see how far I get with these suggestions.



                • #9
                  Hi all. Good morning.
                  I have been looping over 14 datasets to generate frequency tables with the fre command.
                  Everything works, although sometimes a dataset does not have the variable and the command fails for it; that is not a problem.
                  What I want is to mark each output with the dataset it belongs to, so that I can identify the outputs faster.
                  Below is the loop I use; I would welcome suggestions on what else to include to get what I need.
                  Thanks in advance.
                  Juan.

                  Code:
                  cd "C:\statadatasets"
                  local i : dir "C:\statadatasets" files "*.dta"
                  foreach file in `i' {
                      // quote the name so files with spaces still load
                      use "`file'", clear
                      // show any error from fre but keep looping
                      capture noisily fre mt
                  }
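One possible way to tag each table with its source, offered only as a sketch: print the file name with `display` before calling `fre`. The folder and the variable `mt` are taken from the loop above; the header text is an arbitrary choice.

```stata
* Sketch: print the dataset name above each frequency table.
cd "C:\statadatasets"
local files : dir "." files "*.dta"

foreach file of local files {
    display as result _newline "Dataset: `file'"
    use "`file'", clear
    * capture noisily shows the error if mt is missing and moves on
    capture noisily fre mt
}
```

With this, a dataset that lacks `mt` still gets a header, followed by the error message instead of a table.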
