  • Appending initial and final year files for each country when periods differ (and execution is remote)

    I have remote-execution access to files named ccyy (cc for country, yy for year). When every country covers the same period (aa01 aa10 ab01 ab10 …), I append each country's initial and final files using this code:
    foreach cc in aa ab {
    use $`cc'01, clear
    append using $`cc'10
    }
    But I do not know what to do when the period is not common to all countries (aa01 aa10 ab00 ab09 …).


  • #2
    Assuming all files are in the same directory, the following should do as you wish. I wasn't sure if you have only two files per group (the first and last years) in that directory, so I used a more general solution than would be needed if you only have two, but it will work with two or more.

    The -list sort- macro list function sorts the files in ascending order, but this works only if all your two-digit years fall in the 21st century. If you have aa98, aa99, etc., these will be sorted after aa00.


    Code:
    cd "c:\directory\with\data"                  //set the current directory as the one with the data
    
    foreach cc in aa ab ac {
        clear
        local file_list : dir . files "`cc'*"       //get all files in current directory that start with the prefix
        local file_list : list sort file_list       //sort those files in ascending ASCII order
        tokenize `"`file_list'"'                    //tokenize the list (numbered macros 1 to N)
        noi di as result "`1'"
        use "`1'"                                   //use the first file in the ordered list
        local last_file: list sizeof file_list      //count the files to get the number of the last one
        noi di as result "``last_file''"
        append using "``last_file''"                //append using the last file
        save "name of the file.dta"                 //placeholder name; make it vary by prefix so each pass is kept
        }
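    The two-digit-year caveat above can be worked around by sorting on a four-digit year instead. This is only a minimal sketch: it assumes a two-letter prefix followed by a two-digit year, and an assumed pivot of 50 (years 50 to 99 are read as 19yy, 00 to 49 as 20yy). The file names and the pivot are illustrative and not taken from the data.

    ```stata
    local files aa98 aa99 aa00 aa09             // hypothetical file names for one prefix
    local keyed
    foreach f of local files {
        local yy = real(substr("`f'", 3, 2))    // two-digit year after the two-letter prefix
        local yyyy = cond(`yy' >= 50, 1900 + `yy', 2000 + `yy')   // assumed pivot at 50
        local keyed `keyed' `yyyy'`f'           // prepend the four-digit year as a sort key
    }
    local keyed : list sort keyed               // sorts on the four-digit key
    local nk : list sizeof keyed
    local firstkey : word 1 of `keyed'
    local lastkey  : word `nk' of `keyed'
    local first = substr("`firstkey'", 5, .)    // drop the four-digit key again
    local last  = substr("`lastkey'", 5, .)
    display as result "first: `first'  last: `last'"
    ```

    With these names, `first' holds aa98 and `last' holds aa09, which could then replace "`1'" and "``last_file''" in the loop.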
    Last edited by Carole J. Wilson; 23 Sep 2018, 21:15.
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      Thanks, Carole, for the code and explanations.
      But because of the server's remote-execution rules, I cannot work with directories or download files; I can only send code and receive results.

      Perhaps my question can be simplified as follows:
      given the files a b c d …, how do I code:
      use a
      append using b
      use c
      append using d



      • #4
        This should work:

        Code:
        clear
        local file_list : dir . files "*"       //get all files in current directory 
        local file_list : list sort file_list       //sort those files in ascending ASCII order
        tokenize `"`file_list'"'                    //tokenize the list (numbered macros 1 to N)
        local last_file: list sizeof file_list      //count the files to get the number of the last one
        use "`1'"                                   //use the first file in the ordered list
        forval i=2/`last_file' { 
            append using "``i''"                //append files 2 thru N
            }


        • #5
          Thanks again, Carole, but I am afraid the last code appends files 2 through N to file 1, whereas my task is to append file 2 to file 1, file 4 to file 3, and so on.



          • #6
            In that case, the code in #2 should work (just ignore the first command to change the directory since you will already be in the directory).


            • #7
              Thanks again for your help, Carole.
              Yes, I was wrong: #2 does not append everything together but works by pairs.
              However, the following code (with a few files as an example)

              local i a1 a2 b3 b4
              tokenize `"`i'"'
              use "`1'"
              forval i=2/4 {
              append using "``i''"
              }

              It appends a2 to a1, b3 to a1, and b4 to a1, but what I need is to append a2 to a1 and b4 to b3 (two years for each country).
              I have been reading about double loops, foreach and for, nested and parallel, but …



              • #8
                The code in #2 assumes you have a list of files in your directory like this:
                Code:
                . dir
                  <dir>   9/24/18 20:28  .                 
                  <dir>   9/24/18 20:28  ..                
                   6.3k   7/25/16 10:23  aa01.dta          
                   6.3k   7/25/16 10:23  aa02.dta          
                   6.3k   7/25/16 10:23  aa03.dta          
                   6.3k   7/25/16 10:23  aa10.dta          
                   6.3k   7/25/16 10:23  ab00.dta          
                   6.3k   7/25/16 10:23  ab01.dta          
                   6.3k   7/25/16 10:23  ab09.dta          
                   6.3k   7/25/16 10:23  ac01.dta          
                   6.3k   7/25/16 10:23  ac08.dta
                The first pass of the loop takes all files with the prefix "aa" (files aa01.dta through aa10.dta), stores them in `file_list', and tokenizes them into the macros `1', `2', and so on up to N. Running -use "`1'"- and then -append using "``last_file''"- combines the first and last of those files. (You do need to change the name of the saved file to something that varies, so it isn't overwritten on each pass.) It will also work if you have only two files with each prefix.

                When I run it, I get the following output:
                Code:
                aa01.dta
                (1978 Automobile Data)
                aa10.dta
                (label origin already defined)
                file name of the file.dta saved
                ab00.dta
                (1978 Automobile Data)
                ab09.dta
                (label origin already defined)
                file name of the file.dta saved
                ac01.dta
                (1978 Automobile Data)
                ac08.dta
                (label origin already defined)
                file name of the file.dta saved

                If you are providing the full list of files in a macro, then that's a different solution:
                Code:
                local i a1 a2 b3 b4
                local nfiles: list sizeof i
                local end=`nfiles'-1
                local j=1
                while `j'<=`end' {
                    clear
                    local first: word `j' of `i'
                    use "`first'"
                    local ++j
                    local second: word `j' of `i'
                    append using "`second'"
                    noi di as result "File `second' has been appended to file `first'"
                    save "`first'_`second'.dta", replace
                    local ++j
                    }
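                The same pairing can also be written with -forvalues- stepping by 2, which avoids incrementing the counter by hand. This is only a sketch under the same assumptions as the while loop above: the list holds an even number of existing files, ordered as first/last pairs. The macro names files, n, last, and k and the output name pair_`j'.dta are made up for illustration.

                ```stata
                local files a1 a2 b3 b4                 // same illustrative names as above
                local n : list sizeof files
                local last = `n' - 1                    // last position that can start a pair
                forvalues j = 1(2)`last' {
                    local k = `j' + 1
                    local first  : word `j' of `files'  // odd positions: first file of a pair
                    local second : word `k' of `files'  // even positions: file to append
                    use "`first'", clear
                    append using "`second'"
                    display as result "File `second' has been appended to file `first'"
                    save "pair_`j'.dta", replace        // hypothetical name, one file per pair
                }
                ```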


                • #9
                  First of all, I apologize for not simply running your code and discussing its results. I cannot do that, because the code is executed remotely by the owner of the data, and they abort any code that contains commands such as list, cd, dir ... That is why I take up your time with my own versions.

                  Second, the database's "file names" ($uk99h, $us00p, etc.) are actually global macros that hold the full path of the data set I have to work with. There are more than two files with each prefix, but for this job I want to use just two files per prefix (and the two suffixes are not the same for every prefix).

                  Thus, for example, if I send your last code (adapted)

                  local i $au01h $au10h $at04h $at13h
                  local end=4-1
                  local j=1
                  while `j'<=`end' {
                  clear
                  local first: word `j' of `i'
                  use "`first'"
                  local ++j
                  local second: word `j' of `i'
                  append using "`second'"
                  noi di as result "File `second' has been appended to file `first'"
                  save "`first'_`second'.dta", replace
                  local ++j
                  }


                  I get the following answer

                  (au01: version 7.0 6 Nov 2014 12:13)
                  (some notes about accommodate variables using data's values)
                  (some information about labels already defined)
                  File /media/share/lisdata/stata/au10ih.dta has been appended to file /media/share/lisdata/stata/au01ih.dta
                  (note:file /media/share/lisdata/stata/au01ih.dta_/media/share/lisdata/stata/au10ih.dta.dta not found)
                  file /media/share/lisdata/stata/au01ih.dta_/media/share/lisdata/stata/au10ih.dta.dta could not be opened
                  r(603);

                  I tried removing things, such as saving without .dta, but it still appends only the first pair of files.



                  • #10
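                    At least one problem has to do with the path names and extensions in the final saved file: `first' and `second' expand to full paths ending in .dta, so the save command is asked to write to a file name that contains two whole paths. The easiest solution is simply to change the save command to something less verbose, even if less meaningful. If you need your path, you can hard-code it.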

                    Code:
                    local i $au01h $au10h $at04h $at13h 
                    local end=4-1
                    local j=1
                    while `j'<=`end' {
                      clear
                      local first: word `j' of `i'
                      use "`first'"
                      local ++j
                      local second: word `j' of `i'
                      append using "`second'"
                      noi di as result "File `second' has been appended to file `first'"
                      save "newfile_`j'.dta", replace
                      local ++j
                     }
                    If the paths differ by prefix and you need the original file names included in the saved file, then that would take some work to strip away the path and extension then recreate a new file name.
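                    A rough sketch of that stripping step, using only string functions. It assumes forward-slash paths ending in .dta, as in the output shown in #9, and a Stata version that provides strrpos(). The macro names p, b, first_base, second_base, and newname are made up for illustration, and whether the remote server allows this is not something the thread confirms.

                    ```stata
                    local first  "/media/share/lisdata/stata/au01ih.dta"
                    local second "/media/share/lisdata/stata/au10ih.dta"
                    foreach m in first second {
                        local p "``m''"                                  // full path held in `first' or `second'
                        local b = substr("`p'", strrpos("`p'", "/") + 1, .)   // keep what follows the last slash
                        local b : subinstr local b ".dta" ""             // drop the extension
                        local `m'_base "`b'"
                    }
                    local newname "`first_base'_`second_base'.dta"
                    display as result "`newname'"
                    ```

                    Inside the loop, -save "`newname'", replace- would then write a file named after both originals, without their paths.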



                    • #11
                      Thank you very much, Carole, for both giving me a fish and teaching me how to fish.

                      (au01: version 7.0 6 Nov 2014 12:13)
                      (notes
                      to accommodate using data's values)
                      (labels … already defined)
                      File /media/share/lisdata/stata/au10ih.dta has been appended to file /media/share/lisdata/stata/au01ih.dta
                      file newfile_2.dta saved
                      (at04: version 7.0 13 Mar 2014 08:48)
                      (notes
                      to accommodate using data's values)
                      (labels … already defined)

                      File /media/share/lisdata/stata/at13ih.dta has been appended to file /media/share/lisdata/stata/at04ih.dta
                      file newfile_4.dta saved

