
  • Appending Files Conditional on File Name

    I have a number of files that are named the following:

    lp1_x0_rf1.dta
    ...
    lp1_x9_rf1.dta
    lp1_x0_rf2.dta
    ...
    lp1_x9_rf2.dta
    ...
    lp1_x9_rf9.dta
    lp2_x0_rf1.dta
    ...
    lp2_x9_rf1.dta
    lp2_x0_rf2.dta
    ...
    lp2_x9_rf2.dta
    ...
    lp2_x9_rf9.dta

    In case the naming convention is not clear: every name begins with lp and then 1 or 2, followed by x and then 0 through 9, and ends with rf and then 0 through 9.

    I would like to append lp1_x0_rf1.dta to lp2_x0_rf1.dta and save it as a new file. I am struggling to write a loop that automates this, appending each lp1 file to each lp2 file conditional on them having the same x and rf values. As of now, I have a loop that appends every lp2* file to each lp1* file.

    loc fileList1 : dir "`root'" files "lp1*"
    foreach file1 in `fileList1' {
        use "`root'/`file1'", clear

        loc fileList2 : dir "`root'" files "lp2*"
        foreach file2 in `fileList2' {
            append using "`root'/`file2'", force
        }

        loc ext = substr("`file1'", 5, 2)
        loc sat = substr("`file1'", 8, 3)
        save "`export'/lp_`ext'_`sat'.dta", replace
    }

    If I could use the -if- qualifier with the -append- command, I feel like the problem would be easy. As it is, I cannot figure out how to get Stata to examine the components of two file names and then append them if those components are equal to each other.

    Thanks for any suggestions!

  • #2
    I believe you simply need the if *command* as opposed to the if *qualifier.* See -help ifcmd-.

    I'm not sure I entirely understand your conditions, but I think you want something like this:
    Code:
    // In your example, it appears the relevant numbers appear at positions 6 and 10 in the file names
    if  (substr("`file1'", 6, 1) == substr("`file2'", 6, 1) )  & (substr("`file1'", 10, 1) == substr("`file2'", 10, 1) )  {
       append .....
    }
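
Putting the pieces together, the full loop might look roughly like the sketch below. It assumes that `root' and `export' are locals holding the input and output folders, as in the original post, and that every file name follows the lp1_x0_rf1.dta pattern exactly, so that the x digit sits at position 6 and the rf digit at position 10. The local name stub is only illustrative.

```stata
// Build both file lists once, outside the loops
local fileList1 : dir "`root'" files "lp1*.dta"
local fileList2 : dir "`root'" files "lp2*.dta"

foreach file1 of local fileList1 {
    foreach file2 of local fileList2 {
        // Append only when the x digit (position 6) and the
        // rf digit (position 10) match in both file names
        if (substr("`file1'", 6, 1) == substr("`file2'", 6, 1)) & ///
           (substr("`file1'", 10, 1) == substr("`file2'", 10, 1)) {
            use "`root'/`file1'", clear
            // force is kept from the original post
            append using "`root'/`file2'", force
            // Characters 5 through 10 give, e.g., x0_rf1
            local stub = substr("`file1'", 5, 6)
            save "`export'/lp_`stub'.dta", replace
        }
    }
}
```

Because the -if- command is evaluated once per pair of file names, not once per observation, each lp1 file is combined only with the lp2 file that shares its x and rf values.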



    • #3
      Is it the case that you have 10 x 9 = 90 pairs of datasets - lp1 and lp2 for each combination of x0 through x9 and rf1 through rf9?

      If so, the following approach might simplify your code.
      Code:
      forvalues i=0/9 {
          forvalues j=1/9 {
              // File names follow the pattern lp1_x0_rf1.dta (no underscore after rf)
              use          "`root'/lp1_x`i'_rf`j'", clear
              append using "`root'/lp2_x`i'_rf`j'", force
              save "`export'/lp_x`i'_rf`j'", replace
          }
      }
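
If some of the 90 combinations might be missing, a variant along these lines could skip incomplete pairs instead of stopping with an error. This is only a sketch: it assumes the file names listed in the original post (for example lp1_x0_rf1.dta) and the same `root' and `export' locals. The locals f1, f2, and rc1 are illustrative names.

```stata
forvalues i = 0/9 {
    forvalues j = 1/9 {
        local f1 "`root'/lp1_x`i'_rf`j'.dta"
        local f2 "`root'/lp2_x`i'_rf`j'.dta"

        // confirm file sets _rc to nonzero when the file is missing;
        // capture keeps the do-file from stopping at that point
        capture confirm file "`f1'"
        local rc1 = _rc
        capture confirm file "`f2'"

        if `rc1' == 0 & _rc == 0 {
            use "`f1'", clear
            append using "`f2'", force
            save "`export'/lp_x`i'_rf`j'.dta", replace
        }
        else {
            display as text "skipping x`i' rf`j': one or both files not found"
        }
    }
}
```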
      Last edited by William Lisowski; 13 Jul 2019, 09:35.



      • #4
        Thank you Mike, that worked like a charm!

