Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Renaming the resulting _merge variable to include master and using files' names when merging files using loop

    Hello Statalist Members,
    Hope I can get some help, been trying to no avail.

    Problem:
    1. Need to merge 10 dta files (fileOne.dta, fileTwo.dta, ... fileTen.dta) to become 1 file
    2. Keeping the resulting _merge, renaming _merge to include merged file names, and then dummying the renamed _merge variable--to generate 3 additional dummy variables.

    the first merge 1:1, merges fileone.dta and filetwo.dta
    _merge will be renamed: _mergeOneTwo
    3 dummy variables: _mergeOneTwo1 (value is 1 if _merge==1), _mergeOneTwo2 (value is 1 if _merge==2), _mergeOneTwo3 (value is 1 if _merge==3)

    the second merge 1:1,merges the result of the first merge with filethree.dta
    the resulting _merge variable will need to be renamed _mergeOneTwoThree
    3 more dummy variables _mergeOneTwoThree1 (is 1 if _merge==1), _mergeOneTwoThree2 (is 1 if _merge==2), _mergeOneTwoThree3 (is 1 if _merge==3)

    and so on...

    Questions:
    1. How to rename _merge to include master & using file names within a loop ?
    2. How to dummy resulting renamed _merge within a loop?


    What I tried (and failed):

    the variable for matching among files: CheeseBrand

    Code:
    cd "C:\Users\HP\Desktop\doFileDraft"
    fs *.dta
    local filenames: dir "." files "*.dta"
    
    local first_file: word 1 of `filenames'
    local filenames: list filenames - first_file
    local fish : subinstr local file ".dta" ""
    
    
    use "`first_file'", clear
    tempfile building
    save `building'
    
    foreach f of local filenames {
    use `f', clear
    
    merge 1:1 CheeseBrand using `building'
    rename _merge _M`fish'
    tabulate _M`building'`fish', generate (_M`building'`fish')
    
    save `"`building'"', replace
    }
    Error message/s encountered:

    variable _MC not found
    r(111);

    end of do-file

    r(111);

    .
    I welcome any correction/ suggestion or redirection to accomplish this. Thank you

  • #2
    Your code is too cryptic as it defines too many locals. Also, -tab, gen()- will only create indicators for the observed levels of _merge, and how these are numbered may not be consistent if different levels of _merge are observed across combined datasets.

    Code:
    cd "C:\Users\HP\Desktop\doFileDraft"
    macro drop _all
    local filenames: dir "." files "*.dta"
    local filenames = ustrregexra(`"`filenames'"', "\b\.dta\b", "")
    tokenize `"`filenames'"'
    use `1', clear
    save file, replace
    forval i=2/`=wordcount(`"`filenames'"')'{
        local j= `i'-1
        use file, clear
        merge 1:1 CheeseBrand  using ``i''
        rename _merge _M``i''``j''
        forval k=1/3{
            gen _M``i''``j''`k'= _M``i''``j''==`k'
        }
        save file, replace
    }

    Comment


    • #3
      Woah, thank you Mr. Musau. Works like a next level magic !
      Do you mind telling what's going on a bit around forvalue i,j,k ?
      Many many thanks!

      Comment


      • #4
        From

        Code:
        help tokenize
        we have

        Description

        tokenize divides string into tokens, storing the result in `1', `2', ... (the positional local macros). Tokens are determined based on the parsing characters
        pchars, which default to a space if not specified.

        So I loop through the elements in the local "filenames". As these are stored in `1', `2', ..., following tokenize, \(i\) indexes the current element in the local filenames and \(j\) indexes the previous element. For \(k\), the variable _merge from

        Code:
        help merge
        has the levels

        1 master observation appeared in master only
        2 using observation appeared in using only
        3 match observation appeared in both

        so I create indicators for each of these levels.

        Comment


        • #5
          Thank you very much !

          Comment

          Working...
          X