Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Syntax for using and merging multiple tempfiles

    I have been given a dataset (it is a translation table) to use that is set up in a fairly disorganized fashion. It has multiple variables that contain the same type of IDs (although the values in each vary slightly). I need to merge all of these together into one variable, while keeping the other variables. When I try to use the tempfiles I have saved I receive an invalid file specification error, I'm guessing this is just a syntax error on my part, but I have been unable to find a fix through googling so far. My code:

    tempfile baseDict
    save "`baseDict'"

    keep Main_ID
    tempfile MainID
    save "`MainID'"

    foreach year in 2012/2016 {
    use `baseDict'
    keep *_`year'
    rename ID_`year' Main_ID
    tempfile yearIDs
    save "`yearIDs'"

    use `MainID'
    merge 1:1 Main_ID using "`yearIDs'"
    save "`MainID'"
    }

    Set trace seems to show the error ocurring right as the first use `baseDict' runs in the for loop. Any help is quite appreciated, thank you!

  • #2
    Hi Nate,

    it is hard to follow your description without a proper data example. Please have a look at the FAQ (also linked at the top of each page here) to learn how to create and show example data using dataex.

    Anyways, I assume a data layout like this:
    Code:
    * set up sample data
    clear
    input Main_ID ID_2012 dataA_2012 dataB_2012 ID_2013 dataA_2013 dataB_2013
    1 4 41 42 2 23 24
    2 2 21 22 4 43 44
    3 1 11 12 3 33 34
    4 3 31 32 1 13 14
    end
    list
    Your code is quite close to what I would come up with to deal with that structure, however, it has a central flaw: You should use forvalues, not foreach, to iterate over a range of (year) integer numbers. I think it also would be more efficient to run the loop twice: First to create the year-based datasets, second to merge them all together. But this workflow might be a matter of style. Here's a modified version of your code that works for my made-up sample data:

    Code:
    * set up sample data
    clear
    input Main_ID ID_2012 dataA_2012 dataB_2012 ID_2013 dataA_2013 dataB_2013
    1 4 41 42 2 23 24
    2 2 21 22 4 43 44
    3 1 11 12 3 33 34
    4 3 31 32 1 13 14
    end
    list
    tempfile baseDict
    save "`baseDict'"
    
    // create main id dataset
    keep Main_ID
    tempfile MainID
    save "`MainID'"
    
    // create yearly datasets
    forvalues year=2012/2013 {
        use *_`year' using "`baseDict'"
        rename ID_`year' Main_ID
        tempfile `year'IDs
        save "``year'IDs'"
    }
    
    // open main id dataset, merge yearly datasets one after another
    use "`MainID'"
    forvalues year=2012/2013 {
        merge 1:1 Main_ID using "``year'IDs'" , nogenerate
    }
    
    // show result
    list
    Note that I modified the year range in the loop, as my sample data only contains information for 2012 and 2013. Hope this helps!

    Regards
    Bela

    Comment


    • #3
      Let me point out one subtle difference between the code in post #1 and the code in post #2. From post #1
      Code:
      save "`baseDict'"
      ...
      use `baseDict'
      and from post #2
      Code:
      save "`baseDict'"
      ...
      use *_`year' using "`baseDict'"
      Notice that post #2 includes quotation marks around the temporary dataset name in the use command. I'm guessing that the path to the directory where Stata creates temporary files includes a space, and the path must thus be surrounded in quotation marks so that the embedded space is not take to be the end of the path.

      This explains the narrow question of why the failure of the use command reported in post #1 occurred. It takes nothing away from the other advice given in post #2. In particular, consider
      Code:
      . foreach year in 2012/2016 {
        2. display "year is `year'"
        3. }
      year is 2012/2016
      
      . forvalues year = 2012/2016 {
        2. display "year is `year'"
        3. }
      year is 2012
      year is 2013
      year is 2014
      year is 2015
      year is 2016
      
      .

      Comment


      • #4
        Thank you both for your help!

        Comment

        Working...
        X