  • question about foreach/forvalues

    Hi everybody

    I have 100 datasets called database1, database2, database3, ..., and I have to do many merges: merge database1 with database2 and save; merge database2 with database3 and save; and so on for all datasets. In the end I must have 99 new databases. I can't program foreach or forvalues correctly for the task.


    thanks

  • #2
    Something like this might work.

    Code:
    forvalues i=1/99 {
        use "database`i'", clear
        //This line needs to be updated for your situation
        merge 1:1 varlist using "database`=`i'+1'"
        save "newdatabase`i'"
    }
    Unfortunately, I don't know what type of merge you want to do or what variables you want to merge on, so the merge command line definitely needs to be updated.
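
    As a rough sketch of how the pairwise version might look once the merge line is filled in: the variable names person_id and wage below are assumptions (hypothetical names, not confirmed anywhere in this thread), as is the idea that each file has one row per person. Because -merge- keeps the master's values for variables that exist in both files, the next month's wage is renamed before merging so that both months survive side by side.

    ```stata
    forvalues i = 1/99 {
        local next = `i' + 1

        // Load the following month and rename its wage variable
        // (wage is a hypothetical variable name)
        use "database`next'", clear
        rename wage wage_next
        tempfile nextmonth
        save `nextmonth', replace

        // Merge the current month with the following month
        // (person_id is assumed to uniquely identify rows in each file)
        use "database`i'", clear
        merge 1:1 person_id using `nextmonth', keepusing(wage_next)

        // Month-to-month change; missing for people not present in both files
        gen wage_change = wage_next - wage
        save "newdatabase`i'", replace
    }
    ```

    The _merge variable is left in each saved file so you can see who appears in only one of the two months.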


    • #3
      Thank you very much, Roger, it is what I needed. The 100 databases contain people's monthly wages, and I am trying to measure monthly wage changes.


      • #4
        I think #2 is not quite right. I don't think it accomplishes Jimmy Esc's goal: he will end up with 99 databases, one of which is the combined contents of databases 1 and 2, the next combines databases 2 and 3, the next databases 3 and 4, etc. I understood him to want a single database combining all 100 databases. So the code would be something like:

        Code:
        clear
        tempfile building
        save `building', emptyok
        use database_1, clear
        
        forvalues i = 2/100 {
            merge 1:1 person_id using database_`i'
            // PERHAPS SOME CODE IN HERE TO RENAME VARIABLES OR MAKE
            // OTHER NEEDED ADJUSTMENTS, ETC.
            drop _merge
            save `"`building'"', replace
        }
        
        use `building', clear
        // ANY FINAL DATA CLEANING HERE
        save my_combined_data_file, replace
        Obviously modify filenames and variable names as needed to match your data.

        That said, it is not obvious to me that -merge-ing these data sets is the best way to go. If each data set contains a month's data and the goal is to study monthly wage changes, you would be best off working with a file that is in long layout, where each person has a separate observation for each month. -merge- will not get you there: it produces a wide layout. So to get a long layout you want to -append- rather than -merge- in the loop:

        Code:
        clear
        tempfile building
        save `building', emptyok
        
        forvalues i = 1/100 {
            use database_`i', clear
            // WHATEVER CODE NEEDED TO ADJUST VARIABLE NAMES
            // OR DO OTHER CLEANING/MODIFICATIONS.
            append using `building'
            save `"`building'"', replace
        }
        
        // FROM THIS POINT ON, THE CODE IS THE SAME AS SHOWN IN THE PREVIOUS CODE BLOCK
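
        To illustrate why the long layout is convenient for wage changes, here is a minimal sketch. It assumes the file number is the month index, that person_id is numeric and unique within each month, and that the wage variable is called wage; month, wage, and wage_change are all hypothetical names chosen for illustration.

        ```stata
        clear
        tempfile building
        save `building', emptyok

        forvalues i = 1/100 {
            use database_`i', clear
            // Assumption: the file number is the month index
            gen int month = `i'
            append using `building'
            save `"`building'"', replace
        }

        use `building', clear
        // Declare the panel structure: one observation per person per month
        xtset person_id month
        // D. is the first-difference operator: wage minus the previous month's wage
        // (wage is a hypothetical variable name)
        gen wage_change = D.wage
        ```

        With the data -xtset-, wage_change is missing for each person's first month and after any gap in their months, which is usually what you want.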
