Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining specific observations

    Hello everyone,

    I have a panel dataset containing trade data for 85 countries, across 16 years with around 12-15 variables (continuous and indicator). I wanted to combine observations for 10 specific countries (sum continuous variables' data, take the max value of the indicator variables) so as to create an observation for the group. For example, let A, B, X and Y be the sample of countries. Let A and B be the countries for which I need to merge data to create a country M. Presently, I have data which looks like this:

    Code:
    input year str2 exporter str2 importer exports contiguity
    2000 "A" "B" 10000 0
    2000 "A" "X" 12000 1
    2000 "A" "Y" 15000 0
    2000 "A" "A" . 0
    2000 "B" "X" 45000 0
    2000 "B" "Y" 34500 0
    2000 "B" "A" 35600 0
    2000 "B" "B" . 0
    2000 "X" "A" 57600 1
    2000 "X" "B" 25600 0
    2000 "X" "X" . 0
    2000 "X" "Y" 68700 0
    2000 "Y" "X" 23400 0
    2000 "Y" "Y" . 0
    2000 "Y" "A" 78700 0
    2000 "Y" "B" 76400 0
    end
    This is just for one year, there is data for 15 more years.
    I want it to look like this:
    Year Exporter Importer Export value Contiguity
    2000 M X sum of A to X & B to X (i.e. 12000+45000) 1
    2000 M Y sum of A to Y & B to Y 0
    2000 M M sum of A to B & B to A 0
    2000 X M sum of X to A & X to B 1
    2000 Y M sum of Y to A & Y to B 0
    I experimented with collapse and egen, but couldn't figure out a way. Does anyone have any ideas about what I could try?

    Many thanks,
    Meemansa

  • #2
    Thanks for providing an easy to work with data example (for those who might want to do the same, see dataex from SSC).

    There are several ways to approach this. I recommend building a mapping of countries you want to group and use merge to update the data. It's a bit tedious but is more flexible if you have many countries to combine. Once the exporter importer codes have been updated, you just group observations and calculate what you need.

    Code:
    clear
    input year str2 exporter str2 importer exports contiguity
    2000 "A" "B" 10000 0
    2000 "A" "X" 12000 1
    2000 "A" "Y" 15000 0
    2000 "A" "A" . 0
    2000 "B" "X" 45000 0
    2000 "B" "Y" 34500 0
    2000 "B" "A" 35600 0
    2000 "B" "B" . 0
    2000 "X" "A" 57600 1
    2000 "X" "B" 25600 0
    2000 "X" "X" . 0
    2000 "X" "Y" 68700 0
    2000 "Y" "X" 23400 0
    2000 "Y" "Y" . 0
    2000 "Y" "A" 78700 0
    2000 "Y" "B" 76400 0
    end
    tempfile main
    save "`main'"
    
    * make a list of countries to merge
    clear
    input str1 (oldc newc)
    A M
    B M
    end
    isid oldc
    tempfile cmapping
    save "`cmapping'"
    
    * update the exporter country using this mapping 
    rename oldc exporter
    merge 1:m exporter using "`main'", keep(match using) nogen
    replace exporter = newc if !mi(newc)
    drop newc
    save "`main'", replace
    
    * update the importer country using the mapping 
    use "`cmapping'"
    rename oldc importer
    merge 1:m importer using "`main'", keep(match using) nogen
    replace importer = newc if !mi(newc)
    drop newc
    
    * Now you can group observations by the new codes. 
    order year exporter importer
    bysort year exporter importer: egen sexports = total(exports)
    bysort year exporter importer: egen scont = total(contiguity)
    list, sepby(year exporter importer)
    
    * You can also use collapse
    collapse (sum) exports contiguity, by(year exporter importer)
    list

    Comment


    • #3
      This helped a lot. Thank you very much!

      Comment

      Working...
      X