Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Expanding datasets to all possible combinations of a variable

    Hi,

    I have a dataset containing 2 variables: id (unique identifier) and postal code. The table below gives an impression of the data:
    id postal code
    1 1111
    2 1112
    3 1113
    4 1114
    5 1115
    I would like to create a new expanded dataset which has every possible combination of id's and their respective postal codes. So, I would like to get something like this:
    fromid toid frompostalcode topostalcode
    1 2 1111 1112
    1 3 1111 1113
    1 4 1111 1114
    1 5 1111 1115
    2 1 1112 1111
    2 3 1112 1113
    2 4 1112 1114
    2 5 1112 1115
    3 1 1113 1111
    3 2 1113 1112
    3 4 1113 1114
    3 5 1113 1115
    Any ideas on how to do this? I did see some related questions, but in those instances people usually asked questions about pairwise combining datasets, or pairwise combining different variables, hence I could not figure out how to solve my problem.

    Thanks in advance.


  • #2
    Edited

    Here is one way

    Code:
    clear
    input id postal_code
    1 1111
    2 1112
    3 1113
    4 1114
    5 1115
    end
    
    tempfile tmp
    save "`tmp'"
    rename (id postal_code) (from_=)
    expand _N
    bysort from_postal_code : generate id = _n
    merge m:1 id using "`tmp'" , nogenerate
    rename (id postal_code) (to_id to_postal_code)
    drop if from_id==to_id
    
    order from_id to_id from_postal_code to_postal_code
    sort from_id to_id from_postal_code to_postal_code
    list , sepby(from_id)
    Best
    Daniel
    Last edited by daniel klein; 18 Apr 2019, 09:57. Reason: messed up the first time

    Comment


    • #3
      Hi Daniel,

      Thanks a lot! This was exactly what I was looking for! Works perfect.

      Best,
      Matthijs

      Comment

      Working...
      X