Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create dummy variable based on other dataset

    Dear all,

    I have three datasets containing, among other, the following variables:
    - dataset_1: id_1, which is different for every obs
    - dataset_2: id_2, which is different for every obs, cat_2 which can be a, b or c, and date_2 which is a date
    - dataset_3: id_1 and id_2, with repeated values, although not covering every value of id_1 and id_2 from the other datasets.

    In dataset_1 I want to create a dummy variable that equals 1 when there is an observation in dataset_3 that links the obs in dataset_1 to an obs in dataset_2 (as in, for example, for the third obs in dataset_3, id_1=3 and id_2=7, so those observations are linked) and for the linked obs in dataset_2 it is true that: cat_2 = a AND date_2 is after a fixed date (which is fixed for the entire dataset_1).

    After that, I will also create other dummies based on the constellation of cat_1 and date_1.

    Thank you in advance for your help.

    Best regards,
    Marco

  • #2
    So something like this:

    Code:
    use dataset_1
    gen is_match= .
    merge 1:m id_1 using dataset_3, keep(master match) keepusing(id_2)
    replace is_macth= 0 if _merge == 1
    drop _merge
    merge m:1 id_2 using dataset_2, keep(master match) keepusing(cat_2 date_2)
    replace is_match= (_merge == 3)  if missing(is_match) 
    drop _merge
    replace is_match = 0 if cat_2 != "a" | date_2 < fixed_date
    by id_1, sort: egen has_match = max(is_match)
    drop is_match
    by id_1 (is_match): keep if _n == _N
    Notes: This will do what you asked for. I worry that it is not really what you want. What if there is more than one observation in dataset_2 that fulfills your criteria? All you have from this code is an indication that a match can or cannot be found--but you will not know how many. Moreover, the values of cat_2 and date_2 that are left behind after this code would represent only one of those observations, and an arbitrary one at that.

    Comment


    • #3
      Thank you Clyde, I think that's exactly what I want to do, still thinking around a bit though

      Comment

      Working...
      X