  • Repost: Matching/Partnering in Stata

    Dear all,

    I was advised to restate the question from my earlier post and to give more detail on my data and code.

    What I am trying to do is to match potential partners. So I want to partner individuals who have similar characteristics.

    I use a panel dataset that looks like this for a specific year (here 1985):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input pid syear sex age edu_highest married part_age part_edu_highest     partid 
            1  1985   1   45      2       1         42        3                  2
            2  1985   2   42      3       1         45        2                  1
            3  1985   2   24      3       1         27        4                   .
            4  1985   2   45      2       0         -2        -2                 -2
            5  1985   1   54      1       1         51        2                   .
    end
    label values sex sex
    label def sex 1 "[1] male", modify
    label def sex 2 "[2] female", modify
    In an earlier step I imputed the partner's age (part_age) and the partner's educational level (part_edu_highest) by chained imputation.

    Now I want to find a likely partner in the dataset for each individual who is married in 1985 but has no partner ID (for example, pid 3 or pid 5).
    The matching does not need to be unique, so one individual can serve as the "artificial partner" of more than one individual. It also does not matter whether the potential partner is single or already in a relationship.

    I tried to solve the problem using psmatch2. Since I want to partner individuals of different sex, I thought about doing the partnering for women and men separately. So I generated the variables `gender'_age and `gender'_edu_highest to be able to distinguish between men and women.

    In the next step I tried to run:

    Code:
    psmatch2 treat if syear==2005, mahalanobis (Female_age Female_education) neighbor(1)
    Here treat indicates the sex of the individual, so that a male individual is matched with a female individual.

    But the values of _id and _n1 do not really make sense to me, or at least I cannot identify the matched partners from them.


    Is there perhaps a more straightforward way to handle this kind of matching/partnering in Stata?

    Any help would be appreciated. Thank you very much in advance,
    Toni

  • #2
    If I understand correctly, I don't think that programs like -psmatch2- are relevant. I would approach your problem using -cross-, which creates a data file of all possible pairs, from which you can select the pairs of interest. The following should get you started.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input pid syear sex age edu_highest married part_age part_edu_highest     partid
            1  1985   1   45      2       1         42        3                  2
            2  1985   2   42      3       1         45        2                  1
            3  1985   2   24      3       1         27        4                   .
            4  1985   2   45      2       0         -2        -2                 -2
            5  1985   1   54      1       1         51        2                   .
    end
    label values sex sex
    label def sex 1 "[1] male", modify
    label def sex 2 "[2] female", modify
    //
    // Create a copy of the current file with changed variable names
    // These "alters" will be paired with individuals in the current file
    preserve
    rename * *_alter
    tempfile temp
    save `temp'
    restore
    cross using `temp'
    //  Examine what you have
    list pid pid_alter
    // Everyone is paired with everyone now.  Drop some irrelevant pairs.
    drop if (pid == pid_alter)  // no self self
    // drop any other irrelevant pairs?
    // I don't know what counts as a good match for you, but here's an example
    gen byte match = (syear == syear_alter) & (sex != sex_alter) & (abs(age - age_alter) <= 5)
    Last edited by Mike Lacy; 28 Jun 2018, 08:21.
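
    As a rough sketch of how the -cross- approach could be carried through to an actual assignment: the block below keeps only the individuals who are married but have no partner ID, pairs them with everyone of the other sex in the same year, and keeps the candidate whose own age and education are closest to the imputed partner values. The closeness rule (age gap first, then education gap) and the names age_gap and edu_gap are assumptions made for illustration, not something specified in the thread.

    ```stata
    clear
    input pid syear sex age edu_highest married part_age part_edu_highest partid
    1 1985 1 45 2 1 42 3 2
    2 1985 2 42 3 1 45 2 1
    3 1985 2 24 3 1 27 4 .
    4 1985 2 45 2 0 -2 -2 -2
    5 1985 1 54 1 1 51 2 .
    end

    // Candidate file: everyone, with _alter appended to the variable names
    preserve
    keep pid syear sex age edu_highest
    rename * *_alter
    tempfile temp
    save `temp'
    restore

    // Seekers: married, but without a partner ID
    keep if married == 1 & missing(partid)
    cross using `temp'

    // Keep candidates of the other sex observed in the same year
    keep if syear == syear_alter & sex != sex_alter

    // Gaps between the imputed partner profile and the candidate's own values
    // (assumed closeness rule: age first, then education)
    gen age_gap = abs(part_age - age_alter)
    gen edu_gap = abs(part_edu_highest - edu_highest_alter)

    // Keep the closest candidate per seeker; ties are broken by pid_alter
    bysort pid (age_gap edu_gap pid_alter): keep if _n == 1
    replace partid = pid_alter
    list pid partid age_gap edu_gap
    ```

    With the five example rows this assigns pid 1 to pid 3 and pid 4 to pid 5. The age gaps (18 and 6 years) show that the "closest" candidate in a small file can still be a poor match, so a maximum acceptable gap may be worth adding.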



    • #3
      My understanding is different from Mike's. In particular (if I am correct), what Toni wants is to recover the partner ID in cases where the partner information (age, sex, education) exactly matches the corresponding information of another observation. Several observations may match, in which case the partner ID cannot be determined exactly and is only a "potential" guess.

      The code below lists all potential partner IDs for each observation whose partid is missing.
      Code:
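      // Duplicate every observation; ex == 1 flags the copy
      expand 2, gen(ex)
      // Turn each copy into the profile of that person's partner
      replace sex = 3-sex if ex
      replace age = part_age if ex
      replace edu_highest= part_edu_highest if ex
      
      gen potential=""
      // Within each profile, accumulate the pids of the copies (ex == 1 sorts last)
      bys syear sex age edu_highest (ex): replace potential=string(pid)+" " + potential[_n-1] if ex
      // Pass the accumulated list to real observations whose partid is missing
      bys syear sex age edu_highest (ex): replace potential=potential[_N] if ex==0 & partid==.
      drop if ex
      // One numeric variable per potential partner ID
      split potential, destring
      drop ex potential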



      • #4
        Thank you very much for your help Mike and Romalpa.

        And yes, I want to assign the matched partner's personal ID as an individual's partner ID if that individual's partner information is missing.

        Since I am also thinking about matching on more variables (such as number of children, region, ...), I thought that nearest-neighbor matching (based on the Mahalanobis distance) might be a good approach.
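
        For reference, nearest-neighbor matching on the Mahalanobis distance can also be computed by hand on the pairs produced by -cross-, without -psmatch2-. The following is only a sketch under several assumptions: it uses just age and edu_highest, it estimates the covariance matrix from all five example rows (real data with negative missing codes would need cleaning first), and the names V, Vinv, d_age, d_edu and mdist are made up for illustration. With more matching variables the quadratic form would be easier to build in a loop or in Mata.

        ```stata
        clear
        input pid syear sex age edu_highest married part_age part_edu_highest partid
        1 1985 1 45 2 1 42 3 2
        2 1985 2 42 3 1 45 2 1
        3 1985 2 24 3 1 27 4 .
        4 1985 2 45 2 0 -2 -2 -2
        5 1985 1 54 1 1 51 2 .
        end

        // Covariance matrix of the matching variables and its inverse
        quietly correlate age edu_highest, covariance
        matrix V = r(C)
        matrix Vinv = syminv(V)

        // Candidate file: everyone, with _alter appended to the variable names
        preserve
        keep pid syear sex age edu_highest
        rename * *_alter
        tempfile temp
        save `temp'
        restore

        // Seekers: married, but without a partner ID
        keep if married == 1 & missing(partid)
        cross using `temp'
        keep if syear == syear_alter & sex != sex_alter

        // Differences between the imputed partner profile and each candidate
        gen d_age = part_age - age_alter
        gen d_edu = part_edu_highest - edu_highest_alter

        // Squared Mahalanobis distance d' * Vinv * d, written out for two variables
        gen double mdist = Vinv[1,1]*d_age^2 + 2*Vinv[1,2]*d_age*d_edu + Vinv[2,2]*d_edu^2

        // Nearest neighbor: keep the candidate with the smallest distance
        bysort pid (mdist pid_alter): keep if _n == 1
        replace partid = pid_alter
        list pid partid mdist
        ```

        On the example rows this again pairs pid 3 with pid 1 and pid 5 with pid 4, with squared distances of 6 and 0.875.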
