Repost: Matching/Partnering in Stata

Toni Moreno

Join Date: Jun 2018

Posts: 19
#1

28 Jun 2018, 02:32

Dear all,

I was advised to state the question from my earlier post more precisely and to give more detail on my data and code.

What I am trying to do is match potential partners, that is, pair individuals who have similar characteristics.

I use a panel dataset which, for a specific year (here 1985), looks like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input pid syear sex age edu_highest married part_age part_edu_highest partid
1 1985 1 45 2 1 42  3  2
2 1985 2 42 3 1 45  2  1
3 1985 2 24 3 1 27  4  .
4 1985 2 45 2 0 -2 -2 -2
5 1985 1 54 1 1 51  2  .
end
label values sex sex
label def sex 1 "[1] male", modify
label def sex 2 "[2] female", modify

In an earlier step I imputed the partner's age (part_age) and the partner's educational level (part_edu_highest) by chained imputation.

Now I want to find a likely partner in the dataset for each individual who is married in 1985 but has no partner ID (for example pid 3 or pid 5).
The matching does not need to be unique, so one individual can be assigned as an "artificial partner" to more than one individual. It also does not matter whether the potential partner is single or already in a relationship.

I tried to solve the problem using psmatch2. Since I want to partner individuals of different sex, I thought about doing the partnering for women and men separately. So I generated the variables `gender'_age and `gender'_edu_highest to be able to differentiate between men and women.

In the next step I tried to run:

Code:

psmatch2 treat if syear==2005, mahalanobis(Female_age Female_education) neighbor(1)

Here treat indicates the sex of the individual, so that a male individual is matched with a female individual.

But the values of _id and _n1 do not really make sense to me, and I cannot identify the matched partners.

Is there perhaps a more straightforward way to handle this kind of matching/partnering in Stata?

Any help would be appreciated. Thank you very much in advance,
Toni

Mike Lacy

Join Date: Apr 2014
Posts: 2411

28 Jun 2018, 08:16

If I understand correctly, I don't think that programs like -psmatch2- are relevant. I would approach your problem using -cross-, which creates a data file of all possible pairs, from which you can select the pairs of interest. The following should get you started.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input pid syear sex age edu_highest married part_age part_edu_highest     partid
        1  1985   1   45      2       1         42        3                  2
        2  1985   2   42      3       1         45        2                  1
        3  1985   2   24      3       1         27        4                   .
        4  1985   2   45      2       0         -2        -2                 -2
        5  1985   1   54      1       1         51        2                   .
end
label values sex sex
label def sex 1 "[1] male", modify
label def sex 2 "[2] female", modify
//
// Create a copy of the current file with changed variable names
// These "alters" will be paired with individuals in the current file
preserve
rename * *_alter
tempfile temp
save `temp'
restore
cross using `temp'
//  Examine what you have
list pid pid_alter
// Everyone is paired with everyone now.  Drop some irrelevant pairs.
drop if (pid == pid_alter)  // no self self
// drop any other irrelevant pairs?
// I don't know what counts as a good match for you, but here's an example
gen byte match = (syear == syear_alter) & (sex != sex_alter) & (abs(age - age_alter) <= 5)
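
As one possible next step, here is a minimal sketch that picks, for each married person without a partner ID, the opposite-sex alter whose actual age and education are closest to the imputed partner values. The distance rule (sum of absolute gaps), the variable names dist and partid_guess, and the tempfile name alters are my own assumptions for illustration, not something from the original question. It rebuilds the example data so that it runs on its own.

```stata
clear
input pid syear sex age edu_highest married part_age part_edu_highest partid
1 1985 1 45 2 1 42  3  2
2 1985 2 42 3 1 45  2  1
3 1985 2 24 3 1 27  4  .
4 1985 2 45 2 0 -2 -2 -2
5 1985 1 54 1 1 51  2  .
end
// Build the file of "alters" and form all pairs, as above
preserve
rename * *_alter
tempfile alters
save `alters'
restore
cross using `alters'
// Keep married persons without a partner ID, paired with
// opposite-sex alters observed in the same year
keep if married == 1 & missing(partid) & syear == syear_alter & sex != sex_alter
// Assumed distance: gap between the imputed partner profile and the alter's own values
gen double dist = abs(part_age - age_alter) + abs(part_edu_highest - edu_highest_alter)
// Keep the closest alter per person; ties go to the lower pid_alter
bysort pid (dist pid_alter): keep if _n == 1
rename pid_alter partid_guess
list pid part_age part_edu_highest partid_guess age_alter edu_highest_alter dist
```

With the five example rows, this pairs pid 3 with pid 1 and pid 5 with pid 4. In the real data, negative codes such as -2 in part_age or part_edu_highest would have to be recoded to missing first, and age and education would probably need rescaling before being summed.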

Last edited by Mike Lacy; 28 Jun 2018, 08:21.


Romalpa Akzo

Join Date: Oct 2017

Posts: 369
#3

29 Jun 2018, 02:21

My understanding differs from Mike's. If I am reading it correctly, Toni wants to recover the partner ID by finding observations whose own characteristics (age, sex, education) exactly match the partner information recorded for the observation in question. Several observations may match, in which case the partner ID cannot be determined exactly and is only a "potential" guess.

The code below lists all potential partner IDs for each observation whose partid is missing.

Code:

expand 2, gen(ex)
replace sex = 3-sex if ex
replace age = part_age if ex
replace edu_highest= part_edu_highest if ex
gen potential=""
bys syear sex age edu_highest (ex): replace potential=string(pid)+" " + potential[_n-1] if ex
bys syear sex age edu_highest (ex): replace potential=potential[_N] if ex==0 & partid==.
drop if ex
split potential, destring
drop ex potential
Toni Moreno

Join Date: Jun 2018

Posts: 19
#4

29 Jun 2018, 03:08

Thank you very much for your help, Mike and Romalpa.

And yes, I want to assign the matched partner's personal ID as an individual's partner ID when that individual's partner information is missing.

Since I am also thinking about including more variables to match on (such as number of children, region, ...), I thought that nearest-neighbor matching based on the Mahalanobis distance might be a good approach.
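
A minimal sketch of that idea on the example data, using -cross- rather than psmatch2: it computes the squared Mahalanobis distance between the imputed partner profile and each opposite-sex candidate, then keeps the nearest one. This is only an assumption about how it could be set up. The covariance matrix is estimated from age and edu_highest pooled over all persons, and the names C, Sinv, d_age, d_edu, mdist2 and alters are made up for illustration. More matching variables would add more terms to the quadratic form.

```stata
clear
input pid syear sex age edu_highest married part_age part_edu_highest partid
1 1985 1 45 2 1 42  3  2
2 1985 2 42 3 1 45  2  1
3 1985 2 24 3 1 27  4  .
4 1985 2 45 2 0 -2 -2 -2
5 1985 1 54 1 1 51  2  .
end
// Covariance matrix of the matching variables (assumption: pooled over both sexes)
quietly correlate age edu_highest, covariance
matrix C = r(C)
matrix Sinv = invsym(C)
// Candidate partners ("alters") with renamed variables
preserve
keep pid syear sex age edu_highest
rename * *_alter
tempfile alters
save `alters'
restore
// Married persons without a partner ID, paired with every candidate
keep if married == 1 & missing(partid)
cross using `alters'
keep if syear == syear_alter & sex != sex_alter
// Squared Mahalanobis distance for two variables, written out term by term
gen double d_age = part_age - age_alter
gen double d_edu = part_edu_highest - edu_highest_alter
gen double mdist2 = Sinv[1,1]*d_age^2 + 2*Sinv[1,2]*d_age*d_edu + Sinv[2,2]*d_edu^2
// Nearest neighbor per person; ties go to the lower pid_alter
bysort pid (mdist2 pid_alter): keep if _n == 1
list pid pid_alter mdist2
```

Because candidates are not removed once chosen, the same person can end up as the nearest neighbor of several individuals, which is fine here since the matching does not have to be unique.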