Peers of peers dataset

Nicola Pensiero

#1


28 Oct 2023, 10:43

Dear all,
Using a dataset of individuals nested in schools, I am trying to create a dataset of the peers of peers who meet the following condition: the primary school (ks2) peers of an individual's secondary school (ks4) peers, where those secondary school peers attended a different primary school from that of the individual of interest.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(id ks2perf) str1(ks4_school_id ks2_school_id)
1 14 "a" "b"
2 11 "a" "b"
3  9 "a" "c"
4 17 "a" "c"
5 22 "a" "c"
6  1 "d" "c"
7 18 "d" "b"
end

In this toy example, case 6 is a relevant peer of peers 3, 4, and 5, who in turn are the relevant peers of individuals 1 and 2.
How do I create an extra column that indicates whether the peers-of-peers condition is met? I think this requires creating additional datasets with different variable names and then merging them, but my attempts do not seem to work.
Any help is much appreciated, Nic

Last edited by Nicola Pensiero; 28 Oct 2023, 10:45.
Clyde Schechter

#2

28 Oct 2023, 12:35

No, this cannot be done with -merge-'s. You need to pair up many with many, and -merge- cannot properly do that. (There is a -merge m:m-, but it does not do what is needed here; in fact it does not do anything you will ever want to do. Forget I mentioned it.) You need -joinby-. You also have to do some dancing with variable names to make this work. I believe the following gives what you want:

Code:

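* Copy of the data with id renamed, to serve as the using file
preserve
rename id id2
keep id2 *_school_id
tempfile schools
save `schools'
restore

* Pair each id with everyone sharing its ks4 school, then drop self-pairs
joinby ks4_school_id using `schools'
drop if id == id2

* Relabel the copy for the second pairing
preserve
use `schools', clear
rename id2 id3
rename ks4_school_id ks4_school_orig
save `schools', replace
restore

* Pair on ks2_school_id; after the first -joinby- this still holds id's own value
joinby ks2_school_id using `schools'
drop if inlist(id3, id, id2) | ks4_school_id == ks4_school_orig

keep id id3
rename id3 peer_of_peer_id
duplicates drop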

If you are interested, you can stop this code before the -keep id id3- command and you can see the "path" through the peer "network" that establishes each of these peer-of-peer relationships. Many of these peers-of-peers achieve that status through multiple paths, hence the need for the -duplicates drop-. If you wish to bring back the original information about which schools each id attended and their value of ks2perf, you can now do that by -merge m:1 id- using the original data (an id can appear in several rows of the result, once per peer of peer).
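For readers who want to check the pairing on the toy data, here is a minimal sketch under one reading of the condition in #1. It is not the code posted above. It assumes that -joinby- keeps the master's values for variables that appear in both files but are not match keys, so the peer's primary school is renamed before the first join. The names peer_id, peer_ks2, pop_id, and pop_ks4 are hypothetical, and dropping candidates who share the individual's ks4 school is an assumption carried over from the code in this thread.

```stata
* Toy data from post #1
clear
input float(id ks2perf) str1(ks4_school_id ks2_school_id)
1 14 "a" "b"
2 11 "a" "b"
3  9 "a" "c"
4 17 "a" "c"
5 22 "a" "c"
6  1 "d" "c"
7 18 "d" "b"
end

tempfile base peers pop
save `base'

* Lookup of ks4 peers: peer id and the peer's own primary school (hypothetical names)
rename (id ks2_school_id) (peer_id peer_ks2)
keep peer_id peer_ks2 ks4_school_id
save `peers'

* Lookup of primary-school mates: candidate peer of peer and their ks4 school
use `base', clear
rename (id ks2_school_id ks4_school_id) (pop_id peer_ks2 pop_ks4)
keep pop_id peer_ks2 pop_ks4
save `pop'

* Step 1: pair each individual with ks4 peers from a different primary school
use `base', clear
joinby ks4_school_id using `peers'
drop if peer_id == id | peer_ks2 == ks2_school_id

* Step 2: attach everyone who shared the PEER's primary school
joinby peer_ks2 using `pop'
* Assumption: a peer of peer must not share the individual's ks4 school
drop if inlist(pop_id, id, peer_id) | pop_ks4 == ks4_school_id

keep id pop_id
duplicates drop
sort id pop_id
list, noobs
```

On the toy data this should pair individuals 1 and 2 with case 6, as described in #1. Stopping before the final -keep- shows the intermediate peer for each path.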
Nicola Pensiero

#3

28 Oct 2023, 16:46

Thanks, indeed I was running in circles with -merge-. I did not know about -joinby-; super useful. Thanks for providing the code, Clyde. Best, Nic

Nicola Pensiero


30 Oct 2023, 15:59

With relatively large datasets containing thousands of individuals, the peers' datasets produced with -joinby- quickly become huge. I have used -collapse- to generate (hopefully) the same datasets.

Code:

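* Mean ks2 performance of ks4 peers, one row per (ks4 school, peers' ks2 school)
use "xxx\cohort.dta", clear
rename ks2perf ks2perf_peers
rename ks2_school_id ks2_school_id_peers
collapse (mean) ks2perf_peers, by(ks4_school_id ks2_school_id_peers)
save "xxx\peers.dta", replace

* Mean ks2 performance of peers of peers, one row per (ks4 school, ks2 school)
use "xxx\cohort.dta", clear
rename ks4_school_id ks4_school_id_peersofpeers
rename ks2perf ks2perf_peersofpeers
collapse (mean) ks2perf_peersofpeers, by(ks4_school_id_peersofpeers ks2_school_id)
save "xxx\peersofpeers.dta", replace

* Attach ks4 peer groups and drop those from the individual's own ks2 school
use "xxx\cohort.dta", clear
joinby ks4_school_id using "xxx\peers.dta"
drop if ks2_school_id_peers == ks2_school_id

* Note: the match key here is the individual's own ks2_school_id, not ks2_school_id_peers
joinby ks2_school_id using "xxx\peersofpeers.dta"

* Keep only groups that attended a different ks4 school from the individual
generate peersofpeers = 1 if ks4_school_id != ks4_school_id_peersofpeers
keep if peersofpeers == 1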
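One possible variant of the collapsed approach, sketched on the toy data from #1 with tempfiles in place of the saved files. It assumes the aim is the condition as worded in #1, so the second -joinby- matches on the peers' primary school (ks2_school_id_peers) instead of the individual's own. The suffix _pop and the tempfile names are hypothetical.

```stata
* Toy data from post #1
clear
input float(id ks2perf) str1(ks4_school_id ks2_school_id)
1 14 "a" "b"
2 11 "a" "b"
3  9 "a" "c"
4 17 "a" "c"
5 22 "a" "c"
6  1 "d" "c"
7 18 "d" "b"
end

tempfile cohort peers pop
save `cohort'

* Group means for ks4 peers, keyed by the peers' primary school
rename (ks2perf ks2_school_id) (ks2perf_peers ks2_school_id_peers)
collapse (mean) ks2perf_peers, by(ks4_school_id ks2_school_id_peers)
save `peers'

* Group means for candidate peers of peers, keyed by the same primary-school variable
use `cohort', clear
rename (ks2perf ks2_school_id ks4_school_id) ///
       (ks2perf_pop ks2_school_id_peers ks4_school_id_pop)
collapse (mean) ks2perf_pop, by(ks4_school_id_pop ks2_school_id_peers)
save `pop'

* Attach ks4 peer groups, dropping those from the individual's own primary school
use `cohort', clear
joinby ks4_school_id using `peers'
drop if ks2_school_id_peers == ks2_school_id

* Match on the peers' primary school, then keep groups from other ks4 schools
joinby ks2_school_id_peers using `pop'
keep if ks4_school_id_pop != ks4_school_id

sort id ks2_school_id_peers ks4_school_id_pop
list id ks2_school_id_peers ks4_school_id_pop ks2perf_pop, noobs
```

Because both lookup files are collapsed to school pairs before joining, the result grows with the number of school combinations per individual and not with the number of individual-to-individual pairs. On the toy data, individuals 1 and 2 should end up with the mean of case 6 alone.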
