nwcommands data compatibility - restructuring from wide format

Kasey Harding

Join Date: Oct 2020
Posts: 6

17 Nov 2020, 14:26

Hello.

I am looking for help with the nwcommands or other Stata network analysis options, particularly with structuring my data in a way that nwcommands will read. I have found the resources on the nwcommands very helpful for analysis, but I am working with a raw dataset that needs to be reformatted before it is compatible with these commands. I have a dataset of n=15 sociocentric networks formatted in a wide form – with ego and alter ties and attributes across one row (see sample data below). An ego can have between 0 and 20 alters and the edges between alters are undirected and weighted (1=strangers; 2=in-between; 3=especially close). My goal is to compute basic network structure characteristic variables for each network and summarize the network characteristics across the 15 networks, as well as map each network. I would appreciate any advice regarding how to reformat these data either to an edgelist or adjacency matrix – given that each observation in the current dataset represents a distinct network, I believe the solution involves 15 edgelists or matrices. To give a sense of the data, I have provided a sample of variables for 5 alters and created data for 2 observations/egos.

Example data:

ID_ego	node_a	node_b	node_c	node_d	node_e	sn_close_a	sn_close_b	sn_close_c	sn_close_d	sn_close_e	sn_relate_ab	sn_relate_ac	sn_relate_ad	sn_relate_ae	sn_relate_bc	sn_relate_bd	sn_relate_be	sn_relate_cd	sn_relate_ce	sn_relate_de
50	1	1	1	1	2	1	1	2	1	.	1	2	3	.	3	3	.	2	.	.
51	1	2	1	1	2	1	.	2	1	.	.	2	3	.	.	.	.	2	.	.

Variable description:

Variable	Obs	Unique	Min	Max	Label
ID_ego	15	15	50	65	unique id number for network ego
node_a	15	1	1	1	Keep=1 or remove =2 alter a
node_b	15	2	1	2	Keep=1 or remove =2 alter b
node_c	15	2	1	2	Keep=1 or remove =2 alter c
node_d	15	2	1	2	Keep=1 or remove =2 alter d
node_e	15	2	1	2	Keep=1 or remove =2 alter e
sn_close_a	15	2	1	2	Ego closeness with alter a (1=very close; 2=not as close)
sn_close_b	11	2	1	2	Ego closeness with alter b (1=very close; 2=not as close)
sn_close_c	10	2	1	2	Ego closeness with alter c (1=very close; 2=not as close)
sn_close_d	9	2	1	2	Ego closeness with alter d (1=very close; 2=not as close)
sn_close_e	7	2	1	2	Ego closeness with alter e (1=very close; 2=not as close)
sn_relate_ab	11	2	2	3	relationship of alter a & b (1=strangers; 2=in-between; 3=very close)
sn_relate_ac	10	3	1	3	relationship of alter a & c (1=strangers; 2=in-between; 3=very close)
sn_relate_ad	9	3	1	3	relationship of alter a & d (1=strangers; 2=in-between; 3=very close)
sn_relate_ae	7	3	1	3	relationship of alter a & e (1=strangers; 2=in-between; 3=very close)
sn_relate_bc	8	2	2	3	relationship of alter b & c (1=strangers; 2=in-between; 3=very close)
sn_relate_bd	7	3	1	3	relationship of alter b & d (1=strangers; 2=in-between; 3=very close)
sn_relate_be	5	3	1	3	relationship of alter b & e (1=strangers; 2=in-between; 3=very close)
sn_relate_cd	9	3	1	3	relationship of alter c & d (1=strangers; 2=in-between; 3=very close)
sn_relate_ce	7	3	1	3	relationship of alter c & e (1=strangers; 2=in-between; 3=very close)
sn_relate_de	7	2	2	3	relationship of alter d & e (1=strangers; 2=in-between; 3=very close)

Kasey Harding

Join Date: Oct 2020

Posts: 6
#2

17 Nov 2020, 14:51

clear
input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
end
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#3

17 Nov 2020, 16:26

This will involve one or more -reshape- commands to get to a long (edge) format, but it's not quite clear to me what you want in the end. I feel confident you would want a file of edges between each ego and each of their alters, but do you also want edges between each pair of alters for each ego? Providing an example result data set, possibly for a simplified version of what you have above, would facilitate a good answer.
Kasey Harding

Join Date: Oct 2020

Posts: 6
#4

18 Nov 2020, 16:42

Thank you so much for the feedback, Mike. The edgelist that I want to create would include an undirected edge between ego and each of her alters and an edge between each pair of alters. I would also like to include a weight for each edge (not close to close). As you suggested, I have provided an example of what that edgelist might look like for ID_ego 50. One thing I do not know how to do is to efficiently create this edgelist for each ego. From the nwcommands example datasets (e.g., hpbooks) it seems like I should reformat and store these edgelists somehow. I am also not sure, from a Stata perspective, how to attach an attributes list to each network. Again, based on the nwcommands example data, I know this can be done, but I could not find examples of how to format or store the data correctly.

input str3 name1 str3 name2 weight
ego a 2
ego b 2
ego c 1
ego d 2
a b 0
a c 1
a d 2
b c 2
b d 2
c d 1
end
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#5

18 Nov 2020, 20:29

I have something I think is close to what you want. Some comments:
1) Your example "result" data doesn't quite correspond to your input data (e.g., you show the a b weight as 0, but sn_relate_ab for ego = 50 is 1), so I only used your "result" example as a structural illustration.
2) The names for alters will have to be something other than a, b, c, d, e, since "a" for ego = 50 is in general a different person than "a" for ego = 51. Consequently, I renamed the alters by adding on the name of the ego to which they belong.
3) I'm an admirer of -nwcommands-, and one of relatively few people here on Statalist who have some interest in it, but I have hardly used it. So, while I think that an edge list with values or weights describing each tie will do what you want, I'm not sure. And I don't know what you mean by "attaching an attributes list to each network," so I'm not able to be helpful there.
4) Just as I was about to post this, I realized that perhaps you only want to handle one ego at a time, which would make this easier and somewhat different, but I don't have time to go there right now.

So, on those understandings and possible shortcomings, take a look at this, and see if it looks helpful.

Code:

clear
input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
end
//
drop node* // These variables don't seem to carry any information
// Divide and conquer: Make the ego/alter file first.
preserve
keep ID_ego sn_close* // just the ego-alter variables
// ego name needs to be string for compatibility
tostring ID_ego, gen(name1)
drop ID_ego
reshape long sn_close_, i(name1) j(name2) string
drop if missing(sn_close_) // no relationship
// Tag each alter with its ego so the names match the alter-alter file below.
replace name2 = name2 + "_" + name1
rename sn_close_ weight
tempfile egoalter
save `egoalter'
list
restore
//
// Now create the alter-alter file.
keep ID_ego sn_relate*
reshape long sn_relate_, i(ID_ego) j(pair) string
drop if missing(sn_relate_)
gen name1 = substr(pair, 1, 1) + "_" + strofreal(ID_ego)
gen name2 = substr(pair, 2, 1) + "_" + strofreal(ID_ego)
rename sn_relate_ weight
keep name1 name2 weight
order name1 name2 weight
// Put the files together.
append using `egoalter'
Mike Lacy

Join Date: Apr 2014
Posts: 2416

19 Nov 2020, 09:30

Having now re-thought/reread this with a clearer head, I reworked the preceding to make a separate -nwcommands- network for each ID_ego, which I now realize is what you wanted. There is probably a way to do this somewhat more efficiently, but the following approach was clearer and easier for me. There are two key simplifying features below: 1) Break the original file down into separate files for each ID_ego; and 2) For each ID_ego, handle the ego-alter and alter-alter variables separately.

Code:

clear
input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
end
//
nwclear // fresh -nwcommands- workspace
drop node* // These variables don't seem to carry any information
// Make a separate tempfile for each ego's observation.
levelsof ID_ego, local(egolist)
foreach id of local egolist {
   preserve
   keep if ID_ego == `id'
   tempfile file`id'
   save `file`id''
   restore
}
// Now we can deal with each ego one at a time.
// The strategy is "divide and conquer," i.e., handle ego-alter
// and alter-alter relationships separately, then put them
// together into a composite edge list
tempfile egoalter
foreach id of local egolist {
   use `file`id'', clear
   // ego-alter
    preserve
    keep ID_ego sn_close*  
    rename ID_ego name1
    tostring name1, replace
    reshape long sn_close_, i(name1) j(name2) string
    drop if missing(sn_close_) // no relationship
    rename sn_close_ weight
    save `egoalter', replace //
    restore
    // alter-alter
    keep ID_ego sn_relate*
    reshape long sn_relate_, i(ID_ego) j(pair) string
    drop if missing(sn_relate_)
    gen name1 = substr(pair, 1, 1)
    gen name2 = substr(pair, 2, 1)
    rename sn_relate_ weight // full name, so this does not rely on variable abbreviation
    keep name1 name2 weight
    order name1 name2 weight
    // Put the files together.
    append using `egoalter'
    di "Edge list for ID_ego = `id'"
    list //
    // Make an -nwcommands- network of this dataset
    nwfromedge name1 name2 weight, name(Network`id') undirected
}    
// List of networks created
nwds *

Kasey Harding

Join Date: Oct 2020

Posts: 6
#7

19 Nov 2020, 15:25

Mike, Thank you. This is great and exactly what I was looking for. The 1, 2, and 3 weights for edges in my example have different meanings in the ego-alter vs. alter-alter relationships, so I will recode those weights in the data before running the code you shared.
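
A minimal sketch of that recode, assuming the ego-alter closeness should be put on the same 1 to 3 scale as the alter-alter ties. The mapping itself (1 "very close" becomes 3, 2 "not as close" stays 2) is only an illustrative assumption; the right mapping depends on how the weights will be interpreted.

```stata
clear
input long ID_ego sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e
50 1 1 2 1 .
51 1 . 2 1 .
end
// Assumed mapping (not from the thread): align ego-alter closeness with
// the alter-alter scale, so 1 (very close) -> 3 and 2 (not as close) -> 2.
// Missing values are left untouched by -recode-.
recode sn_close_* (1 = 3) (2 = 2)
list
```

Running this on the wide data before the reshape means the loop that builds the edge lists can stay unchanged.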

I see how you dropped the node_* vars, which would have been my choice of variable for dropping the missing alters. However, you did that with the sn_close var, which works perfectly well and is probably more streamlined than what I was attempting.

I also liked the approach to weight variables. If I wanted to include additional characteristics of the alters, could I do that by adding them to the keep and order statements? (e.g., primary identity of alter: 1=family, 2=friend, 3=co-worker, etc.) I would like to color and size the nodes by attributes.

Thanks again for your brilliant help with this.
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#8

19 Nov 2020, 17:05

Kasey, yes, you should be able to keep some more variables with -keep- (-order- is just cosmetic here, by the way). If that isn't so workable, they could always be added with -merge- on some identifier(s) for ego and alter after the file is in long format. I'm not sure, though, whether the network graphing capabilities of -nwcommands- allow for colors etc., as I've never tried that.
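
A rough sketch of the -merge- route, under stated assumptions: the attribute variables sn_type_a, sn_type_b, sn_type_c (1=family, 2=friend, 3=co-worker) and the key variable alter are hypothetical names invented for illustration and are not in the posted data.

```stata
// Hypothetical wide attribute data: one sn_type_* variable per alter.
clear
input long ID_ego sn_type_a sn_type_b sn_type_c
50 1 2 3
51 1 . 2
end
reshape long sn_type_, i(ID_ego) j(alter) string
drop if missing(sn_type_) // alter not present
rename sn_type_ alter_type
tempfile attrs
save `attrs'

// A small ego-alter edge list keyed by the same identifiers.
clear
input long ID_ego str1 alter byte weight
50 a 1
50 b 1
50 c 2
51 a 1
51 c 2
end
// Each (ID_ego, alter) pair is unique on both sides here, hence 1:1.
merge 1:1 ID_ego alter using `attrs', keep(master match) nogenerate
list, sepby(ID_ego)
```

In an alter-alter edge list the same alter appears on several rows, so there the merge would be m:1 rather than 1:1.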

I dropped the node* variables first just to have the simplest file to work with. You could keep them through the reshape, but my thinking is generally not to keep any extra variables.
Kasey Harding

Join Date: Oct 2020

Posts: 6
#9

19 Nov 2020, 18:12

Mike, Thanks for the clarification about the keep & order and about simplifying the file. If I find anything out with visualizing node attributes, I'll be sure to post about it.
Kasey Harding

Join Date: Oct 2020

Posts: 6
#10

19 Nov 2020, 18:57

Yes, you can color code the nodes.
nwplot, color(Birthplace)
nwplot, color(Birthplace) colorpalette(red yellow cyan brown)

ref: https://nwcommands.wordpress.com/demo_nwplot/
Nicholas Barr

Join Date: Feb 2021

Posts: 1
#11

23 Feb 2021, 18:15

Hello, I am attempting to specify color palette options but the specified colors are not reflected in the plot using this code:
nwplot mypath, size(fightsum, forcekeys(6 12 18 24)) color(ffmqcat) colorpalette(red yellow cyan)

I would appreciate any insight you might have!