
  • nwcommands data compatibility - restructuring from wide format

    Hello.

    I am looking for help with the nwcommands or other Stata network analysis options, particularly with structuring my data in a way that nwcommands will read. I have found the resources on the nwcommands very helpful for analysis, but I am working with a raw dataset that needs to be reformatted before it is compatible with these commands. I have a dataset of n=15 sociocentric networks formatted in a wide form – with ego and alter ties and attributes across one row (see sample data below). An ego can have between 0 and 20 alters and the edges between alters are undirected and weighted (1=strangers; 2=in-between; 3=especially close). My goal is to compute basic network structure characteristic variables for each network and summarize the network characteristics across the 15 networks, as well as map each network. I would appreciate any advice regarding how to reformat these data either to an edgelist or adjacency matrix – given that each observation in the current dataset represents a distinct network, I believe the solution involves 15 edgelists or matrices. To give a sense of the data, I have provided a sample of variables for 5 alters and created data for 2 observations/egos.

    Example data:
    ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
    50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
    51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
    Variable description:
    Variable       Obs  Unique  Min  Max  Label
    ID_ego          15      15   50   65  unique id number for network ego
    node_a          15       1    1    1  Keep=1 or remove=2 alter a
    node_b          15       2    1    2  Keep=1 or remove=2 alter b
    node_c          15       2    1    2  Keep=1 or remove=2 alter c
    node_d          15       2    1    2  Keep=1 or remove=2 alter d
    node_e          15       2    1    2  Keep=1 or remove=2 alter e
    sn_close_a      15       2    1    2  Ego closeness with alter a (1=very close; 2=not as close)
    sn_close_b      11       2    1    2  Ego closeness with alter b (1=very close; 2=not as close)
    sn_close_c      10       2    1    2  Ego closeness with alter c (1=very close; 2=not as close)
    sn_close_d       9       2    1    2  Ego closeness with alter d (1=very close; 2=not as close)
    sn_close_e       7       2    1    2  Ego closeness with alter e (1=very close; 2=not as close)
    sn_relate_ab    11       2    2    3  relationship of alter a & b (1=strangers; 2=in-between; 3=very close)
    sn_relate_ac    10       3    1    3  relationship of alter a & c (1=strangers; 2=in-between; 3=very close)
    sn_relate_ad     9       3    1    3  relationship of alter a & d (1=strangers; 2=in-between; 3=very close)
    sn_relate_ae     7       3    1    3  relationship of alter a & e (1=strangers; 2=in-between; 3=very close)
    sn_relate_bc     8       2    2    3  relationship of alter b & c (1=strangers; 2=in-between; 3=very close)
    sn_relate_bd     7       3    1    3  relationship of alter b & d (1=strangers; 2=in-between; 3=very close)
    sn_relate_be     5       3    1    3  relationship of alter b & e (1=strangers; 2=in-between; 3=very close)
    sn_relate_cd     9       3    1    3  relationship of alter c & d (1=strangers; 2=in-between; 3=very close)
    sn_relate_ce     7       3    1    3  relationship of alter c & e (1=strangers; 2=in-between; 3=very close)
    sn_relate_de     7       2    2    3  relationship of alter d & e (1=strangers; 2=in-between; 3=very close)

  • #2
    clear
    input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
    50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
    51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
    end

    • #3
      This will involve one or more -reshape- commands to get to a long (edge) format, but it's not quite clear to me what you want in the end. I feel confident you would want a file of edges between each ego and each of their alters, but do you also want edges between each pair of alters for each ego? Providing an example result data set, possibly for a simplified version of what you have above, would facilitate a good answer.

      • #4
        Thank you so much for the feedback, Mike. The edgelist that I would want to create would include an undirected edge between ego and each of her alters and an edge between pairs of alters. I would also like to include a weight for each edge (not close to close). As you suggested, I have provided an example of what that edgelist might look like for ID_ego 50. One thing I do not know how to do is to efficiently create this edgelist for each ego. From the nwcommands example datasets (e.g., hpbooks) it seems like I should reformat and store these edgelists somehow. I am also not sure from a Stata perspective how to attach an attributes list to each network. Again, based on the nwcommands example data, I know this can be done, but I could not find examples of how to format or store the data correctly.

        input str3 name1 str1 name2 weight
        ego a 2
        ego b 2
        ego c 1
        ego d 2
        a b 0
        a c 1
        a d 2
        b c 2
        b d 2
        c d 1
        end

        • #5
          I have something I think is close to what you want. Some comments:
          1) Your example "result" data doesn't quite correspond to your input data (e.g., you show the a b weight as 0, but sn_relate for ego = 50 is 1) so I only used your "result" example as a structural illustration.
          2) The names for alters will have to be something other than a, b, c, d, e, since "a" for ego = 50 is in general a different person than "a" for ego = 51. Consequently, I renamed the alters by adding on the name of the ego to which they belong.
          3) I'm an admirer of -nwcommands-, and one of a relative few people here on StataList who has some interest in it, but I have hardly used it. So, while I think that an edge list with values or weights describing each tie will do what you want, I'm not sure. And, I don't know what you mean by "attaching an attributes list to each network," so I'm not able to be helpful there.
          4) Just as I was about to post this, I realized that perhaps you only want to handle one ego at a time, which would make this easier and somewhat different, but I don't have time to go there right now.


          So, on those understandings and possible shortcomings, take a look at this, and see if it looks helpful.
          Code:
          clear
          input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
          50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
          51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
          end
          //
          drop node* // These variables don't seem to carry any information
          // Divide and conquer: Make the ego/alter file first.
          preserve
          keep ID_ego sn_close*  // just the ego-alter variables
          // ego name needs to be string for compatibility
          tostring ID_ego, gen(name1)
          drop ID_ego
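          reshape long sn_close_, i(name1) j(name2) string
          // Tag each alter with its ego so names match the alter-alter file below
          replace name2 = name2 + "_" + name1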
          drop if missing(sn_close_) // no relationship
          rename sn_close_ weight
          tempfile egoalter
          save `egoalter'
          list
          restore
          //
          // Now create the alter-alter file.
          keep ID_ego sn_relate*
          reshape long sn_relate_, i(ID_ego) j(pair) string
          drop if missing(sn_relate_)
          gen name1 = substr(pair, 1, 1) + "_" + strofreal(ID_ego)
          gen name2 = substr(pair, 2, 1) + "_" + strofreal(ID_ego)
          rename sn_relate_ weight
          keep name1 name2 weight
          order name1 name2 weight
          // Put the files together.
          append using `egoalter'

          • #6
            Having now re-thought/reread this with a clearer head, I reworked the preceding to make a separate -nwcommands- network for each ID_ego, which I now realize is what you wanted. There is probably a way to do this somewhat more efficiently, but the following approach was clearer and easier for me. There are two key simplifying features below: 1) Break the original file down into separate files for each ID_ego; and 2) For each ID_ego, handle the ego-alter and alter-alter variables separately.

            Code:
            clear
            input long ID_ego node_a node_b node_c node_d node_e sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
            50 1 1 1 1 2 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
            51 1 2 1 1 2 1 . 2 1 . . 2 3 . . . . 2 . .
            end
            //
            nwclear // fresh -nwcommands- workspace
            drop node* // These variables don't seem to carry any information
            // Make a separate tempfile for each ego's observation.
            levelsof ID_ego, local(egolist)
            foreach id of local egolist {
               preserve
               keep if ID_ego == `id'
               tempfile file`id'
               save `file`id''
               restore
            }
            // Now we can deal with each ego one at a time.
            // The strategy is "divide and conquer," i.e., handle ego-alter
            // and alter-alter relationships separately, then put them
            // together into a composite edge list
            tempfile egoalter
            foreach id of local egolist {
                use `file`id'', clear
                // ego-alter
                preserve
                keep ID_ego sn_close*  
                rename ID_ego name1
                tostring name1, replace
                reshape long sn_close_, i(name1) j(name2) string
                drop if missing(sn_close_) // no relationship
                rename sn_close_ weight
                save `egoalter', replace //
                restore
                // alter-alter
                keep ID_ego sn_relate*
                reshape long sn_relate_, i(ID_ego) j(pair) string
                drop if missing(sn_relate_)
                gen name1 = substr(pair, 1, 1)
                gen name2 = substr(pair, 2, 1)
                rename sn_relate_ weight
                keep name1 name2 weight
                order name1 name2 weight
                // Put the files together.
                append using `egoalter'
                di "Edge list for ID_ego = `id'"
                list //
                // Make an -nwcommands- network of this dataset
                nwfromedge name1 name2 weight, name(Network`id') undirected
            }    
            // List of networks created
            nwds *

            • #7
              Mike, Thank you. This is great and exactly what I was looking for. The 1, 2, and 3 weights for edges in my example have different meanings in the ego-alter vs. alter-alter relationships, so I will recode those weights in the data before running the code you shared.
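
A minimal sketch of that recoding step, run on the wide data before the reshape. The mapping is an assumption inferred from the example edge list in #4 (ego-alter closeness reversed so that a higher number means closer; alter-alter ties shifted down so that strangers = 0); adjust it to whatever scale is actually wanted. Whether -nwfromedge- treats a weight of 0 as an absent tie is worth checking before relying on it.

```stata
clear
input long ID_ego sn_close_a sn_close_b sn_close_c sn_close_d sn_close_e sn_relate_ab sn_relate_ac sn_relate_ad sn_relate_ae sn_relate_bc sn_relate_bd sn_relate_be sn_relate_cd sn_relate_ce sn_relate_de
50 1 1 2 1 . 1 2 3 . 3 3 . 2 . .
51 1 . 2 1 . . 2 3 . . . . 2 . .
end
// Assumed mapping: ego-alter 1 (very close) -> 2, 2 (not as close) -> 1
recode sn_close_* (1 = 2) (2 = 1)
// Assumed mapping: alter-alter 1 (strangers) -> 0, 2 -> 1, 3 -> 2
recode sn_relate_* (1 = 0) (2 = 1) (3 = 2)
// Missing values are left untouched by both recodes
list
```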

              I see that you dropped the node_* vars, which would have been my variable of choice for dropping the missing alters. Instead, you did that with the sn_close vars, which works perfectly well and is probably more streamlined than what I was attempting.

              I also liked the approach to weight variables. If I wanted to include additional characteristics of the alters, could I do that by adding them to the keep and order statements? (e.g., primary identity of alter: 1=family, 2=friend, 3=co-worker, etc.) I would like to color and size the nodes by attributes.

              Thanks again for your brilliant help with this.

              • #8
                Kasey, yes, you should be able to keep some more variables with -keep- (-order- is just cosmetic here, by the way). If that isn't workable, they could always be -merge-d on by some identifier(s) for ego and alter after the file is in long format. I'm not sure, though, whether the network graphing capabilities of -nwcommands- allow for colors etc., as I've never tried that.

                I dropped the node* variables first just to have the simplest file to work with. You could keep them through the reshape, but my thinking is generally not to keep any extra variables.
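
A rough sketch of the -merge- route, using only the ego-alter part of the data. The attribute variables sn_type_a, sn_type_b, sn_type_c (1=family, 2=friend, 3=co-worker) and their values are hypothetical, made up for illustration; they are not in the posted dataset. Each set of variables is reshaped to one row per ego-alter pair, and the attributes are then merged on by ID_ego and the alter letter.

```stata
// Hypothetical alter attributes in wide form
clear
input long ID_ego sn_type_a sn_type_b sn_type_c
50 1 2 3
51 2 . 1
end
reshape long sn_type_, i(ID_ego) j(alter) string
rename sn_type_ type
tempfile attrs
save `attrs'

// Ego-alter ties, reshaped the same way
clear
input long ID_ego sn_close_a sn_close_b sn_close_c
50 1 1 2
51 1 . 2
end
reshape long sn_close_, i(ID_ego) j(alter) string
drop if missing(sn_close_) // no relationship
rename sn_close_ weight

// Attach the attributes; keep only rows present in the tie file
merge 1:1 ID_ego alter using `attrs', keep(master match) nogenerate
list
```

How the resulting attribute column is then handed to -nwcommands- for plotting is not covered here.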

                • #9
                  Mike, Thanks for the clarification about the keep & order and about simplifying the file. If I find anything out with visualizing node attributes, I'll be sure to post about it.

                  • #10
                    Yes, you can color code the nodes.
                    nwplot, color(Birthplace)
                    nwplot, color(Birthplace) colorpalette(red yellow cyan brown)

                    ref: https://nwcommands.wordpress.com/demo_nwplot/

                    • #11
                      Hello, I am attempting to specify color palette options but the specified colors are not reflected in the plot using this code:
                      nwplot mypath, size(fightsum, forcekeys(6 12 18 24)) color(ffmqcat) colorpalette(red yellow cyan)

                      I would appreciate any insight you might have!
