  • Problems about failing to use 'nwcommands' to create an edgelist

    I have a raw .csv file with a total of 168,812 edges (in 2 columns).
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6(v1 v2)
    "001007" "350007"
    "001007" "070020"
    "001007" "261002"
    "001007" "000695"
    "001007" "519162"
    "001007" "519752"
    "001007" "620002"
    "001007" "001897"
    "001007" "166011"
    "001007" "002054"
    end
    When I typed
    Code:
    nwset v1 v2, edgelist undirected name(2017) keeporiginal
    it simply didn't create a node matrix. Could there be too many observations for the program to handle?
    How can I successfully create a matrix with this command?
    Sincere thanks to anyone who looks at this.

  • #2
    To my knowledge, very few users of the user-written package -nwcommands- participate in StataList. There's an email list for users on the page for the command (see https://nwcommands.wordpress.com/), and you'd likely get better help there. I'm an admirer of the package and have played with it, but I'm not an experienced user, so please take my advice in that context.

    First, I'd ask what makes you think -nwset- didn't create a node matrix? The way I'd check that is to 1) Use the bare -nwset- command to report your current networks; and 2) Check out whether a matrix has been created in Mata, where -nwcommands- puts its matrices: -mata mata desc-
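
    The two checks just described could be run as the short sketch below. It assumes -nwcommands- is installed and that -nwset- has already been attempted in the current session; nothing here is specific to the 2017 data.

    ```stata
    // 1) With no arguments, -nwset- lists the networks currently defined.
    nwset

    // 2) List the matrices (and other objects) currently held in Mata.
    mata: mata describe
    ```

    If the network appears in the first listing but no large matrix shows up in the second, that difference is worth reporting when asking for help.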

    Second, I have some example code for you to try. It creates data and then tries to run -nwset-. For me, using -nwset- on an edgelist file of 10,000 distinct pairs ran fine in a few seconds. My casual experimentation makes me think that there could be problems with an edgelist file that has a lot of nodes, as that locked up my machine when I ran -nwset-. In experimenting with what's below, you might start with 1000 nodes, then try 2000 if that works, then 5000, etc. It would also be good to set the local maxalter to something that is realistic for your data. The example below presumes 1000 individuals, with each individual having a maximum of 10 network ties (i.e., 10 alters), but you might have a very different kind of network, e.g., a small number of nodes with lots of ties. Data with a large number of nodes should be the worst case, since -nwset- creates a square matrix in Mata of size (number of nodes) x (number of nodes).

    Code:
    // Make example edgelist data.
    nwclear // for -nwcommands-
    set seed 75465
    local nnode = 1000   // change as desired
    local maxalter = 10  // change as desired
    clear
    set obs `nnode'
    gen str v1 = strofreal(_n)
    gen int nalter = runiformint(1, `maxalter')
    expand nalter
    gen str v2 = strofreal(runiformint(1,`nnode'))
    // Undirected network, so drop re-ordered pairs and self pairs.
    gen byte swap = v1 > v2
    gen temp = v1
    replace v1 = v2 if swap
    replace v2 = temp if swap
    duplicates drop v1 v2, force
    drop if v1 == v2
    drop temp swap
    // Time -nwset- on the generated edgelist.
    timer clear 1
    timer on 1
    nwset v1 v2, edgelist undirected name(2017) keeporiginal
    timer off 1
    quiet timer list 1
    // What was created and how long did it take?
    di _new "Elapsed time = " r(t1) " sec"
    nwset  // Does the network named _2017 exist?
    nwsummarize _2017
    Last edited by Mike Lacy; 04 Mar 2023, 09:52.
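
    To make the square-matrix point concrete, here is a rough size estimate. It is only a sketch: it assumes the network is stored as one dense Mata real matrix at 8 bytes per element, and -nwcommands- may well hold additional copies or temporary matrices, so treat the figures as a lower bound.

    ```stata
    // Approximate memory for one dense N x N real matrix (8 bytes per element).
    foreach n of numlist 1000 2000 4000 10000 {
        local mb = `n'^2 * 8 / 1024^2
        display "nodes = `n'   approx. MB = " %9.1f `mb'
    }
    ```

    Under that assumption, 4000 nodes come to about 122 MB for a single copy, and the requirement grows with the square of the node count, not with the number of edges.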

    • #3
      Thank you for your comment, Sir. The method you offered is good because it tests the maximum number of variables, which I had not thought about before. I'll give it a try. As for my data, I have network files on fund corporations from 2009 to 2021, and they are large. I successfully created matrices for 2009 to 2016 with the command given, but failed to create one for 2017 (which involves more than 10k pairs and about 4k nodes). That's why I assumed that Stata, or this command package, cannot handle more variables.
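
      Since the matrix size depends on the number of distinct nodes, not the number of pairs, it may help to count the nodes in each year's edgelist before running -nwset-. The sketch below assumes the edgelist is in memory with the string variables v1 and v2 from the first post; the names copy and node are made up for illustration.

      ```stata
      // Count distinct node IDs appearing in either column of the edgelist.
      preserve
      keep v1 v2
      expand 2, generate(copy)            // copy = 0 for originals, 1 for duplicates
      generate node = cond(copy, v2, v1)  // one endpoint per row
      quietly bysort node: keep if _n == 1
      display "distinct nodes = " _N
      restore
      ```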

      • #4
        I just ran the code above with 4000 nodes and maxalter = 10, which gave about 22000 edges, and it ran fine on my machine in 9 sec. If you want more help, run that example on your machine and report back here what happened. If it does work, run the -nwsummarize- command, and see if the results make sense.

        Again, if you want further help, I'd also encourage you to give answers to my two questions above, 1) and 2).

        • #5
          The 'nnode=1000' run was fine, but with 1500, 2000, and so on, the matrix didn't work.
          I will try emailing the developer about this problem, or try other social network analysis software, since time is limited.
          Thank you Sir!
          Best Regards

          • #6

            I have no problem even at nnode = 4000, so yes, submitting your problem to [email protected] would be a good idea. In your message, you'll want to describe/show in detail what "the matrix didn't work" means.
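
            One way to show in detail what happened is to capture the whole attempt in a log file and attach it to the message. This is only a sketch: the log file name is hypothetical, and the -nwset- line is the command from the first post, run on the 2017 edgelist.

            ```stata
            capture log close
            log using nwset_problem, text replace   // hypothetical file name
            about                                   // Stata version and flavor
            which nwset                             // which -nwset- file is being run
            nwset v1 v2, edgelist undirected name(2017) keeporiginal
            nwset                                   // networks defined afterwards
            mata: mata describe                     // matrices held in Mata afterwards
            log close
            ```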


