
  • Count pairs of categorical variables regardless of order

    Hi!
    I'm trying to count occurrences of pairs of values of two categorical variables, element1 and element2, regardless of order.
    element1 element2
    A B
    B A
    C A
    A C
    A C
    A B
    B C
    C B
    C B
    I'd like to count the number of occurrences of (A;B) and (B;A), that is 2*(A;B) + 1*(B;A) = 3.
    Similarly, for the pair (B;C), I would like the sum of (B;C) pairs and (C;B) pairs, that is 1+2 = 3.

    The groups command from SSC counts (A;B) and (B;A) separately.

    Thanks a lot in advance!

    Thierry Geiger

  • #2
    The solution is to reorder the values within each pair so that the smaller value comes first and the other second.

    Code:
    clear
    input str1 element1    str1 element2
    A    B
    B    A
    C    A
    A    C
    A    C
    A    B
    B    C
    C    B
    C    B
    end
    
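    list
    
    * put the alphabetically smaller value first so (A;B) and (B;A) match
    generate first = cond(element1 < element2, element1, element2)
    generate second = cond(element1 < element2, element2, element1)
    * count the observations within each unordered pair
    bysort first second: gen N = _N
    * flag one observation per pair and list it
    by first second: gen one = _n == 1
    list if one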
    See http://www.stata-journal.com/sjpdf.h...iclenum=dm0043
    SJ-8-4 dm0043. Tip 71: The problem of split identity, or how to group dyads. N. J. Cox. Q4/08. SJ 8(4):588--591. (no commands) Tip on how to handle dyadic identifiers.
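
    As a possible shortcut, once the values are ordered within each pair, Stata's built-in contract command can reduce the data to one row per unordered pair together with its frequency. This is only a minimal sketch that rebuilds the example data so it runs on its own; the variable name n_pairs is a hypothetical choice, not part of the original code.

    ```stata
    * Sketch only: rebuild the example data from the question
    clear
    input str1 element1 str1 element2
    A B
    B A
    C A
    A C
    A C
    A B
    B C
    C B
    C B
    end

    * order each pair alphabetically, as in the code above
    generate first  = cond(element1 < element2, element1, element2)
    generate second = cond(element1 < element2, element2, element1)

    * contract replaces the data in memory with one row per
    * (first, second) combination; n_pairs (hypothetical name)
    * holds the number of original observations in each combination
    contract first second, freq(n_pairs)
    list
    ```

    Note that contract discards the original observations, so preserve and restore (or a saved copy of the data) would be needed if the full dataset is still wanted afterwards. The bysort approach above keeps every observation and simply adds the count.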
    Last edited by Robert Picard; 08 Sep 2014, 08:54.

    • #3
      Brilliant! Thanks a lot, Rob. Beautifully compelling.

      TG

      • #4
        Well, Nick deserves most of the credit; his Stata Journal article on grouping dyads is a good read.

        • #5
          Thanks to Robert for the plug. In turn, that paper probably wouldn't exist without various Statalist questions in the years previous.
