  • Intersection of two matrices

    This feels like a super basic question, but how do I find the intersection of two matrices in Mata? Say I have two matrices A and B:

    Code:
    A = (1, 2, 3)
    B = (2, 3, 4)
    What, then, is the best way to find the elements that occur in both A and B?

    Code:
    (2, 3)
    Is the only way to do this to loop over all values of one matrix and see if selectindex() produces anything?

    PS: Some additional context. I want to estimate a Heckman selection model with two-way clustered standard errors. I currently do this by running the Heckman model three times and recalculating the variance matrix (V_twoway = V1 + V2 - V1&2). Unfortunately, I have a couple dozen million observations and not the best external instrument, so running the Heckman model once can take days (187 iterations). Running it three times then isn't very appealing. Instead, I want to calculate the two-way clustered variance matrix directly from the Xs and residuals. At some point in this process I need to determine whether an observation is part of both cluster 1 and cluster 2. I can determine the separate elements using selectindex(), but I'm struggling with how to then determine that they are present in both.

    PS2: I suppose the question could be rephrased as, "how to do a select with two if-conditions".
    Last edited by Jesse Wursten; 13 Dec 2018, 09:55.

  • #2
    I have very little faith that I understand what you are trying to do, or that what I propose below will scale to the size of your problem. But here's something that seems to address your small example.
    Code:
    : A = (1, 2, 3)
    
    : B = (2, 3, 4)
    
    : a = J(1,max((A,B)),0)
    
    : b = a
    
    : a[A] = J(1,cols(A),1)
    
    : a
           1   2   3   4
        +-----------------+
      1 |  1   1   1   0  |
        +-----------------+
    
    : b[B] = J(1,cols(B),1)
    
    : b
           1   2   3   4
        +-----------------+
      1 |  0   1   1   1  |
        +-----------------+
    
    : dupl = select(range(1,max((A,B)),1)',(a :& b))
    
    : dupl
           1   2
        +---------+
      1 |  2   3  |
        +---------+


    • #3
      You say intersection of 2 matrices, but your examples are 2 vectors.
      Assuming you want the intersection of 2 vectors, possibly with duplicates, I suggest the following one-liner (love Sunday afternoon puzzles):


      Code:
      : A = 1..10
      
      : B = 4..6, 5..12
      
      : select((b = uniqrows(B')), rowsum(J(rows(b),1,uniqrows(A')') :== b))'
              1    2    3    4    5    6    7
          +------------------------------------+
        1 |   4    5    6    7    8    9   10  |
          +------------------------------------+
      Is this what you wanted?
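
      One caveat: the one-liner compares every distinct value of B with every distinct value of A, so the intermediate matrix has rows(uniqrows(B')) rows and rows(uniqrows(A')) columns, which may get large. A sort-based variant avoids that matrix. This is only a sketch, not benchmarked, and it assumes the stacked vector u has at least two elements:
      Code:
      A = 1..10
      B = 4..6, 5..12

      // stack the distinct values of each vector and sort them;
      // a value that sits next to an identical one occurs in both A and B
      u = sort(uniqrows(A') \ uniqrows(B'), 1)
      n = rows(u)

      // keep each value that equals its successor
      select(u[|1 \ n-1|], u[|1 \ n-1|] :== u[|2 \ n|])'
      For the example above this should again give 4 through 10. Because each vector is reduced to its distinct values first, a value can appear at most twice in u, so every adjacent match is reported exactly once.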
      Kind regards

      nhb
