  • Merge Matrices in Mata according to id column?

    Hi! How can I merge two matrices in Mata according to an id column? The question arises because I want to implement a reshape wide procedure in Mata for an unbalanced panel. A simpler version of the question: how can I generate an indicator column that shows whether each element of one vector is in another set? For example, A=(1\2\3\4), B=(1\2). The result I want is a column vector indicating whether each element of A is in B: (1\1\0\0). Thank you!


    Best regards,
    Lin

  • #2
    For example, A=(1\2\3\4), B=(1\2). The result I want is a column vector indicating whether each element of A is in B: (1\1\0\0).
    Here is one way:

    Code:
    mata
    A=(1\2\3\4)
    B=(1\2)
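    // compare the first rows(B) elements of A with B position by position,
    // then pad with one zero for each remaining row of A
    wanted=(A[1..rows(B)]:==B)\J(rows(A)-rows(B),1,0)
    wanted
    end
    Result:

    Code:
    : wanted=(A[1..rows(B)]:==B)\J(rows(A)-rows(B),1,0)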
    
    : 
    : wanted
           1
        +-----+
      1 |  1  |
      2 |  1  |
      3 |  0  |
      4 |  0  |
        +-----+



    • #3
      Andrew's solution is probably fast. However, I believe it only applies to the very specific situation in which all values appear in the exact same position in the two vectors.
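A small check makes the limitation concrete. This is only a sketch: it applies a positional comparison of the same kind to B = (1\3), and the name positional is made up for illustration.

```mata
mata:
A = (1\2\3\4)
B = (1\3)

// compare A and B row by row, then pad with zeros for the rest of A
positional = (A[1::rows(B)] :== B) \ J(rows(A) - rows(B), 1, 0)

// the third element of A (the value 3) is in B but is still flagged 0,
// because 3 sits in row 2 of B and only row 2 of A is compared with it
positional
end
```

Only the first element is flagged here, so the value 3 is missed.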

      I have implemented a more general solution in my elabel package (SSC). Here is a slightly revised example:

      Code:
      *ssc install elabel
      
      mata :
      
      A = (1\2\3\4)
      B = (1\3)
      
      _aandb(A', B')'
      
      end

      yields

      Code:
      : A = (1\2\3\4)
      
      : B = (1\3)
      
      : 
      : _aandb(A', B')'
             1
          +-----+
        1 |  1  |
        2 |  0  |
        3 |  1  |
        4 |  0  |
          +-----+

      There are probably faster ways of doing this than the one I have implemented. Also, getting from this indicator to a combined matrix may still take considerable work.
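As a rough sketch of that last step, matching on the id column could be done with a plain loop. The function name merge_by_id is hypothetical and not part of elabel or Mata. The sketch assumes the id is in the first column of both matrices, ids are unique in Y, and Y has at least two columns. Rows of X without a match in Y get missing values.

```mata
mata:
// Hypothetical helper: attach the non-id columns of Y to the rows of X
// that share the same id (column 1 in both matrices).
real matrix merge_by_id(real matrix X, real matrix Y)
{
    real matrix    M
    real colvector idx
    real scalar    i

    // start with X and fill the columns reserved for Y with missing
    M = X, J(rows(X), cols(Y) - 1, .)

    for (i = 1; i <= rows(X); i++) {
        // row(s) of Y whose id equals the id in row i of X
        idx = selectindex(Y[., 1] :== X[i, 1])
        if (length(idx) > 0) {
            // assumption: ids are unique in Y, so use the first match
            M[i, (cols(X) + 1)..cols(M)] = Y[idx[1], 2..cols(Y)]
        }
    }
    return(M)
}

X = (1, 10 \ 2, 20 \ 3, 30 \ 4, 40)
Y = (1, 100 \ 3, 300)
merged = merge_by_id(X, Y)
merged
end
```

The loop is quadratic in the worst case, so for large panels sorting both matrices by id first would likely be preferable.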



      • #4
        Here is a generalization of Andrew's solution to Daniel's modification of the example:
        Code:
        : A = (1\2\3\4)
        
        : B = (1\3)
        
        : C = J(rows(A), 1, 0)
        
        : C[B] = J(rows(B), 1, 1)
        
        : C
               1
            +-----+
          1 |  1  |
          2 |  0  |
          3 |  1  |
          4 |  0  |
            +-----+
        Edit: The above obviously only works if A is a vector of consecutive integers starting from 1. If you want it to start at a different number and it is guaranteed that B is a subset of A, you could do:
        Code:
        : A = (2\3\4\5)
        
        : B = (2\4)
        
        : C = J(rows(A), 1, 0)
        
        : C[B :- (A[1]-1)] = J(rows(B), 1, 1)
        
        : C
               1
            +-----+
          1 |  1  |
          2 |  0  |
          3 |  1  |
          4 |  0  |
            +-----+
        If A is not guaranteed to be a vector of running indices, then things become trickier. But given the purpose mentioned by Xu in his opening post, the above might be sufficient.
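For that trickier case, one possible sketch is an element-by-element membership test with Mata's built-in anyof(). The function name in_set is hypothetical, and the loop is not tuned for speed.

```mata
mata:
// Hypothetical helper: returns a 0/1 column vector with one entry per
// element of A, equal to 1 if that element appears anywhere in B.
real colvector in_set(real colvector A, real colvector B)
{
    real colvector ind
    real scalar    i

    ind = J(rows(A), 1, 0)
    for (i = 1; i <= rows(A); i++) {
        // anyof(B, s) is 1 if any element of B equals the scalar s
        ind[i] = anyof(B, A[i])
    }
    return(ind)
}

// ids that are neither consecutive nor starting at 1
A = (7\2\15\4)
B = (15\7)
in_set(A, B)
end
```

Because the values are compared rather than used as subscripts, the ids do not need to be sorted, consecutive, or integer.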
        Last edited by Sebastian Kripfganz; 17 Oct 2020, 06:25.
        https://twitter.com/Kripfganz
