Merge Matrices in Mata according to id column?

Xu Lin

Join Date: Oct 2020

Posts: 1
#1

16 Oct 2020, 09:40

Hi! How can I merge two matrices in Mata according to an id column? The question arises because I want to do a reshape wide procedure in Mata for an unbalanced panel. A simpler version of the question: how can I generate an indicator column showing whether each element of one set is in another set? For example, with A=(1\2\3\4) and B=(1\2), the result I want is a column vector indicating whether each element of A is in B: (1\1\0\0). Thank you!

Best regards,
Lin

Andrew Musau

Join Date: Oct 2014
Posts: 10197
#2

16 Oct 2020, 11:55

Like A=(1\2\3\4), B=(1\2). The result I want is a column vector whether A is in B= (1\1\0\0).

Here is one way:

Code:

mata
A=(1\2\3\4)
B=(1\2)
wanted=A[1..rows(B)]:==B\((rows(A)-rows(B)+1::rows(A))*0)
wanted
end

Res.:

Code:

: wanted=A[1..rows(B)]:==B\((rows(A)-rows(B)+1::rows(A))*0)

: 
: wanted
       1
    +-----+
  1 |  1  |
  2 |  1  |
  3 |  0  |
  4 |  0  |
    +-----+

daniel klein

Join Date: Mar 2014

Posts: 3851
#3

17 Oct 2020, 01:01

Andrew's solution is probably fast. However, I believe it only applies to the very specific situation in which all values appear in the exact same position in the two vectors.
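To illustrate the position dependence, here is a minimal sketch that reruns the expression from #2 with B changed to (1\3). The changed B is my own illustrative input, not something posted earlier in the thread.

```mata
mata:
A = (1\2\3\4)
B = (1\3)        // illustrative input: 3 sits in row 3 of A but row 2 of B
// same expression as in #2: compares A[1], A[2] with B[1], B[2] row by row
wanted = A[1..rows(B)] :== B \ ((rows(A)-rows(B)+1::rows(A))*0)
wanted           // gives (1\0\0\0): the match on 3 is missed
end
```

The comparison is done row by row on the first rows(B) elements only, so a value of B that appears further down in A is never found.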

I have implemented a more general solution in my elabel package (SSC). Here is a slightly revised example:

Code:

*ssc install elabel
mata :
A = (1\2\3\4)
B = (1\3)
_aandb(A', B')'
end

yields

Code:

: A = (1\2\3\4)

: B = (1\3)

: 
: _aandb(A', B')'
       1
    +-----+
  1 |  1  |
  2 |  0  |
  3 |  1  |
  4 |  0  |
    +-----+

There are probably faster ways of doing this than the one I have implemented. Also, going from this result to a combined matrix might still be a long way.
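As a rough idea of what the step to a combined matrix could look like, here is a minimal sketch of a left join on an id column. The function name mergebyid() and the example matrices are hypothetical and not part of elabel or of Mata itself. The sketch assumes the id is in column 1 of both matrices, ids are unique in Y, Y has at least two columns, and selectindex() is available (Stata 13 or later). It loops over rows, so it is not meant to be fast.

```mata
mata:
// Hypothetical helper (not from elabel or official Mata):
// left-join Y onto X by the id in column 1.
// Assumes ids are unique in Y and Y has at least 2 columns.
real matrix mergebyid(real matrix X, real matrix Y)
{
    real matrix    out
    real colvector j
    real scalar    i

    // start with X plus missing-filled columns for Y's non-id columns
    out = X, J(rows(X), cols(Y) - 1, .)
    for (i = 1; i <= rows(X); i++) {
        j = selectindex(Y[., 1] :== X[i, 1])   // rows of Y with this id
        if (rows(j)) {
            out[i, (cols(X) + 1)..cols(out)] = Y[j[1], 2..cols(Y)]
        }
    }
    return(out)
}

X = (1, 10 \ 2, 20 \ 3, 30 \ 4, 40)   // illustrative data
Y = (1, 100 \ 3, 300)
mergebyid(X, Y)
end
```

With these inputs, rows 1 and 3 pick up 100 and 300 from Y, and rows 2 and 4 get missing values in the added column, which is the kind of result reshape wide would give for an unbalanced panel.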
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#4

17 Oct 2020, 06:15

Here is a generalization of Andrew's solution to Daniel's modification of the example:

Code:

: A = (1\2\3\4)

: B = (1\3)

: C = J(rows(A), 1, 0)

: C[B] = J(rows(B), 1, 1)

: C
       1
    +-----+
  1 |  1  |
  2 |  0  |
  3 |  1  |
  4 |  0  |
    +-----+

Edit: The above obviously only works if A is a vector of consecutive integers starting from 1. If you want it to start at a different number and it is guaranteed that B is a subset of A, you could do:

Code:

: A = (2\3\4\5)

: B = (2\4)

: C = J(rows(A), 1, 0)

: C[B :- (A[1]-1)] = J(rows(B), 1, 1)

: C
       1
    +-----+
  1 |  1  |
  2 |  0  |
  3 |  1  |
  4 |  0  |
    +-----+

If A is not guaranteed to be a vector of running indices, then things would become more tricky. But given the purpose mentioned by Xu in his opening post, the above might be sufficient.
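For the case where A is not a vector of running indices, one possible approach is a plain loop over A using Mata's anyof() function. This is only a sketch: the function name isin() and the example values are hypothetical, and the loop trades speed for generality.

```mata
mata:
// Hypothetical helper: ind[i] = 1 if A[i] appears anywhere in B, else 0.
// Makes no assumption about the order or spacing of the values in A or B.
real colvector isin(real colvector A, real colvector B)
{
    real colvector ind
    real scalar    i

    ind = J(rows(A), 1, 0)
    for (i = 1; i <= rows(A); i++) {
        ind[i] = anyof(B, A[i])    // 1 if any element of B equals A[i]
    }
    return(ind)
}

isin((10\25\3\47), (3\10))    // illustrative ids; gives (1\0\1\0)
end
```

Because each element of A is checked against all of B, the cost grows with rows(A) times rows(B), so the index-based assignments above remain preferable whenever their assumptions hold.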

Last edited by Sebastian Kripfganz; 17 Oct 2020, 06:25.

https://www.kripfganz.de/stata/