
  • How to read non-collinear factor variables into mata

    I want to read factor variables into mata and omit the base level from the resulting matrix. The code below:

    Code:
    . clear
    . set obs 10
    number of observations (_N) was 0, now 10
    . gen x = mod(_n, 3)
    . mata st_data(., "i.x")
           1   2   3
        +-------------+
      1 |  0   1   0  |
      2 |  0   0   1  |
      3 |  0   0   0  |
      4 |  0   1   0  |
      5 |  0   0   1  |
      6 |  0   0   0  |
      7 |  0   1   0  |
      8 |  0   0   1  |
      9 |  0   0   0  |
     10 |  0   1   0  |
        +-------------+
    gives a column of 0s at the start. I thought of removing the base level from the variable list, but this didn't work:

    Code:
    . mata st_data(., "1.x 2.x")
           1   2
        +---------+
      1 |  0   0  |
      2 |  0   1  |
      3 |  0   0  |
      4 |  0   0  |
      5 |  0   1  |
      6 |  0   0  |
      7 |  0   0  |
      8 |  0   1  |
      9 |  0   0  |
     10 |  0   0  |
        +---------+
    You can see I still get a column of 0s, which I found startling. The only workaround I could come up with was something along the lines of

    Code:
    mata X = st_data(., "i.x")
    mata select(X, !(colsum(X :== 0) :== rows(X)))
    which seems very clunky. Is there a better way to do this? Incidentally, I find the parsing of factor variables into mata a bit odd. For instance, in my example:

    Code:
    . gen y = _n
    . mata st_data(., "1.x y 2.x")
            1    2    3
        +----------------+
      1 |   0    0    1  |
      2 |   0    1    2  |
      3 |   0    0    3  |
      4 |   0    0    4  |
      5 |   0    1    5  |
      6 |   0    0    6  |
      7 |   0    0    7  |
      8 |   0    1    8  |
      9 |   0    0    9  |
     10 |   0    0   10  |
        +----------------+
    So 1.x and 2.x are grouped together in a different order than the one requested.

  • #2
    Originally posted by Mauricio Caceres
    I want to read factor variables into mata and omit the base level from the resulting matrix. . . . Is there a better way to do this?
    Try something like the following.
    Code:
    X = st_data(., ("1bn.x", "2.x"))
    See below for an example.

    Incidentally, I find the parsing of factor variables into mata a bit odd. . . . 1.x and 2.x are grouped together in a different order than the one requested.
    I think that has to do with the fact that these factor-variable indicator variables are somewhat like phantoms: they're generated on the fly, and so they stay grouped together when referenced. This happens even in Stata itself when you refer to the factor variables in the abstract like that. See below ("Second question").

    .
    . version 17.0

    .
    . clear *

    .
    . quietly set obs 6

    .
    . generate byte x = mod(_n, 3)

    . list x 0.x 1.x 2.x, noobs separator(0)

      +------------------+
      |      0.   1.   2.|
      | x    x    x    x |
      |------------------|
      | 1    0    1    0 |
      | 2    0    0    1 |
      | 0    1    0    0 |
      | 1    0    1    0 |
      | 2    0    0    1 |
      | 0    1    0    0 |
      +------------------+

    .
    . local line_size `c(linesize)'

    . set linesize 80

    .
    . mata:
    ------------------------------------------------- mata (type end to exit) ------
    : X = st_data(., ("1bn.x", "2.x"))

    : X
           1   2
        +---------+
      1 |  1   0  |
      2 |  0   1  |
      3 |  0   0  |
      4 |  1   0  |
      5 |  0   1  |
      6 |  0   0  |
        +---------+

    : end
    --------------------------------------------------------------------------------

    .
    . set linesize `line_size'

    .
    . *
    . * Second question
    . *
    . generate byte y = _n

    . list x 0.x y 1.x 2.x, noobs separator(0)

      +----------------------+
      |      0.   1.   2.    |
      | x    x    x    x   y |
      |----------------------|
      | 1    0    1    0   1 |
      | 2    0    0    1   2 |
      | 0    1    0    0   3 |
      | 1    0    1    0   4 |
      | 2    0    0    1   5 |
      | 0    1    0    0   6 |
      +----------------------+

    .
    . exit

    end of do-file


    .


    I think that you'd have to resort to something like
    Code:
    stata("fvrevar i.x, stub(_)")
    and create them concretely in order to refer to them in any arbitrary order.
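
    If the ordering only matters inside Mata, another possibility is to read every level without a base and then pick columns by position. This is a minimal sketch that I have not run here. It assumes x takes the values 0, 1, and 2 as in the example, and that the indicator columns come back in level order; X and Z are just illustrative names.

    Code:
    mata:
    // One indicator column per level of x; "bn" requests that no base level be omitted
    X = st_data(., "ibn.x")
    // Assumed column order: 0.x, 1.x, 2.x. Keep 1.x and 2.x, with y placed between them
    Z = (X[., 2], st_data(., "y"), X[., 3])
    end

    Because the columns are rearranged after the data are already in Mata, the grouping that Stata applies to the factor-variable terms no longer matters.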



    • #3
      Joseph Coveney, thanks for the thorough reply! I didn't realize you could set a variable to have no base. I think doing this temporarily is the workaround to use. For my reference,

      Code:
      fvset base none x
      mata st_data(., "1.x 2.x")
      fvset clear x
