  • Sequentially eliminating row&column from correlation matrix

    Hi everybody:

    I can't find a way of creating a sub correlation matrix from a bigger one: I need to eliminate a row (and the corresponding column), in order to create a smaller correlation matrix that excludes specifically one variable.

    A bit of context: using code that Joseph Coveney posted some time ago to compute ordinal alphas (Zumbo et al. Journal of Modern Applied Statistical Methods 2007), I wrote a tiny rclass command called ordalpha that reads a list of variable names, computes the polychoric correlation matrix, computes ordinal alpha and stores its value in r(ordalpha) (just to be able to get a 95% CI using bootstrap). Now I have been asked to also compute the alpha coefficient for each item when that item is removed from the scale. Since the scale I am studying has 32 items, I would have to write 32 statements like this:

    ordalpha das2-das32
    ordalpha das1 das3-das32
    ordalpha das1-das2 das4-das32

    And so on until

    ordalpha das1-das31

    The dataset has many cases, and the time needed to compute the polychoric matrix each time is several minutes (Stata 12.1 SE 64 bits). I have been trying to find a way to use a loop that at each run creates a correlation matrix that eliminates one row (and the corresponding column) from the original one and computes ordinal alpha without that variable. This way, the bottleneck of the process (computing the polychoric correlation matrix) would be done just once (instead of 32 times).

    I have downloaded -matselrc-, but I can't figure out how to use it to eliminate one row/column, since it works the other way around (it creates a matrix with the rows/columns indicated by the user).

    Thanks a lot in advance,
    Marta GG

  • #2
    Hi Marta,

    One mechanism to do so would be using select() in Mata. For example, assume the full polychoric correlation matrix was named C (already computed and in memory), and the matrix (less 1 row and column) is called subC. So for 32 different items:

    Code:
    forvalues i = 1/32 {
        mata: sel = J(1, 32, 1)
        mata: sel[1, `i'] = 0
        mata: st_matrix("subC", select(select(st_matrix("C"), sel), sel'))
        * ...code for creating alpha on "subC" here...
    }
    The above code pulls the matrix C into Mata, takes out a single row and column (the same row and column), and puts it back in Stata. It's then ready to use to compute the ordinal alpha.

    - joe
    Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
    ----
    Research Fellow
    Fors Marsh

    ----
    Version 18.0 MP
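
    A small illustration of the double select() used above, as a sketch on a made-up 3 x 3 matrix (X and its values are assumptions for illustration only, not data from this thread). With a row vector, select() keeps columns; with a column vector, it keeps rows, so applying both drops the same row and column:

    Code:
    mata:
    X = (1, 2, 3 \ 4, 5, 6 \ 7, 8, 9)
    sel = (1, 0, 1)                  // 0 marks the item to drop
    select(X, sel)                   // row vector: keeps columns 1 and 3
    select(X, sel')                  // column vector: keeps rows 1 and 3
    select(select(X, sel), sel')     // drops row 2 and column 2
    end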



    • #3
      Hi Joe:

      I tried your idea, but I get the following error message:

      name conflict: row and column names of subC should match
      r(198);



      I guess I will have to study the Mata manual in detail.

      Thanks.
      Marta



      • #4
        Hi Marta,

        As the error notes, the problem lies somewhere in a naming conflict and should not have been caused by Mata (in fact subC should not have any names associated with it unless they are explicitly named in Stata). The only time I have seen an error like that is when ereturn post-ing a sampling variance-covariance matrix where the row and column names are not aligned (as they're supposed to be).

        That is, of course, my guess, and it's hard to know without more detail on how the error arose, but, again, it is unlikely to stem from the row and column removal directly. Have you looked to see whether the subC matrix looks as it should (i.e., it should be "matrix list"-able and consist of matrix C with its first row and column removed)?

        - joe



        • #5
          Hi Joe:

          This is what I tried:

          Code:
          . qui polychoric das1-das5
          . matrix define C=r(R)
          . matrix list C
          
          symmetric C[5,5]
                     das1       das2       das3       das4       das5
          das1          1
          das2  .42024951          1
          das3  .24509033  .32364968          1
          das4  .36344476  .52974183  .29672763          1
          das5  .33015234  .50295552  .31500762  .43689823          1
          
          * Eliminating first row&column as test
          
          . mata: sel = J(1, 5, 1)
          . mata: sel[1, 1] = 0
          . mata: st_matrix("subC", select(select(st_matrix("C"), sel), sel'))
          . matrix list subC
          
          symmetric subC[4,4]
                     c1         c2         c3         c4
          r1          1
          r2  .32364968          1
          r3  .52974183  .29672763          1
          r4  .50295552  .31500762  .43689823          1
          
          . factormat subC, n(915) factors(1)
          name conflict: row and column names of subC should match
          r(198);
          It looks like I need to change the names of the rows and columns of subC before passing it to -factormat-.

          I'll work on that.

          Thanks a lot for your time and interest.
          Marta



          • #6
            SOLVED!

            Code:
            . local reqs: rownames subC
            . matrix colnames subC= `reqs'
            . factormat subC, n(915) factors(1)
            Now, to put everything inside the loop, and the loop inside the rclass -ordalpha-.

            Thanks a lot!
            Marta
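
            For readers following along, here is one possible sketch of the whole loop. It is not the code that ended up in -ordalpha-. It assumes the user-written -polychoric- command is installed and reuses the variable range and sample size from the test above. As an alternative to copying the names with macros, it carries the variable names over inside Mata with st_matrixrowstripe() and st_matrixcolstripe(), so that -factormat- sees matching row and column names:

            Code:
            quietly polychoric das1-das5
            matrix C = r(R)                 // computed once, outside the loop
            local p = rowsof(C)
            forvalues i = 1/`p' {
                mata: sel = J(1, `p', 1)
                mata: sel[1, `i'] = 0
                mata: st_matrix("subC", select(select(st_matrix("C"), sel), sel'))
                * copy the names of the kept variables from C to subC
                mata: st_matrixrowstripe("subC", select(st_matrixrowstripe("C"), sel'))
                mata: st_matrixcolstripe("subC", select(st_matrixcolstripe("C"), sel'))
                factormat subC, n(915) factors(1)
                * ...compute ordinal alpha from the factor results here...
            }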



            • #7
              Just one last question concerning formatting the output.

              Right now, all works OK:

              Code:
              . ordalpha das1-das4
              
              Sample size = 915
              Ordinal alpha = 0.6671
              
              Ordinal alpha if item removed:
              item1          0.5998
              item2          0.4994
              item3          0.6524
              item4          0.5362
              I have extracted the names of all the variables involved using:

              Code:
              . local items: rownames C
              I would like to use that list to replace the generic "item1" "item2"... names with the real variable names ("das1" "das2"... in this case), to make the output look like this:

              Code:
              . ordalpha das1-das4
              
              Sample size = 915
              Ordinal alpha = 0.6671
              
              Ordinal alpha if item removed:
              das1          0.5998
              das2          0.4994
              das3          0.6524
              das4          0.5362
              How can I use that list inside a forvalues i=1/`p' loop? Something like "items(`i')"... (of course, that doesn't work)

              Thanks a lot,
              Marta
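
              One way to pick the i-th name out of a macro list inside a forvalues loop is the "word # of" extended macro function. The sketch below is an illustration only: the matrix name A is an assumption, and the list and values are typed in by hand from the output above rather than taken from C.

              Code:
              local items das1 das2 das3 das4
              matrix A = (0.5998 \ 0.4994 \ 0.6524 \ 0.5362)   // hypothetical matrix of alphas
              local p : word count `items'
              forvalues i = 1/`p' {
                  local name : word `i' of `items'             // i-th variable name
                  display as text "`name'" _col(15) as result %6.4f A[`i', 1]
              }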



              • #8
                The question wasn't detailed enough, sorry.

                I finally found the solution (assigning each alpha if item removed to a matrix, naming the rows using `items')

                Anyone interested in having ordalpha.ado, just ask for it.

                Marta
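
                A minimal sketch of the matrix approach described above, for anyone who wants to reproduce it. The matrix name A, the column name and the hand-typed values are assumptions for illustration, not the actual contents of ordalpha.ado:

                Code:
                local items das1 das2 das3 das4
                matrix A = (0.5998 \ 0.4994 \ 0.6524 \ 0.5362)   // one alpha per removed item
                matrix rownames A = `items'
                matrix colnames A = alpha
                matrix list A, format(%6.4f) noheader

                Inside the command, `items' would come from the row names of C, as in "local items: rownames C".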
