
  • Estimating Ordinal Reliability for test scores using Stata 12.

    Hi

    I currently use Stata 12.

    I have 25 test items assessing different components of clinical reasoning. Each item is scored 0-2: 0 = wrong diagnosis, 1 = partially correct, and 2 = correct.

    I'm tasked with assessing reliability for each facet of clinical reasoning. Because the items are scored in a ranked ordinal fashion, I cannot use Cronbach's alpha. According to a 2012 paper by Gadermann, Guhn & Zumbo, ordinal reliability coefficients can be computed instead. The process involves estimating a polychoric correlation matrix and then calculating reliability from that matrix. They illustrate how it can be done in R, but with respect to performing this procedure in Stata, they say, and I quote:

    "The procedures to obtain polychoric correlation matrices in (proprietary) software programs -- such as MPlus, SAS, Stata & Prelis/Lisrel involve more elaborate syntax (additional macros), and/or one would have to calculate ordinal reliability "by hand" once the polychoric correlation matrix is obtained."

    The article is dated January 2012, before Stata 12 came out. I'm wondering whether this is still the case: is computing polychoric correlations in Stata still complicated, and is there no way to obtain ordinal reliability coefficients other than by hand calculation? If that has changed, could someone please point me to any references that show how to compute reliability coefficients for the data structure that I have?

    Thank you! The article is attached for your convenience should you be interested in reading it....


    Katherine Picho

  • #2
    sorry here is the attachment

    • #3
      sorry here is the attachment
      Attached Files

      • #4
        You can easily get polychoric by running "findit polychoric" and installing it. Then the basic idea is straightforward enough: compute alpha from the polychoric correlations, not the Pearson correlations. "factormat" will get you the numbers you need to do the computations, which with a spreadsheet don't look too onerous. You have the loadings and the uniquenesses (from which you can get the communalities), and you are good to go.

        See http://www.stata.com/statalist/archi.../msg00696.html and see if you can figure out his code to do it directly. Or Dr. Coveney just might see this post and chime in.

        I'm not too good with manipulating matrices, but see the code below to get you to the "manipulate matrices or get out the spreadsheet" stage. I included an example spreadsheet.

        Code:
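        *=======prepwork: simulate 10 ordinal items driven by one latent factor
        clear
        set obs 200
        gen latent=rnormal()
        forvalues i=1/10 {
            gen x`i'=rnormal()+latent
            * round to integers so that each item is ordinal
            replace x`i'=round(x`i')
            }
        sum
        corr
        
        *=======all that matters for your purposes:
        * -polychoric- is user-written (findit polychoric); it leaves the matrix in r(R)
        polychoric x*
        matrix r = r(R)
        * one-factor model on the polychoric matrix; n() is the sample size
        factormat r, n(200) factors(1)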
        Attached Files
        Last edited by ben earnhart; 21 Nov 2014, 12:41. Reason: Attached example of hand computation
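
        A minimal sketch of computing alpha directly from the polychoric matrix, without the spreadsheet. It assumes the simulated x1-x10 from the code above and that -polychoric- has left a complete matrix (no missing entries) in r(R). It uses the standardized-alpha formula p*rbar/(1+(p-1)*rbar), where rbar is the mean off-diagonal correlation; that is an assumption about which alpha is wanted, and it is not necessarily the computation in the attached spreadsheet. The names R, one, S, rbar and ord_alpha are illustrative.

        ```stata
        * polychoric correlation matrix of the items (user-written -polychoric-)
        polychoric x*
        matrix R = r(R)
        * number of items
        local p = rowsof(R)
        * sum of all elements of R, computed as 1'R1
        matrix one = J(`p', 1, 1)
        matrix S = one' * R * one
        * mean off-diagonal correlation: drop the diagonal, divide by p(p-1)
        scalar rbar = (S[1,1] - trace(R)) / (`p' * (`p' - 1))
        * standardized alpha computed from the polychoric correlations
        scalar ord_alpha = `p' * rbar / (1 + (`p' - 1) * rbar)
        display "ordinal alpha = " %6.4f ord_alpha
        ```

        Running -alpha x*, std- on the same items gives the Pearson-based counterpart for comparison.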

        • #5
          findit polychoric gives you what you need.
          Download the commands "polychoric" and "polychoricpca" by Stas Kolenikov. They seem to do what you want.

          • #6
            Dear Ben and Paul

            thank you so much for your help on this. I wouldn't have otherwise known where to start.....

            sincerely,

            Katherine Picho

            • #7
              Hello Ben

              I have a quick question regarding your Excel sheet.

              I assume, based on your code, that n should be replaced with my sample n, correct?

              thanks!!
              Last edited by Katherine Picho; 25 Nov 2014, 11:44.

              • #8
                Yup, n is your n in the Stata code for -factormat-.

                However, the spreadsheet doesn't use n at all, so I want to make sure we're on the same page. The spreadsheet uses the loadings, the uniquenesses (to get communalities), and the # of items. There is no n in the formula.
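
                For reference, one form of alpha that uses exactly those ingredients is sketched below. That the spreadsheet follows this form is an assumption; the symbols are mine: p is the number of items, \bar{\lambda} the mean loading on the single factor, \bar{h^2} the mean communality and \bar{u^2} the mean uniqueness, with h_i^2 = 1 - u_i^2 for each item.

                ```latex
                \alpha = \frac{p}{p-1}
                         \left[ \frac{p\,\bar{\lambda}^{2} - \bar{h^{2}}}
                                     {p\,\bar{\lambda}^{2} + \bar{u^{2}}} \right],
                \qquad \bar{h^{2}} = 1 - \bar{u^{2}}
                ```

                As a check, if every item had the same loading \lambda and h^2 = \lambda^2, this reduces to p\lambda^2 / (1 + (p-1)\lambda^2), the usual standardized alpha with \lambda^2 as the common correlation. Sample size does not appear anywhere.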

                • #9
                  Thanks! I'm running into problems getting the polychoric matrix. My variables do have missing values, some more than others. Also, since the codes are 0, 1 or 2, some test items were done very well (i.e. almost everyone got a '2') while on others most people scored 1. As a result, I keep getting error messages to the tune of:

                  1. no variability in qtn17,
                  2. cant calculate numeric derivatives,
                  missing values encountered...
                  No variables defined.

                  3. Matrix R has missing values.

                  When I check the error code for #3, I get this message from Stata:
                  "matrix has missing values;
                  This return code is now infrequently used because, beginning
                  with version 8, Stata now permits missing values in matrices."
                  But if Stata permits missing values in matrices, what is the option to specify that? I don't know.

                  How do I address missing values or lack of variability so as to get the matrix? I'm pretty sure there has to be a way, because I can't imagine that other datasets never have missing values...

                  I tried the mi set command, but it did not work...
                  Last edited by Katherine Picho; 26 Nov 2014, 06:52.

                  • #10
                    Did you try the option "pw"? It will compute pairwise correlations (each correlation is based on all complete cases for the *pair*), instead of the default listwise deletion (a case missing on *anything* is thrown out entirely).

                    This is a bit of a desperation move, and has (at least) two downsides.

                    First, your n, and all statistical tests, have a questionable sample size, and since standard errors (and thus statistical tests) are derived from n, they're mucked up. I don't see this as a horrible problem in your situation, since you aren't doing anything that depends on correct standard errors. For the -factormat- stage, just try it with an average of the # of cases, and maybe do sensitivity by feeding it the max n, and the casewise n, and see what happens to the end result (Alpha). It shouldn't change much, in which case, you're OK.
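
                    A sketch of that sensitivity check, assuming items named x1-x10 as in the simulated example earlier in the thread. The three sample sizes are hypothetical placeholders for your casewise n, average pairwise n and maximum n.

                    ```stata
                    * pairwise polychoric correlations (pw is an option of the user-written -polychoric-)
                    polychoric x*, pw
                    matrix r = r(R)
                    * hypothetical sample sizes: casewise n, average pairwise n, maximum n
                    foreach n of numlist 150 175 200 {
                        display as text _n "factormat with n(`n')"
                        factormat r, n(`n') factors(1)
                    }
                    ```

                    Compare the loadings and uniquenesses across the three runs; those are what feed the alpha computation.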

                    The second downside is the possibility of a not-positive-definite matrix. You could get an impossible correlation matrix, where, given the pattern of correlations between several items, a given correlation is impossible. But again, the proof is in the pudding, and you just have to see how it reacts.
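
                    One way to see how it reacts is to inspect the matrix before handing it to -factormat-. This is a sketch under the assumption that the items are named x1-x10 and that r(R) comes back symmetric; V and lambda are illustrative names.

                    ```stata
                    polychoric x*, pw
                    matrix r = r(R)
                    * matmissing() returns 1 if any element of r is missing, 0 otherwise
                    display "any missing entries: " matmissing(r)
                    if matmissing(r) == 0 {
                        * eigenvalues of the symmetric matrix r, largest first
                        matrix symeigen V lambda = r
                        matrix list lambda
                        * the last eigenvalue is the smallest; <= 0 means r is not positive definite
                        display "smallest eigenvalue: " lambda[1, colsof(lambda)]
                    }
                    ```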

                    -mi set- is a first step in the direction of multiple imputation, but for that, you'll need to bring in additional outside variables, declare the imputation model(s), and generate multiple imputations before using them. And for that matter, I'm not sure it would work, since -factormat- works off of matrices, not cases. So I'm not sure mi is a good way to go.

                    Finally, since you seem to have multiple factors, you *might* try generating multiple matrices with just the relevant variables. E.g. you have one matrix with items 1-6, another with items 7-14, and another with items 15-25. This might have much of the advantage of pairwise deletion without risking non-positive-definite matrices. But again, these are all desperation moves.
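
                    A sketch of the one-matrix-per-factor idea. The groupings below are hypothetical and use the simulated x1-x10 names; replace them with the items that belong to each of your facets. The names facet1-facet3, r1-r3 and nmiss1-nmiss3 are illustrative.

                    ```stata
                    * hypothetical item groupings, one local macro per facet
                    local facet1 x1-x3
                    local facet2 x4-x6
                    local facet3 x7-x10
                    forvalues f = 1/3 {
                        * polychoric matrix for this facet's items only
                        polychoric `facet`f''
                        matrix r`f' = r(R)
                        * count the cases that are complete on this facet's items
                        egen byte nmiss`f' = rowmiss(`facet`f'')
                        quietly count if nmiss`f' == 0
                        * one-factor model per facet, using that facet's complete-case n
                        factormat r`f', n(`r(N)') factors(1)
                    }
                    ```

                    Each facet then loses only the cases that are incomplete on its own items, rather than cases missing on any of the 25.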

                    • #11
                      I realise we are 18 months on from the OP, but I wonder if anyone has found an acceptable solution to this problem, as I am experiencing the same issue.

                      Using Stata/IC 12, I am trying to calculate ordinal reliability alpha from data collected on a new 17-item scale, in 30 participants. Scores on each item range from 0-3, e.g.: 0 = never, 1 = occasionally, 2 = often, 3 = very often.

                      My data has no missing values. For most of the 17 items, the full range of answers is present in the sample (i.e. each of 0, 1, 2 and 3 was chosen by at least one person), but for 5 of the 17 items, only 3 of the 4 possible responses have been given by my sample.

                      I have used the code above:

                      polychoric x1-x17
                      matrix r = r(R)
                      factormat r, n(30) factors(1)

                      After the polychoric command, the following error message appears (although the polychoric matrix is generated nonetheless):

                        could not calculate numerical derivatives
                        missing values encountered

                        numerical derivatives are approximate
                        nearby values are missing

                      And then the factormat command does not generate any output, only the following error message:

                        matrix r has missing values
                        r(504);
                      Searching the error code doesn't help: "This return code is now infrequently used because, beginning with version 8, Stata now permits missing values in matrices."



                      It seems that the polychoric matrix is not saving correctly, but I am not sure what is causing this problem. I thought it might be related to the different range of responses to different questions, but running a test on some made-up data with the full range for every item still leads to the same problem.

                      Can anyone suggest a solution to this?

                      • #12
                        The above has formatted in a strange way. For clarity, the first warning message when computing the polychoric correlation matrix is:

                        could not calculate numerical derivatives
                        missing values encountered

                        numerical derivatives are approximate
                        nearby values are missing

                        And the message for the factor analysis is:
                        matrix r has missing values
                        r(504)
