
  • Calculating uniform and non-uniform DIF for polytomous data

    Dear Stata-list,

    I am hoping to learn how to perform a uniform and non-uniform DIF analysis using polytomous (1-5 scale) questionnaire data.

    I am currently using the dropdown menu Statistics >> IRT >> DIF option, but Stata seems to be unable to calculate because my item responses range from 1 to 5:

    Code:
     difmh q15 q16 q28 q29 q30 q31, group(_age75)
    variable q15 has invalid values;
     requires item variables be coded 0, 1, or missing
    r(198);
    Any advice would be greatly appreciated.

    William

  • #2
    There are two ways to do it.

    I don't exactly know what the difmh command is doing. If you and your advisors have sound theoretical reasons to be dead set on that method, then hopefully someone else will respond.

    However, the diflogistic command for binary data takes the sum score of your items as an estimate of theta, and it fits a series of logistic regressions using each item as the dependent variable and either theta alone, theta plus the focal group indicator, or c.theta##i.focalgroup (for no DIF, uniform DIF, and non-uniform DIF, respectively). Why? The rationale is that if our IRT model fits, then the probability of endorsing each item (or the probability of endorsing any particular category of an item) should depend only on the value of the latent trait, theta. If that probability varies by group, we have DIF!

    You can home-brew diflogistic for ordinal items, as detailed in the last few posts on this thread. You don't even need Stata 16 to do this; you could have done it as far back as Stata 13. Based on some articles and an R package I found, the change in pseudo-R^2 among the models (conceptually like Cohen's f^2 measure of effect size) can be used to estimate effect size. I have an enormous number of people in my dissertation data, so under the usual likelihood ratio tests everything is statistically significant, even if I correct conservatively for multiple testing; I outlined this issue in those posts. You can also write a loop over all your items, as outlined.
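
    If it helps, the looped version of that home-brew approach might look like the sketch below. It is not tested on your data: it assumes the item list and group variable from post #1, and uses the raw sum score as the stand-in for theta.

    Code:
    * Sketch only: items and focal-group variable taken from post #1.
    * The sum score proxies for theta.
    egen sumscore = rowtotal(q15 q16 q28 q29 q30 q31)
    foreach v of varlist q15 q16 q28 q29 q30 q31 {
        quietly ologit `v' sumscore
        estimates store `v'_base
        quietly ologit `v' sumscore i._age75
        estimates store `v'_unif
        quietly ologit `v' c.sumscore##i._age75
        estimates store `v'_nonunif
        lrtest `v'_base `v'_unif
        lrtest `v'_unif `v'_nonunif
    }

    The base-versus-uniform test picks up uniform DIF, and the uniform-versus-non-uniform test picks up non-uniform DIF, mirroring what diflogistic does for binary items.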

    The other way is to use the new functionality in irt that enables us to fit multi-group models. In this scheme, you fit a multi-group model (focal vs reference group), but you initially constrain all the item parameters to be equal between the two groups (meaning that the mean theta and Var(theta) for the non-reference group are freely estimated). Then, for each item, you release the difficulty parameter, then the difficulty and discrimination parameters, and you LR test the three resulting models. This is more computationally intensive, but if you only have 6 items, that isn't fatal unless you have a great many people.

    The syntax in the PDF manual gives examples for binary items. For ordinal items, it would look more like this, adapting your syntax:

    Code:
    irt grm q15 q16 q28-q31, group(_age75)
    estimates store invariant
    
    irt (grm q16 q28-q31) (0: grm q15, cns(a@k)) (1: grm q15, cns(a@k)), group(_age75)
    estimates store uniform_q15
    lrtest invariant uniform_q15
    Let's talk about what this is doing. The prefixes 0: and 1: tell Stata that the model for q15 is fit separately for each level of the group variable (_age75; I'm assuming you coded it as 0 and 1; change the values if not). The option cns(a@k) tells Stata to constrain the discrimination parameter (frequently referred to as the a parameter in the IRT literature) at k, which is a symbolic constant. You could give it any valid name, as long as both groups get constrained at the same symbolic constant. You now have a model where there's uniform DIF in q15: the difficulty parameters may differ by group, but the discrimination may not.

    Then, you LR test the model with uniform DIF in the first question versus the model where there's no DIF.

    The model for non-uniform DIF in Q15 is a lot simpler to write:

    Code:
    irt (grm q16 q28-q31) (0: grm q15) (1: grm q15), group(_age75)
    estimates store nonuniform_q15
    lrtest uniform_q15 nonuniform_q15
    Why the last LR test? I'm not sure what standard practice is, actually. However, the diflogistic command only tests invariant versus uniform, and then uniform versus non-uniform. If the latter test rejects, then by definition non-uniform versus invariant has to reject also, so I'm not sure a separate non-uniform versus invariant test is required. If you're on Stata 15, I'm not sure how to do this, because the group() option doesn't exist in that version's irt commands (it does exist in the underlying gsem command).

    Anyway, you are fortunate that you only have six items, because if you go this route, you'll have to manually write out models where each item gets the uniform and non-uniform DIF treatment. The standard procedure I learned is to vary only one item at a time, and constrain all the rest. I think many observers will recommend correcting for multiple tests, and it is known that the Bonferroni correction is overly conservative (although it's very simple to do). The Benjamini-Hochberg correction is described in a post on the thread I linked, and is less conservative than Bonferroni.
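
    For concreteness, the same treatment for the next item, q16, might look like the sketch below, adapting the syntax above (k2 is just another symbolic constant name; nothing here is tested on your data):

    Code:
    * Uniform DIF in q16: difficulty free by group, discrimination constrained.
    irt (grm q15 q28-q31) (0: grm q16, cns(a@k2)) (1: grm q16, cns(a@k2)), group(_age75)
    estimates store uniform_q16
    lrtest invariant uniform_q16

    * Non-uniform DIF in q16: both parameters free by group.
    irt (grm q15 q28-q31) (0: grm q16) (1: grm q16), group(_age75)
    estimates store nonuniform_q16
    lrtest uniform_q16 nonuniform_q16

    You would repeat this pattern once per item, always leaving the other five items fully constrained.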

    Do note that these are two different models for detecting DIF. I am not sure if one is theoretically preferred over the other all the time or in some situations. They may not always produce equivalent results. I am not up with the full state of the theoretical literature in IRT on this issue.
    Last edited by Weiwen Ng; 22 Apr 2020, 17:04.
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Dear Weiwen,

      Thanks so much for such a comprehensive answer.

      I understand that the individual item score should only depend on the latent trait, and that if there is a difference by demographic subgroup (i.e. _age75), then we have DIF!
      And I understand your diflogistic explanation above, but I get a bit lost in your thorough explanation of DIF for ordinal items, which I assume is primarily this:

      "
      In another class of methods, you can take advantage of the fact that if you have a correctly-fitting IRT model, the probability of a positive/correct response or the probability of responding in a higher category should only depend on the value of the latent trait. If you fit a logistic or ordered logistic model treating each question as the dependent variable and an estimate of ability as the independent variable, you know you should not see differences by group. If you do observe differences by group, you know you have DIF. Forgive the lack of proper equation formatting, but if you fit a logistic or ordered logistic model to any one question indexed by i, you know that model 1 below will be true:

      P(Y_i = 1) = invlogit{tau_0 + tau_1 * theta^hat} (model 1)

      If there's uniform DIF, you would instead see that this model fits better than the above. Here, the log odds of endorsing (a higher category of) each item would differ by a constant amount for the focal group.

      P(Y_i = 1) = invlogit{tau_0 + tau_1 * theta^hat + tau_2 * group} (model 2)

      And, if there was non-uniform DIF, you'd see that this model fits the best:

      P(Y_i = 1) = invlogit{tau_0 + tau_1 * theta^hat + tau_2 * group + tau_3 * group * theta^hat} (model 3)"


      ....

      Code:
       diflogistic q?, group(female)

       egen sum = rowtotal(q?)
       quietly logit q1 sum
       est store q1_base
       quietly logit q1 sum i.female
       est store q1_unif
       quietly logit q1 c.sum##i.female
       est store q1_nonunif
       lrtest q1_nonunif q1_unif
       lrtest q1_unif q1_base



      If you have time to describe how I could apply your explanation above to writing code for my questions, that would be so helpful.
      (I have two factors: factor 1, loading items 15-16 and 28-31, and factor 2, a different set of 15 items.)

      Thanks so much again Weiwen. William



      • #4
        Dear Weiwen,

        In particular, when I try to run my code:

        Code:
        irt grm q15 q16 q28 q29 q30 q31
        
        regress q15 _age75
        
        egen sum15=rowtotal(q15)
        
        quietly regress q15 sum15
        est store q15_base
        quietly regress q15 sum15 _age75
        est store q15_unif
        quietly regress q15 c.sum15##_age75
        est store q15_nonunif
        
        lrtest q15_nonunif q15_unif
        
        lrtest q15_unif q15_base
        I get the following error messages:

        Code:
        q15_nonunif does not contain scalar e(ll)
        r(498);
        q15_unif does not contain scalar e(ll)
        r(498);

        I'm (trying to use) a linear model because of the number of options in our Likert scale, which I know isn't necessarily appropriate. But when I try with the "logistic" command, I get the following error message:

        Code:
        outcome does not vary; remember:
                                          0 = negative outcome,
                all other nonmissing values = positive outcome
        r(2000);
        William



        • #5
          Originally posted by William Mitchell View Post
          Dear Weiwen,

          In particular, when I try to run my code:

          Code:
          irt grm q15 q16 q28 q29 q30 q31
          
          regress q15 _age75
          
          egen sum15=rowtotal(q15)
          
          quietly regress q15 sum15
          est store q15_base
          quietly regress q15 sum15 _age75
          est store q15_unif
          quietly regress q15 c.sum15##_age75
          est store q15_nonunif
          
          lrtest q15_nonunif q15_unif
          
          lrtest q15_unif q15_base
          I get the following error messages:

          Code:
          q15_nonunif does not contain scalar e(ll)
          r(498);
          q15_unif does not contain scalar e(ll)
          r(498);

          I'm (trying to use) a linear model because of the number of options in our Likert scale, which I know isn't necessarily appropriate. But when I try with the "logistic" command, I get the following error message:

          Code:
          outcome does not vary; remember:
          0 = negative outcome,
          all other nonmissing values = positive outcome
          r(2000);
          William
          Logistic regression is for binary outcomes, and ordinal logistic regression is for ordered categorical outcomes. With ordinal items, you would use ordinal logistic regression, i.e. the ologit command. As the error message alluded to, the logistic command treats 0 as negative, and every other response as positive; hence, when you feed an item into logistic, it thinks that all the outcomes are positive (I remember that you have your items coded 1 through 5 from your other post).

          If your items really are coded 1 through 5, that's only 5 response categories, which I don't think is too many to handle. However, I just remembered: you should tabulate the responses to each of your items. Do you have enough of a sample responding in each category? If too few people respond in some categories, you could have estimation issues with ologit. If, and only if, you do, you should probably collapse the sparse categories into an adjacent category. In any case, linear regression would definitely be the wrong model for this; it assumes that the distances between adjacent categories are identical, and that is not true for ordinal items.
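
          To do that tabulation, a quick sketch (item names and group variable as used earlier in the thread) would show the per-category counts by group:

          Code:
          foreach v of varlist q15 q16 q28 q29 q30 q31 {
              tabulate `v' _age75, missing
          }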

          As to coding, post #8 in the link did show some suggested code, although you have to manually copy and paste all the output to Excel. Post #9 showed some additional code that should automate the process. It does seem like your question in post #4 supersedes post #3. Do you still have an issue coding things?



          • #6
            Dear Weiwen,

            Thanks so much for your reply;

            My code now reads as follows:
            Code:
            irt grm q15 
            
            ologit q15 _age75
            egen sum15=rowtotal(q15)
            
            quietly ologit q15 sum15
            est store q15_base
            quietly ologit q15 sum15 _age75
            est store q15_unif
            quietly ologit q15 c.sum15##_age75
            est store q15_nonunif
            
            lrtest q15_nonunif q15_unif
            lrtest q15_unif q15_base
            However, I'm getting the following error messages:
            Code:
            . quietly ologit q15 sum15
            convergence not achieved
            convergence not achieved
            r(430);
            Code:
            . quietly ologit q15 sum15 _age75
            convergence not achieved
            convergence not achieved
            r(430);
            As an aside, I will check the sample sizes in each category. There were 900 respondents, and <10% missingness... but the more extreme option (i.e. option 5) did sometimes have fewer than 20 (but always >10) "selections"... so this may be too small ..

            William



            • #7
              Originally posted by William Mitchell View Post
              Dear Weiwen,

              Thanks so much for your reply;

              My code now reads as follows:
              Code:
              irt grm q15
              
              ologit q15 _age75
              egen sum15=rowtotal(q15)
              
              quietly ologit q15 sum15
              est store q15_base
              quietly ologit q15 sum15 _age75
              est store q15_unif
              quietly ologit q15 c.sum15##_age75
              est store q15_nonunif
              
              lrtest q15_nonunif q15_unif
              lrtest q15_unif q15_base
              However, I'm getting the following error messages:
              Code:
              . quietly ologit q15 sum15
              convergence not achieved
              convergence not achieved
              r(430);
              Code:
              . quietly ologit q15 sum15 _age75
              convergence not achieved
              convergence not achieved
              r(430);
              As an aside, I will check the sample sizes in each category. There were 900 respondents, and <10% missingness... but the more extreme option (i.e. option 5) did sometimes have fewer than 20 (but always >10) "selections"... so this may be too small ..

              William
              You can delete the quietly prefix to see whether any of the output is informative. One likely culprit: your egen sum15 = rowtotal(q15) line sums only q15, so sum15 is identical to q15 and you are regressing q15 on itself, which will not converge; the sum score should be computed over all the items in the factor (e.g. rowtotal(q15 q16 q28 q29 q30 q31)). That said, depending on the number of cases, I'd also think about collapsing category 5 into 4 for the items where you have that few cases.

              If anybody is reading this and is wondering how they would do this, here is a suggestion. The recode command has an option to generate new variables rather than overwriting the underlying data: you can feed recode a list of variables and have it prefix the recoded copies with something:

              Code:
              recode q15 q16 (5 = 4), prefix(rec)
