
  • Is there a test to calculate the effect size for matched-pairs discrete variables?

    My question is in the title. I have two variables, ai_mh_Y2 and ai_mh_Y4. The ai_mh_Y2 variable asked "Are you willing to use artificial intelligence for mental health? (Yes/No)" in Wave 2 of a longitudinal survey. The ai_mh_Y4 variable is the follow-up, the same question asked in Wave 4. Is there a test to measure the effect size for the matched samples? I know of McNemar's test of marginal homogeneity, but I don't think it's equivalent to calculating an effect size. Any advice would be appreciated. I have included the frequency distributions of both variables and their joint frequency distribution below. I ran the tabulations with the svy: prefix, but if a test is available it doesn't have to account for the survey weights.


    Code:
    
    . svy: tabulate ai_mh_Y2, obs percent format(%14.3gc)
    (running tabulate on estimation sample)
    
    Number of strata =   1                            Number of obs   =        460
    Number of PSUs   = 460                            Population size = 466.561799
                                                      Design df       =        459
    
    ----------------------------------
    Would you |
    be        |
    willing   |
    to use    |
    artificia |
    l         |
    intellige |
    nce to    |
    help with |
    your      |
    mental    |
    hea       | percentage         obs
    ----------+-----------------------
         0.No |       89.8         403
        1.Yes |       10.2          57
              | 
        Total |        100         460
    ----------------------------------
    Key: percentage = Cell percentage
                obs = Number of observations
    
    . 
    . svy: tabulate ai_mh_Y4, obs percent format(%14.3gc)
    (running tabulate on estimation sample)
    
    Number of strata =   1                            Number of obs   =        460
    Number of PSUs   = 460                            Population size = 466.217099
                                                      Design df       =        459
    
    ----------------------------------
    Would you |
    be        |
    willing   |
    to use    |
    artificia |
    l         |
    intellige |
    nce to    |
    help with |
    your      |
    mental    |
    hea       | percentage         obs
    ----------+-----------------------
         0.No |       90.9         411
        1.Yes |       9.12          49
              | 
        Total |        100         460
    ----------------------------------
    Key: percentage = Cell percentage
                obs = Number of observations
    
    . 
    . 
    . svy: tabulate ai_mh_Y2 ai_mh_Y4, obs percent format(%14.3gc) 
    (running tabulate on estimation sample)
    
    Number of strata =   1                            Number of obs   =        350
    Number of PSUs   = 350                            Population size = 351.287799
                                                      Design df       =        349
    
    -------------------------------
    Would you |
    be        |
    willing   |
    to use    |
    artificia |
    l         |
    intellige |
    nce to    |Would you be willing
    help with |  to use artificial 
    your      |intelligence to help
    mental    |with your mental hea
    hea       |  0.No  1.Yes  Total
    ----------+--------------------
         0.No |  90.1   3.37   93.5
              |   304     13    317
              | 
        1.Yes |  3.97   2.54   6.52
              |    14     19     33
              | 
        Total |  94.1   5.91    100
              |   318     32    350
    -------------------------------
    Key: Cell percentage
         Number of observations
    
      Pearson:
        Uncorrected   chi2(1)         =   48.0193
        Design-based  F(1, 349)       =   41.7622     P = 0.0000
    Last edited by Luis Mijares Castaneda; 21 Mar 2026, 14:03.

  • #2
    There are many definitions of "effect size"; which one are you referring to here? Many are available via various kinds of regression models (yes, even for your situation), but without knowing what you are asking about, specific advice is hard to provide.



    • #3
      I'm interested in seeing whether there was a matched case version of Cramer's V



      • #4
        So some form of measure of association/correlation between the two matched variables is what I'm interested in.



        • #5
          Would you recommend a repeated-measures ANOVA?



          • #6
            Originally posted by Luis Mijares Castaneda
            I have two variables . . ., the ai_mh_Y2 variable . . . (Yes/No)", in Wave 2 of a longitudinal survey. The ai_mh_Y4 variable is the follow-up. . . (Yes/No)" in Wave 4. . . . I surveyed the variables, but if a test is available it doesn't have to account for the survey weights.
            Originally posted by Luis Mijares Castaneda
            I'm interested in seeing whether there was a matched case version of Cramer's V
            I might not understand what you're after, but if you're willing to forgo accounting for the survey weights, then wouldn't the 2 × 2 tabulation give you what you want?
            Code:
            tabulate ai_mh_Y2 ai_mh_Y4, V
            Otherwise, if you want to (i) accommodate the survey design, (ii) get some measure of the strength of association between a pair of binary variables and (iii) have the measure lie between -1 and +1 (à la Cramér's V), then how about fitting a probit regression model and back-transforming the coefficient (Fisher)?
            Code:
            webuse nhanes2f
            svy: probit highbp i.sex
            nlcom tanh(_b[2.sex]), df(`e(df_r)')
            It is a naive approach, but it ticks all of your boxes.
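
            As a rough illustration of the unweighted route, the cell counts posted in #1 can be fed to Stata's immediate commands without the dataset. This is only a sketch: it ignores the survey weights, and the counts (304, 13, 14, 19) are the observation counts read off the joint tabulation in #1, with Wave 2 in rows and Wave 4 in columns. tabi with the V option reports Cramér's V for the 2 × 2 table. mcci reports McNemar's test along with the paired difference in proportions and the matched-pairs odds ratio, either of which might serve as an effect size for the change between waves.

            ```stata
            * Unweighted 2 x 2 table from the counts in #1
            * (rows: Wave 2 No/Yes, columns: Wave 4 No/Yes)
            tabi 304 13 \ 14 19, V

            * McNemar's test on the same pairs, treating "Yes" as the exposure:
            * a = Yes/Yes, b = Yes at Wave 2 only, c = Yes at Wave 4 only, d = No/No
            mcci 19 14 13 304
            ```

            Because the weights are ignored, these figures will not match the design-based results from svy: tabulate.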



            • #7
              Originally posted by Joseph Coveney
              . . . it ticks all of your boxes.
              I'd forgotten that gsem allows the svy: prefix; you can compute the tetrachoric correlation coefficient as an alternative that will also satisfy your implied requirements.
              Code:
              webuse nhanes2f
              generate byte fem = sex == 2
              svy: gsem ///
                  (highbp@1 <- F1, probit) ///
                  (fem@1 <- F2, probit), ///
                      variance(F1@1 F2@1) covariance(F1*F2) ///
                      nocnsreport nodvheader nolog
              lincom _b[/:cov(F1, F2)] / 2
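
              For a quick unweighted cross-check of the gsem estimate, Stata also has a dedicated tetrachoric command. A minimal sketch, assuming ai_mh_Y2 and ai_mh_Y4 from #1 are coded 0/1 (as their value labels suggest) and ignoring the survey design:

              ```stata
              * Unweighted tetrachoric correlation between the two waves
              * (no svy adjustment; assumes both variables are coded 0/1)
              tetrachoric ai_mh_Y2 ai_mh_Y4
              ```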

