
  • Is it possible to do a repeated-measures ANOVA without a case identifier?

    Hi all,

    I am currently using a Qualtrics platform survey to measure the perceived social consensus across four different scenarios (i.e. What percentage of people do you believe find X to be acceptable behavior). Participants then use a slider bar to indicate anywhere between 0% (No one finds the behavior acceptable) and 100% (everyone finds this behavior acceptable). I would like to use a repeated measures ANOVA to test for a significant difference across the four individual scenarios.

    While I do have a unique identifier for each participant, it is listed only once rather than four times. So in the Data Editor, rather than having a column with the unique identifier repeated four times, a column identifying the scenario, and a column with the score for each scenario (Image 1), I have a column with the identifier listed once and the four scores as separate variables, each in its own column (Image 2).

    Rather than manually moving the data around in the Data Editor or Excel, entering each identifier three additional times, and then creating a variable to identify each scenario score, is there a way Stata can do an RMANOVA?

    Obviously I can do the time-consuming alternative, but I was hoping for a quicker fix.

    Best wishes,
    William
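
    The restructuring described in this post does not have to be done by hand. The following is a minimal sketch, assuming hypothetical variable names (pid for the participant identifier, scen1 to scen4 for the four slider scores) and made-up values; it is not the original poster's data. It restructures a wide layout into a long one and then fits a repeated-measures ANOVA.

    ```stata
    // Toy wide-format data: one row per participant, one column per scenario.
    // Variable names and values are hypothetical stand-ins.
    clear
    input pid scen1 scen2 scen3 scen4
    1 62 40 75 55
    2 30 15 50 25
    3 80 60 90 70
    4 45 30 70 40
    5 66 44 72 58
    end

    // Wide to long: one row per participant-scenario pair.
    // The identifier is repeated automatically; no manual copying is needed.
    reshape long scen, i(pid) j(scenario)
    rename scen rating

    // Repeated-measures ANOVA with participant as the blocking factor.
    anova rating pid scenario, repeated(scenario)
    ```

    With real data, only the stub (scen) and the identifier name would need to change, provided the four score variables share a common prefix followed by a number.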

  • #2
    Not sure I understand your question completely, but I sense that you have data in what is called 'long' format and you want to bring it into 'wide' format -- have a look at the -reshape- command, which does just that.



    • #3
      -search rmanova- reveals that it is a user-written program dating to 1998. While it may be perfectly fine, if I were you, I'd use a built-in routine supplied with a contemporary version of Stata. -search repeated measures- reveals that the built-in -anova- command offers a -repeated- option. This command, too, requires a -reshape- to the wide format. I would also think that, with an appropriate choice of options (beyond my ken), the built-in -mixed- or -xtreg- might give you the same results, but those would work on long format data, such as you have. (Long format is almost always preferred in Stata.)
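
      As a rough illustration of the -mixed- route mentioned above, here is a sketch on made-up long-format data with hypothetical variable names (pid, scenario, rating). The random-intercept specification is an assumption about what "an appropriate choice of options" might look like, not a claim that it reproduces the ANOVA results exactly.

      ```stata
      // Toy data, entered wide and restructured to long (hypothetical values).
      clear
      input pid scen1 scen2 scen3 scen4
      1 62 40 75 55
      2 30 15 50 25
      3 80 60 90 70
      4 45 30 70 40
      5 66 44 72 58
      end
      reshape long scen, i(pid) j(scenario)
      rename scen rating

      // Random intercept for participant, fixed effect for scenario, REML estimation.
      mixed rating i.scenario || pid:, reml

      // Joint Wald test of the three scenario indicators.
      testparm i.scenario
      ```

      The joint test reported by -testparm- is a chi-squared test with 3 degrees of freedom, so with few participants its p-value need not match the F test from -anova-.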



      • #4
        Originally posted by Klaus Steitzel
        Not sure I understand your question completely, but I sense that you have data in what is called 'long' format and you want to bring it into 'wide' format -- have a look at the -reshape- command, which does just that.
        I read it the other way: I think the data are currently in wide format, and William wants to restructure to LONG. In both cases, -reshape- is what is needed!

        I have another question though: Is there a way to estimate a -fracreg- type model that takes into account the correlated nature of the repeated measures (e.g., by using generalized estimating equations)? If so, I've not yet found it with my very quick and preliminary searches.

        Cheers,
        Bruce
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)



        • #5
          Yes, I misspoke/misunderstood, per Bruce Weaver's comment. It sounds like William *does* have the wide format already. However, if the built-in -anova- with the repeated option is satisfactory, the wide format should work, so no -reshape- would be needed.



          • #6
            Hi Mike. The first example in this FAQ shows -anova- with repeated option needing the data in long format.

            I still think a repeated measures version of -fracreg- (or -betareg- if 0 < Y < 1) would be preferable to ANOVA. But I don't know if those models can be estimated via Stata.


            • #7
              Hi Klaus Steitzel, Mike Lacy, and Bruce Weaver,

              Thank you so much for your assistance. I do believe the -reshape- command is what will be needed. I will check whether the built-in -anova- command allows the wide format before reshaping. Then I will look into what you have suggested, Bruce Weaver. Either way, I will update with progress.

              Thank you once again!

              Best wishes,
              William



              • #8
                The built-in -anova, .... repeated- *does* want the wide format. Go to -help anova-, and then click on the highlighted "Remarks and Examples," and then go to the highlighted "Repeated-Measures Anova," which describes an example in wide format.



                • #9
                  Mike, I'm not finding the example you refer to. What I find in Example 10 for -manova- shows a wide file format for -manova-, but a long file format for doing the same analysis with -anova- and repeated. Here's a modified version of Example 10 with some extra bits thrown in. And FWIW, the -manova- approach seems very clunky to me!


                  Code:
                  // Example 10:  MANOVA with repeated-measures data from
                  // https://www.stata.com/manuals/mvmanova.pdf
                  clear *
                  use http://www.stata-press.com/data/r15/nobetween
                  list
                  // manova must be tricked into fitting a constant-only model.  
                  // To do this, you generate a variable equal to one, use that variable
                  // as the single term in your manova, and then specify the
                  // noconstant option.  
                  
                  generate mycons = 1
                  manova test1 test2 test3 = mycons, noconstant
                  // The test produced directly with manova is not interesting.
                  // It is testing the hypothesis that the three test score means are zero.
                  // The test produced by manovatest (see below) is of interest.  
                  // From the contrasts in the matrix c, you produce a test that there
                  // is a difference between the test1, test2, and test3 scores.
                  mat c = (1,0,-1\0,1,-1)
                  manovatest mycons, ytransform(c)
                  
                  // Compare that to -anova- with repeated option on the same data.
                  // But note that -anova- needs a long file format.
                  reshape long test, i(subject) j(testnum)
                  anova test subject testnum, repeated(testnum)
                  // None of the p-values from -anova- with repeated option match exactly
                  // the p-value from -manovatest-, but they are close.


                  • #10
                    In #4, I wrote:
                    I have another question though: Is there a way to estimate a -fracreg- type model that takes into account the correlated nature of the repeated measures (e.g., by using generalized estimating equations)?
                    After having that question on the back burner all day, I wonder if -xtgee- with logit link and binomial error distribution would be appropriate. Does that work when the outcome is a proportion? Here's an example:

                    Code:
                    // Use data from first example at
                    // https://www.stata.com/support/faqs/statistics/repeated-measures-anova/#ex1rep
                    clear *
                    use http://www.stata-press.com/data/r14/t43
                    
                    // Generate a new proportional outcome variable = score / 100
                    generate prop = score/100
                    
                    tabdisp person drug, cellvar(prop)
                    list, sepby(person)
                    // The data are in long format
                    tabstat prop, statistics( mean ) by(drug)
                    anova prop person drug, repeated(drug)
                    
                    // Now use -xtgee- with logit link and binomial error distribution
                    xtgee prop i.drug, i(person) corr(exch) family(binomial) link(logit)
                    xtgee, eform
                    contrast drug
                    
                    display _newline ///
                    "Drug 1 odds = " .264/(1-.264) _newline ///
                    "Drug 2 odds = " .256/(1-.256) _newline ///
                    "Drug 3 odds = " .156/(1-.156) _newline ///
                    "Drug 4 odds = " .320/(1-.320) _newline
                    
                    display _newline ///
                    "Drug 2 OR = ".34408602/.35869565 _newline ///
                    "Drug 3 OR = ".18483412/.35869565 _newline ///
                    "Drug 4 OR = ".47058824/.35869565
                    What I find curious about this example is that the drug effect has a very low p-value (p < 0.0001) from the ANOVA versus p = 0.5963 from -xtgee-. Off the top of my head, I would not have expected such a large difference.


                    • #11
                      Bruce Weaver is absolutely right here, and I'm clearly not at my sharpest today. The repeated-measures examples in the manual entry for ANOVA *do* presume long format. The example I was looking at was on p. 47 of the r.pdf Reference Manual. The display of the data in the example was in wide format, but this is not so in the actual data set.



                      • #12
                        Originally posted by Bruce Weaver
                        Is there a way to estimate a -fracreg- type model that takes into account the correlated nature of the repeated measure
                        You can use the vce(cluster participant_id) option.

                        But with just a handful of participants like what the OP appears to have, I'd stick with ANOVA.

                        In a pilot simulation (see below) with the OP's displayed number of participants and repeated measurements, the classical arcsine square root transformation for ANOVA has the best performance both in test size and power.
                        Code:
                        version 15.1
                        
                        clear *
                        
                        set seed `=strreverse("1481201")'
                        
                        program define simem, rclass
                            version 15.1
                            syntax , [Delta(real 0)]
                        
                            drop _all
                            set obs 6
                            generate byte pid = _n
                            generate double u = rnormal()
                        
                            expand 4
                            bysort pid: generate byte scn = _n - 1
                        
                            generate double psc = rnormal(cond(scn, 0, `delta'), 1)
                            replace psc = normal(u + psc)
                        
                            tempname dtp asi fri
                        
                            // Damn-the-torpedoes ANOVA (untransformed proportions)
                            anova psc pid scn
                            testparm i.scn
                            scalar define `dtp' = r(p)
                        
                            // Classical transformation ANOVA
                            generate double aps = 2 * asin(sqrt(psc))
                            anova aps pid scn
                            testparm i.scn
                            scalar define `asi' = r(p)
                        
                            // Friedman's test
                            emh psc scn, anova strata(pid) transformation(rank)
                            scalar define `fri' = r(p)
                        
                            // -fracreg-
                            fracreg probit psc i.scn, vce(cluster pid)
                            testparm i.scn
                            return scalar fra = r(p)
                            return scalar dtp = `dtp'
                            return scalar asi = `asi'
                            return scalar fri = `fri'
                        end
                        
                        foreach delta in 0 1 {
                            display in smcl as text _newline(1) "`=cond(!`delta', "Test size", "Power")'"
                            quietly simulate dtp = r(dtp) asi = r(asi) fri = r(fri) fra = r(fra), ///
                                reps(1000) nodots: simem , d(`delta')
                            foreach var of varlist _all {
                                generate byte p_`var' = `var' < 0.05
                            }
                            summarize p_*
                        }
                        
                        exit
                        Method                      Test Size   Power
                        Untransformed ANOVA         0.045       0.267
                        Arcsine Square Root ANOVA   0.051       0.303
                        Friedman's Test             0.036       0.225
                        fracreg , cluster()         0.360       So what

                        Friedman's test is run with the user-written command emh, which is available from SSC. Power was evaluated at a single alternative hypothesis: a shift of 1 SD on the probit scale in the first measurement versus the three others.

                        For the angular transformation, the OP will need to decide an increment to add or deduct from scores where participants left the slider at zero or slammed it all the way over. It shouldn't be difficult, because the software is capable of finite detection, and that specification should be available in documentation from the vendor.
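
                        One way the boundary adjustment might look is sketched below on made-up data. The variable names and the choice of half a slider step (here assumed to be 1 percentage point, so 0.005 on the proportion scale) are assumptions for illustration only; the actual step size would have to come from the vendor's documentation.

                        ```stata
                        // Toy wide data including responses at both ends of the slider (hypothetical values).
                        clear
                        input pid scen1 scen2 scen3 scen4
                        1  0 40  75 55
                        2 30 15 100 25
                        3 80 60  90 70
                        4 45 30  70 40
                        5 66 44  72 58
                        end
                        reshape long scen, i(pid) j(scenario)
                        generate double prop = scen/100

                        // Assumed slider resolution: 1 percentage point = 0.01 as a proportion.
                        local step = 0.01

                        // Pull boundary responses inward by half a step (an illustrative choice).
                        replace prop = `step'/2     if prop == 0
                        replace prop = 1 - `step'/2 if prop == 1

                        // Angular (arcsine square root) transformation, then the ANOVA.
                        generate double aps = 2 * asin(sqrt(prop))
                        anova aps pid scenario, repeated(scenario)
                        ```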



                        • #13
                          Thanks Joseph. I wonder if you would consider the logit transformation in place of the arcsine square root. Some authors have argued that it's time to put the latter to bed. E.g.,
                          Warton, D. I. and Hui, F. K. (2011), The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92: 3-10. doi:10.1890/10-0340.1
                          When I replaced 2 * asin(sqrt(psc)) with logit(psc) in your code, Test Size for that one was .061 and Power was .313.

                          Cheers,
                          Bruce
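
                          For readers who want to try the logit swap outside the simulation, here is a minimal sketch on made-up data with hypothetical variable names. Note that Stata's logit() function returns missing at exactly 0 and 1, so responses at either end of the slider would need some adjustment first; the toy values below avoid the boundaries.

                          ```stata
                          // Toy wide data with no responses at 0 or 100 (hypothetical values).
                          clear
                          input pid scen1 scen2 scen3 scen4
                          1 62 40 75 55
                          2 30 15 50 25
                          3 80 60 90 70
                          4 45 30 70 40
                          5 66 44 72 58
                          end
                          reshape long scen, i(pid) j(scenario)
                          generate double prop = scen/100

                          // Logit transformation in place of 2 * asin(sqrt(prop)).
                          generate double lps = logit(prop)
                          anova lps pid scenario, repeated(scenario)
                          ```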


                          • #14
                            Originally posted by Joseph Coveney
                            You can use the vce(cluster participant_id) option.
                            Using the data I employed earlier, I get the same results with the following two commands:
                            Code:
                            fracreg logit prop i.drug, vce(cluster person)
                            xtgee prop i.drug, i(person) corr(exch) family(binomial) link(logit) vce(robust)
                            Notice that unlike my earlier attempt at using -xtgee- (see #10), I used the vce(robust) option this time.

                            Here's the complete example, for anyone who is interested.

                            Code:
                            // Use data from first example at
                            // https://www.stata.com/support/faqs/statistics/repeated-measures-anova/#ex1rep
                            clear *
                            use http://www.stata-press.com/data/r14/t43
                            // Generate a new proportional outcome variable = score / 100
                            generate prop = score/100
                            // [1] -fracreg- with vce(cluster person), first probit then logit
                            fracreg probit prop i.drug, vce(cluster person)
                            fracreg logit prop i.drug, vce(cluster person)
                            fracreg, or
                            // Now use -xtgee- with logit link and binomial error distribution;
                            // also use vce(robust)
                            xtgee prop i.drug, i(person) corr(exch) family(binomial) link(logit) vce(robust)
                            // Results from -xtgee- match those from -fracreg- with logit
                            xtgee, eform
                            And here are the tables of coefficients from the two models:

                            Code:
                            . fracreg logit prop i.drug, vce(cluster person)
                            
                            --- snip some output ---
                            
                                                             (Std. Err. adjusted for 5 clusters in person)
                            ------------------------------------------------------------------------------
                                         |               Robust
                                    prop |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    drug |
                                      2  |  -.0415826   .0823389    -0.51   0.614    -.2029638    .1197987
                                      3  |  -.6630155   .1100167    -6.03   0.000    -.8786441   -.4473868
                                      4  |   .2715092   .0493864     5.50   0.000     .1747138    .3683047
                                         |
                                   _cons |  -1.025281   .2017037    -5.08   0.000    -1.420613    -.629949
                            ------------------------------------------------------------------------------
                            
                            . xtgee prop i.drug, i(person) corr(exch) family(binomial) link(logit) vce(robust)
                            
                            --- snip some output ---
                                                             (Std. Err. adjusted for clustering on person)
                            ------------------------------------------------------------------------------
                                         |               Robust
                                    prop |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    drug |
                                      2  |  -.0415826   .0823389    -0.51   0.614    -.2029638    .1197987
                                      3  |  -.6630155   .1100167    -6.03   0.000    -.8786441   -.4473868
                                      4  |   .2715092   .0493864     5.50   0.000     .1747138    .3683047
                                         |
                                   _cons |  -1.025281   .2017037    -5.08   0.000    -1.420613    -.629949
                            ------------------------------------------------------------------------------


                            • #15
                              Originally posted by Bruce Weaver
                              I wonder if you would consider the logit transformation in place of the arcsine square root. Test Size for that one was .061 and Power was .313.
                              It seems as if all it did was increase the Type I error rate by one percentage point (managed to catch myself this time) in both columns of the table.

                              I don't have access to the article you cite, sorry.
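
                              When comparing rejection rates like these, it may help to keep the Monte Carlo error in mind. The sketch below assumes 1,000 replications (as in the simulation in #12) and a true rejection rate equal to the nominal 0.05; it uses the ordinary binomial standard error and is only a back-of-the-envelope check.

                              ```stata
                              // Monte Carlo standard error of an estimated rejection rate.
                              local reps  = 1000   // replications, as in the simulation above
                              local alpha = 0.05   // assumed true rejection rate

                              scalar mcse  = sqrt(`alpha' * (1 - `alpha') / `reps')
                              scalar mc_lo = `alpha' - 1.96 * mcse
                              scalar mc_hi = `alpha' + 1.96 * mcse

                              display "MC standard error: " %6.4f mcse
                              display "Approximate 95% band: " %6.4f mc_lo " to " %6.4f mc_hi
                              ```

                              The band runs from roughly 0.036 to 0.064, so estimated sizes of 0.045, 0.051, and 0.061 are all compatible with a nominal 5% test at this number of replications, whereas 0.360 clearly is not.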
