  • Translating multilevel modeling in SAS to Stata

    Hello,

    I have hierarchical data with subjects nested within groups. The same subjects participated in seven events, and those seven events were grouped into three phases. We measured the outcome of interest before and after each event, so we have pre and post data for each event.

    Therefore, my understanding of the data structure is:
    groups >> subjects >> phases >> events >> pre/post

    Our main research question is whether the outcome differs between pre and post, by phase.

    Here is my SAS code first.
    Code:
        proc mixed data=DATASET;
            class GROUP SUBJECTID PHASE EVENTNO PREPOST;
            model OUTCOME=
                PHASE|PREPOST GROUP / solution ddfm=satterthwaite;
            random intercept / subject=SUBJECTID;
            random intercept / subject=PHASE(SUBJECTID);
            random intercept / subject=EVENTNO(PHASE);
            repeated PREPOST / subject=EVENTNO(PHASE*SUBJECTID) type=un;
        run;
    And here is my Stata code.
    Code:
    mixed OUTCOME PHASE##PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, residuals(unstr, t(PREPOST)) variance
    I get similar but different results, and I am trying to figure out why.
    Is there a problem with my code? Or are there fundamental differences between SAS proc mixed and Stata mixed?
    Or perhaps my model is wrong?

    I would really appreciate any comments.
    Thanks.


  • #2
    Originally posted by SeungYong Han
    I get similar but different results
    Try adding something like the following.
    Code:
    mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
        reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))
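
    The reasoning behind these two options: proc mixed fits by REML by default, whereas mixed fits by maximum likelihood by default, and the SAS code requests ddfm=satterthwaite. A minimal sketch for seeing how much of the discrepancy the estimation method alone accounts for, assuming the variable names from the first post (fit_ml and fit_reml are names made up for the illustration):

    ```stata
    * Sketch only; uses the variable names from the first post.
    * Fit 1: maximum likelihood, which is mixed's default (stated explicitly here)
    mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
        mle residuals(unstructured, t(PREPOST))
    estimates store fit_ml

    * Fit 2: REML, matching proc mixed's default estimation method
    mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
        reml residuals(unstructured, t(PREPOST))
    estimates store fit_reml

    * Side-by-side coefficients and standard errors
    estimates table fit_ml fit_reml, se
    ```

    The fixed-effects rows are the ones to compare against the SAS solution table; the remaining rows are variance parameters on transformed scales.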



    • #3
      Thank you. It makes sense, so I tried the code, but Stata gives me an error message as it computes df: AKLMAWIB not found. AKLMAWIB is one of the subject IDs, and I am not sure what this tells me. Any thoughts?



      • #4
        Originally posted by SeungYong Han
        AKLMAWIB is one of the subject IDs. Any thoughts?
        Code:
        encode SUBJECTID, generate(sid)
        mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || sid: || PHASE: || EVENTNO:, ///
            reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))



        • #5
          On further thought, mixed can handle string IDs without any problem under circumstances like yours. See below for a demonstration.

          Is it possible that your dataset is corrupted in some way, for example, that there are hidden or invisible characters contaminating some of the values of the string participant ID variable? Or maybe SAS padded the variable with spaces in order to keep the string length constant.

          Try checking that the values of your participant ID variable all have the same length, and that seemingly identical participant IDs do not end up on separate lines after a data manipulation like
          Code:
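          * Fails if any ID has leading, trailing, or repeated internal blanks
          assert SUBJECTID == strtrim(stritrim(SUBJECTID))
          preserve    // contract replaces the data in memory
          contract SUBJECTID
          list
          restore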
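
          A further check along the same lines, sketched on toy data rather than on your dataset: it flags IDs containing anything other than letters and digits (an assumption about what a valid ID looks like, based on the one ID quoted above) and counts distinct IDs before and after trimming. The variables id_len, odd_char, id_trim, g_raw and g_trim are names made up for the illustration.

          ```stata
          clear
          input str12 SUBJECTID
          "AKLMAWIB"
          "AKLMAWIB"
          "BQRTZXCV"
          end
          * Simulate padding: append one trailing blank to the second ID
          replace SUBJECTID = SUBJECTID + " " in 2

          * Length of each ID; a clean ID variable should show a single length
          generate int id_len = strlen(SUBJECTID)
          tabulate id_len

          * Flag IDs with any character other than A-Z, a-z, 0-9
          generate byte odd_char = regexm(SUBJECTID, "[^A-Za-z0-9]")
          list SUBJECTID id_len odd_char if odd_char, noobs

          * Count distinct IDs before and after trimming blanks
          generate str12 id_trim = strtrim(SUBJECTID)
          egen long g_raw  = group(SUBJECTID)
          egen long g_trim = group(id_trim)
          summarize g_raw g_trim
          ```

          A larger maximum for g_raw than for g_trim means that some IDs differ only by padding, which would split one participant into several clusters.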

          Demonstration that mixed is not fazed by string ID variables under these circumstances:

          .
          . version 15.1

          .
          . clear *

          .
          . set seed `=strreverse("1503034")'

          .
          . quietly set obs 100

          . generate str pid = char(runiformint(65, 90))

          . forvalues i = 1/7 {
            2.         quietly replace pid = pid + char(runiformint(65, 90))
            3. }

          . isid pid

          . generate double pid_u = rnormal()

          .
          . generate byte grp = mod(_n, 2)

          .
          . quietly expand 7

          . bysort pid: generate byte evt = _n

          . generate byte pha = 1

          . foreach evt in 3 5 {
            2.         quietly replace pha = pha + 1 if evt >= `evt'
            3. }

          .
          . quietly expand 2

          . bysort pid evt: generate byte pos = _n

          .
          . generate double out = pid_u + rnormal()

          .
          . mixed out i.pha##i.pos i.grp || pid: || pha: || evt: , ///
          >         reml dfmethod(satterthwaite) residuals(unstructured, t(pos)) ///
          >                 nolrtest nogroup nolog

          Mixed-effects REML regression                   Number of obs     =      1,400
          DF method: Satterthwaite                        DF:           min =      98.00
                                                                        avg =   1,004.35
                                                                        max =   2,099.19

                                                          F(6,   277.38)    =       0.34
          Log restricted-likelihood = -2146.2934          Prob > F          =     0.9167

          ------------------------------------------------------------------------------
                   out |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   pha |
                    2  |  -.0220462   .0995924    -0.22   0.825    -.2177456    .1736533
                    3  |   -.069624   .0912376    -0.76   0.446    -.2489877    .1097397
                       |
                 2.pos |  -.1238876   .1039754    -1.19   0.234    -.3277933     .080018
                       |
               pha#pos |
                  2 2  |     .11845   .1470435     0.81   0.421    -.1699161    .4068162
                  3 2  |   .1164273   .1342317     0.87   0.386    -.1468462    .3797007
                       |
                 1.grp |  -.0261598   .2318876    -0.11   0.910    -.4863333    .4340136
                 _cons |   .0626194   .1764536     0.35   0.723    -.2864543    .4116931
          ------------------------------------------------------------------------------

          ------------------------------------------------------------------------------
            Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
          -----------------------------+------------------------------------------------
          pid: Identity                |
                            var(_cons) |   1.273358   .1922916      .9471293    1.711952
          -----------------------------+------------------------------------------------
          pha: Identity                |
                            var(_cons) |   .0176309   .0255024      .0010353    .3002621
          -----------------------------+------------------------------------------------
          evt: Identity                |
                            var(_cons) |   .0956039   7.072564      1.02e-64    8.92e+61
          -----------------------------+------------------------------------------------
          Residual: Unstructured       |
                               var(e1) |   .8609985   7.072762      8.76e-08     8458373
                               var(e2) |    .938074   7.072785      3.58e-07     2454885
                            cov(e1,e2) |  -.1815527   7.072655      -14.0437     13.6806
          ------------------------------------------------------------------------------

          .
          . exit

          end of do-file


          .



          • #6
            Thank you so much for your help! I tried everything, and here is a summary.

            1) It doesn't seem like the string ID variable (SUBJECTID) has any problem.
            2) But, encoding the ID variable solved the problem.
            3) Now, Stata fails to estimate the SE and 95% CI of EVENTNO.
            4) When I run the model with the original string ID (SUBJECTID) and without "reml" and "dfmethod(satterthwaite)", Stata successfully computes everything, including the SE and 95% CI of EVENTNO.
            5) When I run the model with the original string ID (SUBJECTID) and with only the "reml" option, Stata gives me numbers except for the SE and 95% CI of EVENTNO.

            This is very strange to me. Perhaps it is some kind of Stata issue related to dfmethod(satterthwaite)?
            I am not sure.
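
            One possible explanation for items 3 to 5, offered as a conjecture rather than a diagnosis: in the Stata model, EVENTNO is nested within PHASE within subject, so each EVENTNO cluster holds only the pre and post observations. The EVENTNO intercept variance then enters the within-cluster covariance only as a constant added to every element of the unstructured 2 x 2 residual matrix, so the two cannot be separated. The demonstration in #5 shows the same symptom: nearly identical, very large standard errors for the evt variance and for the residual terms. A sketch for checking this, assuming the encoded sid from #4 (ll_with and ll_without are names made up for the illustration): if dropping the EVENTNO intercept leaves the log restricted-likelihood essentially unchanged, that intercept was not identified.

            ```stata
            * Sketch only; variable names are those used earlier in the thread.
            * Model as posted: random intercept at the EVENTNO level
            mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || sid: || PHASE: || EVENTNO:, ///
                reml residuals(unstructured, t(PREPOST))
            scalar ll_with = e(ll)

            * Same model, EVENTNO level kept only to define the residual blocks
            mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || sid: || PHASE: || EVENTNO:, ///
                noconstant reml residuals(unstructured, t(PREPOST))
            scalar ll_without = e(ll)

            display "with EVENTNO intercept:    " ll_with
            display "without EVENTNO intercept: " ll_without
            ```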



            • #7
              So SAS is able to provide everything, including standard errors and Wald confidence intervals for the EVENTNO random effect? Is there any chance that you could attach the dataset to a post? (Put the dataset into a .zip file, and then rename the file extension from .zip to .txt before attaching it to the post using the paper-and-paperclip icon at the upper right, just next to the blue, underlined, upper-case letter A.)



              • #8
                Unfortunately, I cannot share the data set with anyone. Actually, SAS only gives me the estimate, not SE or CI, by default. Let me play with SAS and Stata a bit more and post some results if possible soon. Thanks!
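
                One structural difference that may be worth checking while comparing the two programs: in the SAS code, subject=EVENTNO(PHASE) does not mention SUBJECTID, so that random intercept is shared by all subjects at the same event, whereas || EVENTNO: at the end of the Stata specification defines a separate intercept for each event within each phase within each subject. An untested sketch of a Stata specification closer to the SAS one, assuming EVENTNO takes seven distinct values across the three phases and that sid is the encoded ID from #4; the _all: R.EVENTNO term is one reading of how to express the crossed effect, not something confirmed against the data:

                ```stata
                * Untested sketch; assumes EVENTNO is numbered uniquely across phases.
                * _all: R.EVENTNO       -> one intercept per event, shared across subjects
                * sid: and PHASE:       -> subject and phase-within-subject intercepts
                * EVENTNO:, noconstant  -> no intercept here; the level only defines the
                *                          pre/post blocks for the unstructured residuals
                mixed OUTCOME i.PHASE##i.PREPOST i.GROUP ///
                    || _all: R.EVENTNO || sid: || PHASE: || EVENTNO:, noconstant ///
                    reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))
                ```

                If the two programs then agree more closely, the nesting of EVENTNO was a likely source of the discrepancy.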
