Translating multilevel modeling in SAS to Stata

SeungYong Han

Join Date: Jul 2015

Posts: 53
#1

12 Jun 2019, 12:44

Hello,

I have hierarchical data with subjects nested within groups. The same subjects participated in seven events, and those seven events were grouped into 3 phases. We measured the outcome of interest before and after each event, so we have pre and post data for each event.

Therefore, my understanding of the data structure is:
groups >> subjects >> phases >> events >> pre/post
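
A minimal sketch of how this nesting could be checked in Stata before fitting anything. It assumes the variable names used in the code further down (GROUP, SUBJECTID, PHASE, EVENTNO, PREPOST) and one row per measurement; the checks themselves are illustrative assumptions, not part of the original analysis.

```stata
* Each subject should belong to exactly one group
bysort SUBJECTID (GROUP): assert GROUP[1] == GROUP[_N]

* Each event should fall in exactly one phase
bysort EVENTNO (PHASE): assert PHASE[1] == PHASE[_N]

* At most one row per subject, event, and pre/post measurement
isid SUBJECTID EVENTNO PREPOST
```

If any of these fails, the levels are not nested the way the diagram above assumes, and the random-effects specification would need to be revisited.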

Our main research interest is whether the outcome differs statistically between pre and post, by phase.

Here is my SAS code first.

Code:

proc mixed data=DATASET;
    class GROUP SUBJECTID PHASE EVENTNO PREPOST;
    model OUTCOME = PHASE|PREPOST GROUP / solution ddfm=satterthwaite;
    random intercept / subject=SUBJECTID;
    random intercept / subject=PHASE(SUBJECTID);
    random intercept / subject=EVENTNO(PHASE);
    repeated PREPOST / subject=EVENTNO(PHASE*SUBJECTID) type=un;
run;

And my Stata code.

Code:

mixed OUTCOME PHASE##PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, residuals(unstr, t(PREPOST)) variance

I get similar but not identical results, and I am trying to figure out why.
Is there a problem with my code? Or are there fundamental differences between SAS proc mixed and Stata mixed?
Or perhaps my model is wrong?

I would really appreciate any comments.
Thanks.
Joseph Coveney

Join Date: Apr 2014

Posts: 4399
#2

12 Jun 2019, 17:14

Originally posted by SeungYong Han

I get similar but different results

Try adding something like the following.

Code:

mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
    reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))
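
For context, one likely source of the discrepancy is the estimation method: PROC MIXED fits by REML unless told otherwise, whereas Stata's mixed defaults to maximum likelihood. The sketch below shows how the two fits could be compared side by side. It assumes the variable names from the original post; the stored-estimate names fit_ml and fit_reml are made up for illustration.

```stata
* Stata's default for -mixed-: maximum likelihood, z tests
mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
    residuals(unstructured, t(PREPOST))
estimates store fit_ml

* REML with Satterthwaite degrees of freedom, as in the SAS call
mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || SUBJECTID: || PHASE: || EVENTNO:, ///
    reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))
estimates store fit_reml

* Coefficients and standard errors from both fits in one table
estimates table fit_ml fit_reml, se
```

If the REML column is closer to the SAS output than the ML column, the default estimator accounts for at least part of the difference.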
SeungYong Han

Join Date: Jul 2015

Posts: 53
#3

13 Jun 2019, 07:42

Thank you. It makes sense, so I tried the code, but Stata gives me an error message as it computes df: AKLMAWIB not found. AKLMAWIB is one of the subject IDs, and I am not sure what this tells me. Any thoughts?
Joseph Coveney

Join Date: Apr 2014

Posts: 4399
#4

13 Jun 2019, 16:54

Originally posted by SeungYong Han

AKLMAWIB is one of the subject IDs. Any thoughts?

Code:

encode SUBJECTID, generate(sid)
mixed OUTCOME i.PHASE##i.PREPOST i.GROUP || sid: || PHASE: || EVENTNO:, ///
    reml dfmethod(satterthwaite) residuals(unstructured, t(PREPOST))
Joseph Coveney

Join Date: Apr 2014

Posts: 4399
#5

13 Jun 2019, 18:17

On further thought, mixed can handle string IDs without any problem in circumstances like yours. See below for a demonstration.

Is it possible that your dataset is corrupted in some way, for example, that hidden or invisible characters are contaminating some of the values of the string participant ID variable? Or maybe SAS padded the variable with spaces in order to keep the string length constant.

Try checking that the values of your participant ID variable all have the same length, and that seemingly identical participant IDs don't end up on separate lines after a data manipulation like the following.

Code:

assert SUBJECTID == strtrim(stritrim(SUBJECTID))
contract SUBJECTID
list
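
A few further checks along the same lines, as a rough sketch. They assume SUBJECTID is a string variable whose valid values contain only letters and digits (the one ID quoted in this thread, AKLMAWIB, fits that pattern); id_len is a made-up helper variable.

```stata
* Distribution of ID lengths in bytes; stray or padded values show up as outliers
generate int id_len = strlen(SUBJECTID)
tabulate id_len

* IDs with leading or trailing blanks
count if SUBJECTID != strtrim(SUBJECTID)

* IDs containing anything other than letters and digits (assumed pattern)
count if ustrregexm(SUBJECTID, "[^A-Za-z0-9]")
list SUBJECTID if ustrregexm(SUBJECTID, "[^A-Za-z0-9]")
```

Because strlen() counts bytes rather than characters, an invisible multibyte character will inflate the length even when the ID looks normal on screen.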

Demonstration that mixed is not fazed by string ID variables under these circumstances:

.
. version 15.1

.
. clear *

.
. set seed `=strreverse("1503034")'

.
. quietly set obs 100

. generate str pid = char(runiformint(65, 90))

. forvalues i = 1/7 {
  2.         quietly replace pid = pid + char(runiformint(65, 90))
  3. }

. isid pid

. generate double pid_u = rnormal()

.
. generate byte grp = mod(_n, 2)

.
. quietly expand 7

. bysort pid: generate byte evt = _n

. generate byte pha = 1

. foreach evt in 3 5 {
  2.         quietly replace pha = pha + 1 if evt >= `evt'
  3. }

.
. quietly expand 2

. bysort pid evt: generate byte pos = _n

.
. generate double out = pid_u + rnormal()

.
. mixed out i.pha##i.pos i.grp || pid: || pha: || evt: , ///
>         reml dfmethod(satterthwaite) residuals(unstructured, t(pos)) ///
>                 nolrtest nogroup nolog

Mixed-effects REML regression                   Number of obs     =      1,400
DF method: Satterthwaite                        DF:           min =      98.00
                                                              avg =   1,004.35
                                                              max =   2,099.19

                                                F(6,   277.38)    =       0.34
Log restricted-likelihood = -2146.2934          Prob > F          =     0.9167

------------------------------------------------------------------------------
         out |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pha |
          2  |  -.0220462   .0995924    -0.22   0.825    -.2177456    .1736533
          3  |   -.069624   .0912376    -0.76   0.446    -.2489877    .1097397
             |
       2.pos |  -.1238876   .1039754    -1.19   0.234    -.3277933     .080018
             |
     pha#pos |
        2 2  |     .11845   .1470435     0.81   0.421    -.1699161    .4068162
        3 2  |   .1164273   .1342317     0.87   0.386    -.1468462    .3797007
             |
       1.grp |  -.0261598   .2318876    -0.11   0.910    -.4863333    .4340136
       _cons |   .0626194   .1764536     0.35   0.723    -.2864543    .4116931
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
pid: Identity                |
                  var(_cons) |   1.273358   .1922916      .9471293    1.711952
-----------------------------+------------------------------------------------
pha: Identity                |
                  var(_cons) |   .0176309   .0255024      .0010353    .3002621
-----------------------------+------------------------------------------------
evt: Identity                |
                  var(_cons) |   .0956039   7.072564      1.02e-64    8.92e+61
-----------------------------+------------------------------------------------
Residual: Unstructured       |
                     var(e1) |   .8609985   7.072762      8.76e-08     8458373
                     var(e2) |    .938074   7.072785      3.58e-07     2454885
                  cov(e1,e2) |  -.1815527   7.072655      -14.0437     13.6806
------------------------------------------------------------------------------

.
. exit

end of do-file

.
SeungYong Han

Join Date: Jul 2015

Posts: 53
#6

14 Jun 2019, 07:55

Thank you so much for your help! I tried everything, and here is a summary.

1) The string ID variable (SUBJECTID) does not seem to have any problem.
2) Even so, encoding the ID variable solved the problem.
3) Now Stata fails to estimate the SE and 95% CI for EVENTNO.
4) When I run the model with the original string ID (SUBJECTID) and without "reml" and "dfmethod(satterthwaite)", Stata successfully computes everything, including the SE and 95% CI for EVENTNO.
5) When I run the model with the original string ID (SUBJECTID) and only the "reml" option, Stata gives me numbers for everything except the SE and 95% CI for EVENTNO.

This is very strange to me. Perhaps there is some kind of Stata coding issue with dfmethod(satterthwaite)?
I am not sure.
Joseph Coveney

Join Date: Apr 2014

Posts: 4399
#7

14 Jun 2019, 08:05

So SAS is able to provide everything, including standard errors and Wald confidence intervals for the EVENTNO random effect? Is there any chance that you could attach the dataset to a post? (Put the dataset into a .zip file, and then rename the file extension from .zip to .txt before attaching it to the post using the paper-and-paperclip icon at the upper right, just next to the blue, underlined, upper-case letter A.)
SeungYong Han

Join Date: Jul 2015

Posts: 53
#8

14 Jun 2019, 08:10

Unfortunately, I cannot share the data set with anyone. Actually, SAS only gives me the estimate, not SE or CI, by default. Let me play with SAS and Stata a bit more and post some results if possible soon. Thanks!