Stata's mixed command - adjusting for confounding variables

amanda watson

Join Date: Oct 2022

Posts: 10
#1

Stata's mixed command - adjusting for confounding variables

24 Oct 2022, 00:24

Hello,
I am trying to run a linear mixed model to test the within-subject difference in activity behaviours (e.g., sleep, sedentary time) between two timepoints (school vs. holidays), and I want to adjust for relevant confounding variables (i.e., sex, SES, and BMI).
When I run the following command with no adjustment (mixed genesed || wave: || schoolid: || id:), I get the same result for sedentary time (genesed) as I do when I include the confounding variables (mixed genesed i.time_new z_sep_composite t1_bmi_zscore t1_ageatanthropassessment i.sex_new || wave: || schoolid: || id:). What am I doing wrong?
Joseph Coveney

Join Date: Apr 2014

Posts: 4406
#2

24 Oct 2022, 04:11

Originally posted by amanda watson

I am trying to . . . to test the within-subject difference in activity . . . between 2 different timepoints . . .
[W]ith no adjustment: (mixed genesed || wave: || schoolid: || id: ) I get the same result for sedentary time (genesed) as I do when I include the confounding variables

I don't understand what you mean by "get the same results".

That model's fixed effects equation is empty, it fits just an intercept (the mean of sedentary time), and so it doesn't test for a difference in sedentary time within students between time points, does it?
amanda watson

Join Date: Oct 2022

Posts: 10
#3

24 Oct 2022, 04:29

Ah, my mistake. That should have read: mixed genesed i.time_new || wave: || schoolid: || id:

By "get the same result" I mean the coefficient for genesed doesn't change in the adjusted model. It remains exactly the same as it was in the unadjusted model. I just would have expected the coefficient to change in the adjusted model.

Last edited by amanda watson; 24 Oct 2022, 04:34.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

24 Oct 2022, 04:37

Amanda:
why not post what you typed and, especially, what Stata gave you back, using CODE delimiters (as per the FAQ)? Thanks.

Kind regards,
Carlo
(Stata 19.0)
Joseph Coveney

Join Date: Apr 2014

Posts: 4406
#5

24 Oct 2022, 04:55

Originally posted by amanda watson

Ah, my mistake. That should have read: mixed genesed i.time_new || wave: || schoolid: || id:

By "get the same result" I mean the coefficient for genesed doesn't change in the adjusted model. It remains exactly the same as it was in the unadjusted model. I just would have expected the coefficient to change in the adjusted model.

OK.

Answer to your question: you're not doing anything wrong. Time-invariant (baseline) confounders do not affect within-participant tests or regression coefficients when you have balanced data. See below (it's easiest to see with xtreg , fe, where the time-invariant confounders are explicitly dropped from the model).

. 
. version 17.0

. 
. clear *

. 
. // seedem
. set seed 172648042

. 
. // Students
. quietly set obs 24

. generate byte pid = _n

. generate double pid_u = rnormal()

. 
. // Between (time-invariant) confounding variables
. generate byte sex = mod(_n, 2)

. generate double bmi = runiform(18, 30)

. generate byte ses = mod(_n, 3)

. 
. // Two time points
. quietly expand 2

. bysort pid: generate byte tim = _n - 1

. 
. // Outcome
. generate double out = pid_u + tim / 2 + rnormal()

. 
. *
. * Time-invariant confounders do not affect within-student difference between time
. *
. mixed out i.(tim sex ses) c.bmi || pid: , reml dfmethod(satterthwaite) nolrtest nolog

Mixed-effects REML regression                   Number of obs     =         48
Group variable: pid                             Number of groups  =         24
                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2
DF method: Satterthwaite                        DF:           min =      19.00
                                                              avg =      19.69
                                                              max =      23.00
                                                F(5,    19.67)    =       0.96
Log restricted-likelihood =  -79.86301          Prob > F          =     0.4673

------------------------------------------------------------------------------
         out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       1.tim |   .6106461   .2822054     2.16   0.041     .0268597    1.194432
       1.sex |   .0201515   .5196147     0.04   0.969    -1.067414    1.107717
             |
         ses |
          1  |   .0844554   .6509101     0.13   0.898    -1.277915    1.446826
          2  |  -.0863226    .669197    -0.13   0.899    -1.486968    1.314323
             |
         bmi |   .0111268   .0808355     0.14   0.892     -.158064    .1803175
       _cons |  -.7065869   2.209022    -0.32   0.753    -5.327581    3.914408
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
pid: Identity                |
                  var(_cons) |   1.121887   .5378071      .4384325    2.870753
-----------------------------+------------------------------------------------
               var(Residual) |   .9556786   .2818142      .5361746    1.703403
------------------------------------------------------------------------------

. mixed out i.tim || pid: , reml dfmethod(satterthwaite) nolrtest nolog

Mixed-effects REML regression                   Number of obs     =         48
Group variable: pid                             Number of groups  =         24
                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2
DF method: Satterthwaite                        DF:           min =      23.00
                                                              avg =      30.32
                                                              max =      37.65
                                                F(1,    23.00)    =       4.68
Log restricted-likelihood = -79.166952          Prob > F          =     0.0411

------------------------------------------------------------------------------
         out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       1.tim |   .6106461   .2822054     2.16   0.041     .0268597    1.194432
       _cons |  -.4232431   .2743544    -1.54   0.131    -.9788147    .1323285
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
pid: Identity                |
                  var(_cons) |   .8508096   .4163651      .3260435    2.220186
-----------------------------+------------------------------------------------
               var(Residual) |   .9556786   .2818142      .5361746    1.703403
------------------------------------------------------------------------------

. 
. xtreg out i.(tim sex ses) c.bmi, i(pid) fe
note: 1.sex omitted because of collinearity.
note: 1.ses omitted because of collinearity.
note: 2.ses omitted because of collinearity.
note: bmi omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =         48
Group variable: pid                             Number of groups  =         24

R-squared:                                      Obs per group:
     Within  = 0.1691                                         min =          2
     Between =      .                                         avg =        2.0
     Overall = 0.0511                                         max =          2

                                                F(1,23)           =       4.68
corr(u_i, Xb) = 0.0000                          Prob > F          =     0.0411

------------------------------------------------------------------------------
         out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       1.tim |   .6106461   .2822054     2.16   0.041     .0268597    1.194432
       1.sex |          0  (omitted)
             |
         ses |
          1  |          0  (omitted)
          2  |          0  (omitted)
             |
         bmi |          0  (omitted)
       _cons |  -.4232431   .1995493    -2.12   0.045    -.8360424   -.0104438
-------------+----------------------------------------------------------------
     sigma_u |  1.1526703
     sigma_e |  .97758817
         rho |  .58163678   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(23, 23) = 2.78                      Prob > F = 0.0087

. 
. exit

end of do-file

.

Classically, it's like the repeated-measures ANOVA model, where the within-subjects factor is unaffected by the between-subjects (time-invariant) factors. You can see that below: the test result (F instead of t, but the same p-value) is unaffected by (isolated from) the between-subjects factors.

. anova out sex ses c.bmi / pid tim

                         Number of obs =         48    R-squared     =  0.7490
                         Root MSE      =    .977588    Adj R-squared =  0.4871

                  Source | Partial SS         df         MS        F    Prob>F
              -----------+----------------------------------------------------
                   Model |  65.592513         24   2.7330214      2.86  0.0071
                         |
                     sex |  3.8420659          1   3.8420659      1.20  0.2868
                     ses |  4.6017183          2   2.3008591      0.72  0.5000
                     bmi |  2.0267605          1   2.0267605      0.63  0.4359
                     pid |   60.78961         19   3.1994531
              -----------+----------------------------------------------------
                     tim |  4.4746635          1   4.4746635      4.68  0.0411
                         |
                Residual |  21.980609         23   .95567863
              -----------+----------------------------------------------------
                   Total |  87.573122         47   1.8632579

.
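
One more way to see the same thing is a plain paired t test on the simulated data. The sketch below is not part of the original log: it rebuilds the dataset with the same commands and seed, reshapes it to one row per student (reshape creates the variables out0 and out1), and compares the two time points directly. With two complete time points per student, the mean difference and t statistic should reproduce the 1.tim row above, although that has not been run here.

```stata
version 17.0
clear *
set seed 172648042

// Rebuild the simulated data with the same commands as in the log above
quietly set obs 24
generate byte pid = _n
generate double pid_u = rnormal()
generate byte sex = mod(_n, 2)
generate double bmi = runiform(18, 30)
generate byte ses = mod(_n, 3)
quietly expand 2
bysort pid: generate byte tim = _n - 1
generate double out = pid_u + tim / 2 + rnormal()

// One row per student, outcomes side by side (out0 = time 0, out1 = time 1)
keep pid tim out
reshape wide out, i(pid) j(tim)

// Paired t test: the time-invariant variables never enter the comparison
ttest out1 == out0
```

The paired test uses only each student's own difference, so sex, ses and bmi cannot change it.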
3 likes
amanda watson

Join Date: Oct 2022

Posts: 10
#6

24 Oct 2022, 17:05

Ah, I see. Is there a way around this? Surely there is a way to adjust for confounding variables in a within subject study design?
1 like
Clyde Schechter

Join Date: Apr 2014

Posts: 30085
#7

24 Oct 2022, 18:42

Surely there is a way to adjust for confounding variables in a within subject study design?

What Joseph Coveney didn't make clear in his excellent explanation is that when you have a variable that is time-invariant, the random intercept itself carries the information of the time-invariant variable with it, so that the result for the within-subject factor, in fact, is automatically adjusted for that time-invariant variable. You don't need to explicitly include it in the model to adjust for it: you get it "for free" with the random intercept. What you can't do in a within-subjects design is get an estimate of the effects of interest that is unadjusted for the time-invariant variables.
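
The algebra behind this can be sketched for the simplest case. Assume a random-intercept model with two time points per student; the symbols below ($\alpha$, $\beta$, $\gamma$, $z_i$, $u_i$, $\varepsilon_{it}$) are notation introduced here for illustration and do not appear in the posts above. Here $z_i$ collects the time-invariant covariates (sex, SES, baseline BMI) and $u_i$ is the student's random intercept.

```latex
\begin{align*}
y_{it} &= \alpha + \beta\, t + \gamma^{\top} z_i + u_i + \varepsilon_{it},
        \qquad t \in \{0, 1\} \\
y_{i1} - y_{i0} &= \beta + \left(\varepsilon_{i1} - \varepsilon_{i0}\right)
\end{align*}
```

Both $\gamma^{\top} z_i$ and $u_i$ cancel in the within-student difference, so when every student is observed at both time points the estimate of $\beta$ does not depend on whether $z_i$ is in the model.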
2 likes
amanda watson

Join Date: Oct 2022

Posts: 10
#8

24 Oct 2022, 19:08

OK, is it possible then to compare the differences (e.g., across SES groups), controlling for the other factors (e.g., sex), so as to get the differences in behaviour that are due only to the factor in question?
Clyde Schechter

Join Date: Apr 2014

Posts: 30085
#9

25 Oct 2022, 09:24

Yes, the random effects model you ran does this. The results you are getting in your random-effects analysis for tim are adjusted (not controlled--nothing is ever controlled in observational studies) for sex and all other time-invariant factors. The results you are getting for sex and ses are comparisons across those groups, adjusted for the other factors.
2 likes
amanda watson

Join Date: Oct 2022

Posts: 10
#10

25 Oct 2022, 17:42

Originally posted by Clyde Schechter

Yes, the random effects model you ran does this. The results you are getting in your random-effects analysis for tim are adjusted (not controlled--nothing is ever controlled in observational studies) for sex and all other time-invariant factors. The results you are getting for sex and ses are comparisons across those groups, adjusted for the other factors.

Thanks Clyde, can I clarify that by adding ses, sex and bmi to the model (below), the output would be saying that:

1. there is a difference in sleep between holidays and school for the whole group;
2. there is no difference in sleep between holidays and school for females, compared to males;
3. there is a difference in sleep between holidays and school for middle and high ses groups, compared to low ses groups; and
4. there is no difference in sleep between holidays and school for overweight/obese compared to normal weight kids

. mixed genesleep i.time_new i.sex_new i.sestertile i.t1_bmi_cat_new || wave: || schoolid: || id:

Performing EM optimization ...

Performing gradient-based optimization:
Iteration 0:   log likelihood = -1348.4249
Iteration 1:   log likelihood = -1348.3657
Iteration 2:   log likelihood = -1348.3639
Iteration 3:   log likelihood = -1348.3639

Computing standard errors ...

Mixed-effects ML regression                     Number of obs     =        276

        Grouping information
        -------------------------------------------------------------
                        |     No. of       Observations per group
         Group variable |     groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
                   wave |          2        112      138.0        164
               schoolid |         20          2       13.8         58
                     id |        138          2        2.0          2
        -------------------------------------------------------------

                                                Wald chi2(5)      =      28.03
Log likelihood = -1348.3639                     Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------
        genesleep | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
         time_new |
         holiday  |  -11.61219   3.041271    -3.82   0.000    -17.57298   -5.651413
                  |
          sex_new |
          female  |   4.202483    5.04274     0.83   0.405    -5.681106    14.08607
                  |
       sestertile |
             mid  |   22.77335   6.599729     3.45   0.001     9.838118    35.70858
            high  |   16.08677   6.750251     2.38   0.017     2.856519    29.31702
                  |
   t1_bmi_cat_new |
overweight/obese  |  -2.261163   5.883788    -0.38   0.701    -13.79317    9.270849
            _cons |   546.7617   6.907415    79.16   0.000     533.2234       560.3
-----------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
wave: Identity               |
                  var(_cons) |   2.10e-07   .0001573             0           .
-----------------------------+------------------------------------------------
schoolid: Identity           |
                  var(_cons) |   88.69569   62.78283       22.1505    355.1579
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   444.9129   104.2474      281.0801    704.2385
-----------------------------+------------------------------------------------
               var(Residual) |   638.2037   76.83141      504.0641    808.0401
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 39.66                 Prob > chi2 = 0.0000
Clyde Schechter

Join Date: Apr 2014

Posts: 30085
#11

25 Oct 2022, 18:43

Well, I actually disagree with all of those conclusions as stated, but in the sense that you have been (mis)taught to understand them, they are correct.

The notion that a statistically significant effect means "there is a difference" and a non-statistically significant effect means "there is no difference" is probably the most widespread fallacy in all of statistics. It is passed on from generation to generation because it is aggressively taught in many basic statistics classes. It is as wrong as wrong can be. This fallacy is one of the reasons that the leadership of the American Statistical Association has recommended that the very use of the concept of statistical significance be abandoned. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.

As with the misleading term "control variable," I do not imagine that I, even with the support of so many prominent statisticians, will uproot this fallacy any time soon. But I will call attention to it, and avoid it in my own work.

What would be a correct interpretation of the findings? That can best be seen by giving up the pretense that we can actually classify effects as "existent" and "nonexistent." Rather we can interpret regression coefficients and their confidence intervals as providing us with an estimate of how big those effects are. The coefficient is a single "best" estimate and the confidence interval provides a range of possible values that are largely consistent with the data and the model. So for example, the non-statistically significant effect of sex is better understood as: the best estimate is that the (adjusted) mean genesleep is about 4.2 greater in women than in men, but the data are compatible with the difference being somewhere between 5.7 less in women and 14.1 higher. As I do not know what the variable genesleep represents, it is hard for me to say anything more about it. I do not know if differences of these magnitudes are too small to be meaningful in the real world, or not. If they are large enough to be worth talking about, then it is fair to notice that the confidence interval covers territory ranging from meaningfully greater in women to meaningfully greater in men. The shortest fair way to summarize that would be to say that the study is inconclusive with regard to the direction of the effect.
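
For anyone who wants to see where that range comes from, here is a small sketch that recomputes the interval for sex from the coefficient and standard error in the female row of the output in #10. The scalar names (b_female, se_female, z975) are made up for this illustration, and it assumes the usual normal-based Wald interval that mixed reports with z statistics.

```stata
// Values copied from the female row of the mixed output in #10
scalar b_female  = 4.202483
scalar se_female = 5.04274

// Critical value for a two-sided 95% normal-based interval
scalar z975 = invnormal(0.975)

display "lower limit = " %9.4f (b_female - z975 * se_female)
display "upper limit = " %9.4f (b_female + z975 * se_female)
```

The two limits should land close to the -5.68 and 14.09 printed in the output, which are the "5.7 less" and "14.1 higher" quoted above.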

By contrast, the expected difference in genesleep between mid and low ses is about 22.8 higher in the mid ses group, and the data are compatible with it being between 9.8 and 35.7. If a magnitude of 4 to 5 was meaningful (as I supposed for sake of illustration in regard to the sex effect), then the data are clearly compatible only with differences that are meaningfully large and in favor of the mid ses group. These results are conclusive about the direction of the difference and clearly mark it as large enough to be worth talking about. (Again, my use of 4 or 5 as the definition of meaningfully large is made up: you need to choose your own criterion for a difference in genesleep that has meaning in the real world. You know what genesleep is.)

As I say, if you are going to adhere to the discredited paradigm of statistically significant = effect, non-statistically significant = no effect, then your original conclusions are, within that framework, correct. But I hope you will think about abandoning that approach and viewing things more in terms of effect sizes compatible with the data and how that range relates to effect sizes that are meaningfully large in the real world.
2 likes
Joseph Coveney

Join Date: Apr 2014

Posts: 4406
#12

26 Oct 2022, 01:34

Originally posted by amanda watson

. . . by adding ses, sex and bmi to the model (below), the output would be saying that:

2. there is no difference in sleep between holidays and school for females, compared to males;
3. there is a difference in sleep between holidays and school for middle and high ses groups, compared to low ses groups; and
4. there is no difference in sleep between holidays and school for overweight/obese compared to normal weight kids

No. Your model does not address any of those questions.

In order to do that, you need to examine the interactions of time and sex, SES and BMI.

But first, get rid of the wave: from the random effects equation. There are only two levels of that category and its variance component has collapsed to zero. If you want your model to accommodate wave or adjust for it, then include it in the fixed effects equation using factor variables notation.

Your second level, school, is also rather sparse and its variance is not estimated very precisely. But it probably does no harm to leave it in the random effects side.

So, your model would look more like the following.

Code:

// For legibility
rename time_new tim
rename sex_new Sex // assuming sex is already present in dataset
rename sestertile SES // assuming ses is already present in dataset
rename t1_bmi_cat_new BMI // assuming bmi is already present in dataset

mixed genesleep i.Sex##i.tim i.SES##i.tim i.BMI##i.tim i.wave || schoolid: || id:
testparm SES#tim

All subject to Clyde's very cogent proviso.
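
To show what examining an interaction looks like end to end, here is a self-contained sketch on made-up data. Everything in it is hypothetical: the variable names (pid, ses, tim, out), the sample size and the built-in effect sizes are assumptions for illustration, and the model is reduced to a single student-level random intercept. It is not the model for the real dataset.

```stata
version 17.0
clear *
set seed 2022

// Hypothetical data: 60 students, three SES groups, two time points each
quietly set obs 60
generate int pid = _n
generate byte ses = mod(_n, 3)
generate double pid_u = rnormal()
quietly expand 2
bysort pid: generate byte tim = _n - 1

// Built-in interaction: the time effect grows by 0.5 per SES level
generate double out = pid_u + (0.5 + 0.5 * ses) * tim + rnormal()

// Main effects plus the SES-by-time interaction
mixed out i.ses##i.tim || pid: , reml nolog

// Joint test of whether the time effect differs across SES groups
testparm i.ses#i.tim

// Estimated time effect (time 1 minus time 0) within each SES group
margins ses, dydx(tim)
```

The margins line reports the school-versus-holiday style contrast separately for each group, with confidence intervals, which fits the effect-size reading recommended in #11.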
1 like