Paired, clustered t-test?

Zoheb Khan

Join Date: Jul 2015

Posts: 23
#1

18 Oct 2016, 01:55

Dear users

I am looking at a longitudinal dataset with 2 waves of data (baseline and endpoint). The data is collected from entrants to employability programmes. It is a randomised controlled trial: entrants are randomly allocated either to programme type A or B. Programmes A and B are rolled out at sites, or clusters, of which there are 44. Random assignment is therefore to a cluster which is running either programme A or B.

I am trying to assess change in outcomes between wave 1 and 2. For continuous outcomes the simplest option would be to run paired t-tests. However this ignores the clustered nature of the data and leads to standard errors and p-values that are too small (on some variables the ICC is 'significantly' higher than zero). There are two types of analyses I am trying to conduct:
1) Difference between outcome at wave 1 and 2
2) Difference between waves 1 and 2 by a binary group variable, e.g. whether the respondent is in a treatment or control group.

The best I can come up with is:

For (1):
. clttest outcome, cluster(cluster) by(wave)    [in long format]
OR
. ttest w1_outcome == w2_outcome    [in wide format]

So I'm either ignoring the dependence of the observations, or the clustering. User-written clttest seems to not support paired data.

For (2):
Generate a difference variable (i.e. w2_outcome - w1_outcome) and run clttest, e.g.:
. clttest delta_outcome, cluster(cluster) by(treatment)

In this case I think both the clustering and the pairing of observations have been dealt with but I'm not 100% sure.

Any guidance at all would be greatly appreciated.

Thanks,
Zoheb
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#2

18 Oct 2016, 05:12

Zoheb:
why not consider -xtreg- with cluster-robust standard errors among your regression strategies?
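For what it's worth, here is a minimal sketch of what that might look like in long format. The data are simulated stand-ins, and the variable names (outcome, wave, treatment, entrant, cluster) are assumptions rather than the actual trial variables. The treatment-by-wave interaction is the difference in change between the two arms.

```stata
* Simulated stand-in data: 44 sites, 50 entrants each, two waves.
* All variable names here are hypothetical.
clear
set seed 12345
set obs 44
generate byte cluster = _n
generate byte treatment = mod(_n, 2)
generate double u = rnormal()              // site-level effect
expand 50
generate int entrant = _n
generate double e = rnormal()              // entrant-level effect
generate double outcome1 = u + e + rnormal()
generate double outcome2 = u + e + rnormal()
reshape long outcome, i(entrant) j(wave)   // wave takes values 1 and 2

* Random-effects panel regression with entrant as the panel unit
* and standard errors clustered on site.
xtset entrant wave
xtreg outcome i.treatment##i.wave, re vce(cluster cluster)

* Difference in wave 1 to wave 2 change between the two arms.
lincom 1.treatment#2.wave
```

Clustering on site works here because every entrant (panel) belongs to exactly one site.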

Kind regards,
Carlo
(Stata 19.0)
Zoheb Khan

Join Date: Jul 2015

Posts: 23
#3

18 Oct 2016, 05:40

Hi Carlo

Thank you - the more I read about this the more it seems I will need to use regression. I just like the simplicity of t-tests for routine and exploratory basic analyses. So I'm hoping it's possible somehow.

Zoheb
Joseph Coveney

Join Date: Apr 2014

Posts: 4396
#4

18 Oct 2016, 06:00

I believe that you can get what you want with a combination of -mixed- and -margins, contrast()- to get the differences that you're seeking. The mixed model would have hierarchical random effects of entrant nested under cluster (site).

Consider something like that below. (Begin at the "Begin here" comment. First the hierarchical model is fitted, and then each of your two types of analysis is illustrated.)

. version 14.2

.
. clear*

. set more off

. set seed 1360606

.
. quietly set obs 44

. generate byte cluster = _n

. generate double u = rnormal()

.
. generate byte program = mod(_n, 2)

. label define Programs 0 A 1 B

. label values program Programs

.
. quietly expand 50

. generate int entrant = _n

.
. drawnorm outcome0 outcome1, double corr(1 0.5 \ 0.5 1)

. quietly reshape long outcome, i(entrant) j(wave)

. label define Waves 0 Baseline 1 Endpoint

. label values wave Waves

.
. quietly replace outcome = (outcome + u) / sqrt(2)

.
. *
. * Begin here
. *
. mixed outcome i.program##i.wave || cluster: || entrant: , reml nolrtest nolog

Mixed-effects REML regression                   Number of obs     =      4,400

-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
        cluster |         44        100      100.0        100
        entrant |      2,200          2        2.0          2
-------------------------------------------------------------

                                                Wald chi2(3)      =       1.54
Log restricted-likelihood = -4495.8472          Prob > chi2       =     0.6723

------------------------------------------------------------------------------
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     program |
          B  |   .2159406    .233191     0.93   0.354    -.2411053    .6729865
             |
        wave |
   Endpoint  |  -.0098412   .0213836    -0.46   0.645    -.0517523    .0320699
             |
program#wave |
 B#Endpoint  |   .0229581    .030241     0.76   0.448    -.0363132    .0822293
             |
       _cons |   .0095007   .1648909     0.06   0.954    -.3136796     .332681
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
cluster: Identity            |
                  var(_cons) |   .5882437   .1299805      .3814808    .9070724
-----------------------------+------------------------------------------------
entrant: Identity            |
                  var(_cons) |   .2442358   .0118899      .2220092    .2686875
-----------------------------+------------------------------------------------
               var(Residual) |   .2514923   .0075862      .2370546    .2668094
------------------------------------------------------------------------------

.
. // 1) Difference between outcome at wave 1 and 2
. margins wave, contrast(pveffects)

Contrasts of predictive margins

Expression   : Linear prediction, fixed portion, predict()

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
        wave |          1        0.01     0.9137
------------------------------------------------

------------------------------------------------------------
                    |            Delta-method
                    |   Contrast   Std. Err.      z    P>|z|
--------------------+---------------------------------------
               wave |
(Endpoint vs base)  |   .0016379   .0151205     0.11   0.914
------------------------------------------------------------

.
. // 2) Difference between waves 1 and 2 by . . . whether the respondent is in a treatment or control group
. margins program#wave, contrast(pveffects)

Contrasts of adjusted predictions

Expression   : Linear prediction, fixed portion, predict()

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
program#wave |          1        0.58     0.4478
------------------------------------------------

------------------------------------------------------------------------
                                |            Delta-method
                                |   Contrast   Std. Err.      z    P>|z|
--------------------------------+---------------------------------------
                   program#wave |
(B vs base) (Endpoint vs base)  |   .0229581    .030241     0.76   0.448
------------------------------------------------------------------------

.  * or
. lincom _b[1.program#1.wave]

 ( 1)  [outcome]1.program#1.wave = 0

------------------------------------------------------------------------------
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .0229581    .030241     0.76   0.448    -.0363132    .0822293
------------------------------------------------------------------------------

.
. exit

end of do-file

.
Zoheb Khan

Join Date: Jul 2015

Posts: 23
#5

18 Oct 2016, 06:46

Thanks Joseph, this is very helpful. I've run the code and got an error with the rem1 option (option not allowed). I don't know what the difference is between ML and restricted ML, so I ran the code without the option (i.e. a standard mixed-effects ML regression), and the output seems to make sense. I like the structure of this approach so will definitely look into it more closely.
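A side note on the option: in -mixed- it is spelled reml (lower-case letter L, not the digit 1), and mle is the default when neither is given. An "option not allowed" message would be consistent with a mistyped option name, though that is only a guess about what happened here. The sketch below fits the same model both ways on simulated stand-in data (all variable names are hypothetical) so the two sets of estimates can be compared side by side.

```stata
* Simulated stand-in data: 44 sites, 50 entrants each, two waves.
* All variable names here are hypothetical.
clear
set seed 2016
set obs 44
generate byte cluster = _n
generate byte treatment = mod(_n, 2)
generate double u = rnormal()              // site-level effect
expand 50
generate int entrant = _n
generate double e = rnormal()              // entrant-level effect
generate double outcome1 = u + e + rnormal()
generate double outcome2 = u + e + rnormal()
reshape long outcome, i(entrant) j(wave)

* Maximum likelihood (the default for -mixed-).
mixed outcome i.treatment##i.wave || cluster: || entrant: , mle nolog
estimates store fit_ml

* Restricted maximum likelihood.
mixed outcome i.treatment##i.wave || cluster: || entrant: , reml nolog
estimates store fit_reml

* Coefficients and standard errors side by side.
estimates table fit_ml fit_reml, se
```

REML adjusts the variance-component estimates for the degrees of freedom used up by the fixed effects. That adjustment generally matters more when the number of clusters is small, so with 44 sites the two fits would be expected to be close.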

Thanks again,
Zoheb
Rich Goldstein

Join Date: Mar 2014

Posts: 4455
#6

18 Oct 2016, 11:12

I'm a bit unclear about what is wanted here; what is the problem with -clttest-? If you want other variables in the model, recall that the paired t-test is the same as a regression with the difference as the outcome (dependent) variable and no predictors/covariates/independent variables other than the constant. You can then use the regression's "vce(cluster clusterid)" option.
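A minimal sketch of that equivalence in wide format, on simulated stand-in data. The variable names (w1_outcome, w2_outcome, delta_outcome, treatment, cluster) follow the names used earlier in the thread, but the data and the effect sizes are made up for illustration.

```stata
* Simulated stand-in data: 44 sites, 50 entrants each, wide format.
clear
set seed 4455
set obs 44
generate byte cluster = _n
generate byte treatment = mod(_n, 2)
generate double u = rnormal()              // site-level effect on the level
generate double v = rnormal(0, 0.3)        // site-level effect on the change
expand 50
generate double e = rnormal()              // entrant-level effect
generate double w1_outcome = u + e + rnormal()
generate double w2_outcome = u + v + e + rnormal()

* Within-entrant change: this is what handles the pairing.
generate double delta_outcome = w2_outcome - w1_outcome

* (1) Overall change between waves: constant-only regression.
*     _cons is the mean change; clustering on site handles the sites.
regress delta_outcome, vce(cluster cluster)

* (2) Change by arm: the treatment coefficient is the difference
*     in mean change between the two arms.
regress delta_outcome i.treatment, vce(cluster cluster)
```

With vce(cluster), -regress- bases its t statistics on the number of clusters minus one (43 here) rather than on the number of entrants.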
Zoheb Khan

Join Date: Jul 2015

Posts: 23
#7

19 Oct 2016, 07:02

Thanks Rich. The advice re regression is noted.

In the first clttest example I provided, I thought that using the wave variable was insufficient to deal with dependent observations over time, i.e. the observations aren't paired, so they are treated as repeated cross-sections rather than as panel data.