
  • Paired, clustered t-test?

    Dear users

    I am looking at a longitudinal dataset with 2 waves of data (baseline and endpoint). The data is collected from entrants to employability programmes. It is a randomised controlled trial: entrants are randomly allocated either to programme type A or B. Programmes A and B are rolled out at sites, or clusters, of which there are 44. Random assignment is therefore to a cluster which is running either programme A or B.

    I am trying to assess change in outcomes between wave 1 and 2. For continuous outcomes the simplest option would be to run paired t-tests. However this ignores the clustered nature of the data and leads to standard errors and p-values that are too small (on some variables the ICC is 'significantly' higher than zero). There are two types of analyses I am trying to conduct:
    1) Difference between outcome at wave 1 and 2
    2) Difference between waves 1 and 2 by a binary group variable - eg whether the respondent is in a treatment or control group.

    The best I can come up with is:

    For (1):
    . clttest outcome, cluster(cluster) by(wave) [in long format]
    OR
    . ttest w1_outcome == w2_outcome [in wide format]

    So I'm either ignoring the dependence of the observations, or the clustering. User-written clttest seems to not support paired data.

    For (2):
    Generate a difference variable (ie w2_outcome-w1_outcome), and run clttest, eg:
    . clttest delta_outcome, cluster(cluster) by(treatment)

    In this case I think both the clustering and the pairing of observations have been dealt with but I'm not 100% sure.

    Any guidance at all would be greatly appreciated.

    Thanks,
    Zoheb

  • #2
    Zoheb:
    why not consider -xtreg- with clustered standard errors among your regression strategies?
    Kind regards,
    Carlo
    (Stata 19.0)
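
    A minimal sketch of what this suggestion might look like, assuming the data are in long format (one row per entrant per wave). The variable names outcome, wave, program, entrant and cluster are assumptions for illustration, and the random-effects estimator is only one of several possible choices:

    ```stata
    * Assumed variables (hypothetical names): outcome, wave (0/1),
    * program (0/1), entrant (person id), cluster (site id).
    xtset entrant wave

    * (1) Change between waves, with standard errors clustered on site
    xtreg outcome i.wave, re vce(cluster cluster)

    * (2) Does the change differ by program? The interaction term is the test.
    xtreg outcome i.program##i.wave, re vce(cluster cluster)
    ```

    Clustering at the site level is possible here because each entrant belongs to exactly one site, so panels are nested within clusters.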



    • #3
      Hi Carlo

      Thank you - the more I read about this the more it seems I will need to use regression. I just like the simplicity of t-tests for routine and exploratory basic analyses. So I'm hoping it's possible somehow.

      Zoheb



      • #4
        I believe that you can get what you want with a combination of mixed and margins, contrast to obtain the differences that you are seeking. The mixed model would have hierarchical random effects, with entrant nested within cluster (site).

        Consider something like that below. (Begin at the "Begin here" comment. First the hierarchical model is fitted, and then each of your two types of analysis is illustrated.)

        . version 14.2

        .
        . clear*

        . set more off

        . set seed 1360606

        .
        . quietly set obs 44

        . generate byte cluster = _n

        . generate double u = rnormal()

        .
        . generate byte program = mod(_n, 2)

        . label define Programs 0 A 1 B

        . label values program Programs

        .
        . quietly expand 50

        . generate int entrant = _n

        .
        . drawnorm outcome0 outcome1, double corr(1 0.5 \ 0.5 1)

        . quietly reshape long outcome, i(entrant) j(wave)

        . label define Waves 0 Baseline 1 Endpoint

        . label values wave Waves

        .
        . quietly replace outcome = (outcome + u) / sqrt(2)

        .
        . *
        . * Begin here
        . *
        . mixed outcome i.program##i.wave || cluster: || entrant: , reml nolrtest nolog

        Mixed-effects REML regression                   Number of obs     =      4,400

        -------------------------------------------------------------
                        |     No. of       Observations per Group
         Group Variable |     Groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
                cluster |         44        100      100.0        100
                entrant |      2,200          2        2.0          2
        -------------------------------------------------------------

                                                        Wald chi2(3)      =       1.54
        Log restricted-likelihood = -4495.8472          Prob > chi2       =     0.6723

        ------------------------------------------------------------------------------
             outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             program |
                  B  |   .2159406    .233191     0.93   0.354    -.2411053    .6729865
                     |
                wave |
           Endpoint  |  -.0098412   .0213836    -0.46   0.645    -.0517523    .0320699
                     |
        program#wave |
         B#Endpoint  |   .0229581    .030241     0.76   0.448    -.0363132    .0822293
                     |
               _cons |   .0095007   .1648909     0.06   0.954    -.3136796     .332681
        ------------------------------------------------------------------------------

        ------------------------------------------------------------------------------
          Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
        -----------------------------+------------------------------------------------
        cluster: Identity            |
                          var(_cons) |   .5882437   .1299805      .3814808    .9070724
        -----------------------------+------------------------------------------------
        entrant: Identity            |
                          var(_cons) |   .2442358   .0118899      .2220092    .2686875
        -----------------------------+------------------------------------------------
                       var(Residual) |   .2514923   .0075862      .2370546    .2668094
        ------------------------------------------------------------------------------

        .
        . // 1) Difference between outcome at wave 1 and 2
        . margins wave, contrast(pveffects)

        Contrasts of predictive margins

        Expression   : Linear prediction, fixed portion, predict()

        ------------------------------------------------
                     |         df        chi2     P>chi2
        -------------+----------------------------------
                wave |          1        0.01     0.9137
        ------------------------------------------------

        ------------------------------------------------------------
                            |            Delta-method
                            |   Contrast   Std. Err.      z    P>|z|
        --------------------+---------------------------------------
                       wave |
        (Endpoint vs base)  |   .0016379   .0151205     0.11   0.914
        ------------------------------------------------------------

        .
        . // 2) Difference between waves 1 and 2 by . . . whether the respondent is in a treatment or control group
        . margins program#wave, contrast(pveffects)

        Contrasts of adjusted predictions

        Expression   : Linear prediction, fixed portion, predict()

        ------------------------------------------------
                     |         df        chi2     P>chi2
        -------------+----------------------------------
        program#wave |          1        0.58     0.4478
        ------------------------------------------------

        ------------------------------------------------------------------------
                                        |            Delta-method
                                        |   Contrast   Std. Err.      z    P>|z|
        --------------------------------+---------------------------------------
                           program#wave |
        (B vs base) (Endpoint vs base)  |   .0229581    .030241     0.76   0.448
        ------------------------------------------------------------------------

        . * or
        . lincom _b[1.program#1.wave]

         ( 1)  [outcome]1.program#1.wave = 0

        ------------------------------------------------------------------------------
             outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 (1) |   .0229581    .030241     0.76   0.448    -.0363132    .0822293
        ------------------------------------------------------------------------------

        .
        . exit

        end of do-file


        .



        • #5
          Thanks Joseph, this is very helpful. I've run the code and got an error with the rem1 option (option not allowed). I don't know what the difference is between ML and restricted ML so ran the code without the option (ie a standard mixed effects ML regression) and the output seems to make sense. I like the structure of this approach so will definitely look into it more closely.

          Thanks again,
          Zoheb
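
          On the ML versus restricted ML question: both fit the same model, but REML adjusts the variance-component estimates for the degrees of freedom used up by the fixed effects, which tends to matter most when the number of clusters is small. The sketch below is one hedged way to compare the two fits side by side. It assumes the long-format variables from the simulated example in #4 (outcome, program, wave, cluster, entrant) and a Stata release that provides -mixed- (13 or later, as far as I know); the stored names ml and reml are arbitrary.

          ```stata
          * Fit the same two-level random-intercept model under each estimator.
          mixed outcome i.program##i.wave || cluster: || entrant: , mle nolog
          estimates store ml

          mixed outcome i.program##i.wave || cluster: || entrant: , reml nolog
          estimates store reml

          * Side-by-side coefficients and standard errors
          estimates table ml reml, se
          ```

          With 44 clusters the fixed-effect estimates from the two fits would be expected to be close, with any visible difference mainly in the variance components.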



          • #6
            I'm a bit unclear about what is wanted here; what is the problem with -clttest-? If you want other variables in the model, recall that the paired t-test is the same as a regression with the difference as the outcome (dependent) variable and no predictors (covariates, independent variables) other than the constant. You can then use the "vce(cluster clusterid)" option of -regress-.
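
            A minimal sketch of that equivalence, assuming wide-format data with the variable names used in #1 (w1_outcome, w2_outcome, treatment, and a site identifier called cluster):

            ```stata
            * Change score for each entrant (assumed variable names from #1)
            generate double delta_outcome = w2_outcome - w1_outcome

            * (1) Paired comparison: the constant is the mean change between waves,
            *     with standard errors that allow for clustering on site
            regress delta_outcome, vce(cluster cluster)

            * (2) Difference in mean change between treatment and control
            regress delta_outcome i.treatment, vce(cluster cluster)
            ```

            In (1) the test on _cons plays the role of the paired t-test; in (2) the coefficient on treatment is the between-group difference in change.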



            • #7
              Thanks Rich. The advice re regression is noted.

              In the first clttest example I provided, I thought that using the wave variable was insufficient to deal with dependent observations over time, i.e. the observations aren't paired, so they are treated as repeated cross-sections rather than as panel data.
