  • Unpaired multivariable analysis model

    Dear all,

    I want to test the effectiveness of an educational videogame in increasing protective factors against alcohol consumption in schools. For this purpose, we recruited 8 schools whose pupils completed the same questionnaire at two different times (Q1 and Q2). The schools were randomized to one of 2 groups:
    • implement the videogame between Q1 and Q2 (intervention group)
    • implement the videogame after Q2 (control group: there is no intervention between Q1 and Q2)
    Because of research constraints arising from recent European laws, we decided to make the surveys 100% anonymous (no match codes were used).

    I have a dataset of 20 variables and approximately 800 observations (400 from questionnaire 1 and 400 from questionnaire 2).

    Some of the variables are dependent variables in several models, but I want only to consider one hypothetical analysis, so I will only use one dependent variable.

    Let's call my variables the following way:
    • dependent variable in Q1: y1
    • dependent variable in Q2: y2
    • independent variables in Q1: xa1 xb1 xc1
    • independent variables in Q2: xa2 xb2 xc2
    • intervention variable: intervention1 intervention2
    • cluster code: cluster
    To test the effectiveness of the intervention, I can run an unpaired ttest or the cltest command, but we wish to adjust the analysis for potential confounders. We have searched for multivariable models for unpaired data, but we have found nothing that addresses this issue.

    One easy solution to our problem would be to generate the "difference from the Q1 mean", and then run a linear regression to test whether there is a "difference between within-group differences":

    Code:
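    * Overall Q1 mean of the outcome, copied to every row
    egen ymean1 = mean(y1)
    * Each Q2 score minus the Q1 mean
    gen difference = y2 - ymean1

    * The || cluster: syntax belongs to -mixed-, not -regress-
    mixed difference intervention2 xa1 xb1 xc1 || cluster: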
    My question is: can Stata run an adjusted analysis of unpaired data in some other way than this?


    Thanks in advance

  • #2
    It was not clear to me: a) the number of questions per questionnaire; b) whether the questionnaires provide answers on Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

    Hazarding a guess, a SEM - or GSEM - model or a hierarchical "mixed" model may potentially accomplish the desired task.
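
    One possible way to set up the mixed model, without subtracting the Q1 mean, is to stack both waves and let a group-by-wave interaction carry the intervention effect. The sketch below is only an illustration under assumptions: each row is one questionnaire, Q1 rows hold y1, xa1, ... and Q2 rows hold y2, xa2, ..., intervention1 and intervention2 are 0/1 group indicators, and the names post, treat, y, xa, xb and xc are hypothetical.

    ```stata
    * Assumption: Q2 rows have a nonmissing y2 and Q1 rows have y2 missing
    generate byte post = !missing(y2)

    * Assumption: intervention1/intervention2 are coded 0 = control, 1 = intervention
    generate byte treat = cond(post, intervention2, intervention1)

    * Stack the wave-specific variables into single columns (hypothetical names)
    generate double y  = cond(post, y2,  y1)
    generate double xa = cond(post, xa2, xa1)
    generate double xb = cond(post, xb2, xb1)
    generate double xc = cond(post, xc2, xc1)

    * Repeated cross-sections with a random intercept for each school:
    * the treat#post interaction is the adjusted difference between
    * the two groups' Q1-to-Q2 changes
    mixed y i.treat##i.post xa xb xc || cluster:

    * Report the interaction on its own
    lincom 1.treat#1.post
    ```

    With only 8 schools, the standard error of the interaction may be optimistic. Refitting with the reml and dfmethod(kroger) options of mixed is one way to obtain small-sample inference.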
    Best regards,

    Marcos


    • #3
      Pedro Antonio:
      I share Marcos' opinion about the (surely inadvertent) fogginess of your query.
      It is not clear whether pupils, nested within classes that were in turn nested within schools, filled in the questionnaires on two measurement occasions (if that were the case, I would second Marcos' advice about a -mixed- model) or something else.
      In sum, providing more details could help.
      Kind regards,
      Carlo
      (Stata 19.0)


      • #4
        Thanks for your responses. Here are the answers to your questions.

        Originally posted by Marcos Almeida
        It was not clear to me: a) the number of questions per questionnaire; b) whether the questionnaires provide answers on Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

        Hazarding a guess, a SEM - or GSEM - model or a hierarchical "mixed" model may potentially accomplish the desired task.
        The first questionnaire has 182 items distributed among 25 questions.

        The second questionnaire is similar, with more items to ask about the intervention.

        Originally posted by Marcos Almeida
        b) whether the questionnaires provide answers on Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

        Hazarding a guess, a SEM - or GSEM - model or a hierarchical "mixed" model may potentially accomplish the desired task.
        Most items are on a 7-point Likert scale. Items are averaged to form several validated scales (e.g., the Rosenberg self-esteem scale). These validated scales are used as both dependent and independent variables in our study. There are other questions about sociodemographic variables (sex, socioeconomic status, etc.).

        Originally posted by Marcos Almeida
        c) whether there is an overall result for the whole questionnaire;
        There is no overall result for the whole questionnaire. It is a multipurpose questionnaire. We have alcohol-related items, personality items, leisure items, etc. The intervention is hypothesized to change the personality items (e.g., assertiveness, social skills, etc.) and/or some alcohol-related scales (positive attitudes towards alcohol consumption, etc.).

        Originally posted by Marcos Almeida
        d) whether y1 and y2 may correlate with each other.
        Suppose we want to test the change in self-esteem after the intervention. In that case, y1 and y2 would be the averaged self-esteem scores from the self-esteem-related items in Q1 and Q2, respectively.
        Since most students will complete both Q1 and Q2, I can assume y1 and y2 are correlated.

        Originally posted by Marcos Almeida
        d) whether y1 and y2 may correlate with each other.

        Hazarding a guess, a SEM - or GSEM - model or a hierarchical "mixed" model may potentially accomplish the desired task.
        I will think about using SEM models. Regarding the "mixed" model, should I set it up in a similar way to my posted regression model (by subtracting the mean of Q1)?


        • #5
          I tend to think that "big" questionnaires without dimensions as well as an overall measure can become a source of headaches...

          That being said, it seems you may use GSEM to handle the covariates and the Y variables, as well as to adjust for the correlation between them.

          Item response theory is a strategy worth taking into consideration as well.
          Best regards,

          Marcos


          • #6
            Originally posted by Carlo Lazzaro
            Pedro Antonio:
            I share Marcos' opinion about the (surely inadvertent) fogginess of your query.
            It is not clear whether pupils, nested within classes that were in turn nested within schools, filled in the questionnaires on two measurement occasions (if that were the case, I would second Marcos' advice about a -mixed- model) or something else.
            In sum, providing more details could help.
            Pupils are nested only within schools. They completed the questionnaires in January and April using computer software. Match codes were not used, so I cannot match data from Q1 and Q2 at the individual level.


            Originally posted by Marcos Almeida
            I tend to think that "big" questionnaires without dimensions as well as an overall measure can become a source of headaches...

            That being said, it seems you may use GSEM to handle the covariates and the Y variables, as well as to adjust for the correlation between them.

            Item response theory is a strategy worth taking into consideration as well.
            Following your suggested GSEM route, I think it would be interesting to fit a baseline model using the data from Q1.

            Then, I could replicate the model with Q2 data for both the intervention group and the control group. I can compare the differences between the two groups using invariance analysis. The control group coefficients should not change much.
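
            A rough sketch of that invariance idea, using sem with its group() option, might look as follows. It assumes the data have been stacked into hypothetical variables y, xa, xb, xc, post (0 = Q1, 1 = Q2) and treat (0 = control, 1 = intervention), none of which exist in the original dataset, and it ignores the clustering of pupils within schools.

            ```stata
            * Baseline model fitted to Q1 data only
            sem (y <- xa xb xc) if post == 0

            * Control schools: same model at Q1 and Q2, with no parameters
            * constrained to be equal across the two waves
            sem (y <- xa xb xc) if treat == 0, group(post) ginvariant(none)

            * Tests of whether the parameters can be held equal across waves
            estat ginvariant

            * Intervention schools: repeat the same comparison
            sem (y <- xa xb xc) if treat == 1, group(post) ginvariant(none)
            estat ginvariant
            ```

            If the intervention works as hypothesized, the tests would be expected to flag differences between waves in the intervention schools more than in the control schools, although two separate sets of tests do not amount to a formal test of the group difference.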


