Unpaired multivariable analysis model

PedroAntonio delaRosa

Join Date: Feb 2019

Posts: 5
#1

19 Feb 2019, 05:36

Dear all,

I want to test the effectiveness of an educational videogame at increasing protective factors against alcohol consumption in schools. For this purpose, we recruited 8 schools, whose pupils filled in the same questionnaire at two different times (Q1 and Q2). The schools were randomized to one of 2 groups:
Implement the videogame between Q1 and Q2 (intervention group)

Implement the videogame after Q2 (control group: there is no intervention between Q1 and Q2)

Because of research constraints arising from recent European laws, we decided to make the surveys 100% anonymous (no match codes were used).

I have a dataset of 20 variables and approximately 800 observations (400 from questionnaire 1 and 400 from questionnaire 2).

Some of the variables are dependent variables in several models, but here I want to consider only one hypothetical analysis, so I will use a single dependent variable.

Let's call my variables the following way:
dependent variable in Q1: y₁

dependent variable in Q2: y₂

independent variables in Q1: x_a1, x_b1, x_c1

independent variables in Q2: x_a2, x_b2, x_c2

intervention variable: intervention₁ intervention₂

cluster code: cluster

To test the effectiveness of the intervention, I can run an unpaired ttest or the cltest command, but we wish to adjust the analysis for potential confounders. We have searched for multivariable models for unpaired data, but we have found nothing that addresses this issue.

One easy solution to our problem would be to generate the "difference from the Q1 mean", and then run a linear regression to test whether there is a "difference between the within-group differences":

Code:

egen ymean1 = mean(y1)
gen difference = y2 - ymean1
* -regress- does not accept the "|| cluster:" syntax; -mixed- does
mixed difference intervention2 x_a1 x_b1 x_c1 || cluster:

My question is: can Stata run an adjusted analysis of unpaired data in some way other than this?

Thanks in advance
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

19 Feb 2019, 06:30

It was not clear to me: a) the number of questions by questionnaire; b) whether the questionnaires provide answers under Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

Hazarding a guess, a SEM - or GSEM - model or a hierarchic "mixed" model may potentially accomplish the wishful task.

Best regards,

Marcos
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17739
#3

19 Feb 2019, 09:50

Pedro Antonio:
I share Marcos' opinion about the (surely inadvertent) fogginess of your query.
It is not clear whether pupils, nested within classes that were in turn nested within schools, filled in the questionnaires on two measurement occasions (if that were the case, I would second Marcos' advice about a -mixed- model) or something else.
In sum, providing more details could help.

Kind regards,
Carlo
(Stata 19.0)
PedroAntonio delaRosa

Join Date: Feb 2019

Posts: 5
#4

19 Feb 2019, 10:09

Thanks for your responses. Here are the answers to your questions.

Originally posted by Marcos Almeida:

It was not clear to me: a) the number of questions by questionnaire; b) whether the questionnaires provide answers under Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

Hazarding a guess, a SEM - or GSEM - model or a hierarchic "mixed" model may potentially accomplish the wishful task.

The first questionnaire has 182 items distributed among 25 questions.

The second questionnaire is similar, with additional items asking about the intervention.

Originally posted by Marcos Almeida:

b) whether the questionnaires provide answers under Likert scales or otherwise; c) whether there is an overall result for the whole questionnaire; d) whether y1 and y2 may correlate with each other.

Hazarding a guess, a SEM - or GSEM - model or a hierarchic "mixed" model may potentially accomplish the wishful task.

Most items use a 7-point Likert scale. Items are averaged to form several validated scales (e.g., the Rosenberg self-esteem scale). These validated scales are used as both dependent and independent variables in our study. There are other questions about sociodemographic variables (sex, socioeconomic status, etc.).

Originally posted by Marcos Almeida:

c) whether there is an overall result for the whole questionnaire;

There is no overall result for the whole questionnaire. It is a multipurpose questionnaire. We have alcohol-related items, personality items, leisure items, etc. The intervention is hypothesized to change the personality items (e.g., assertiveness, social skills, etc.) and/or some alcohol-related scales (positive attitudes towards alcohol consumption, etc.).

Originally posted by Marcos Almeida:

d) whether y1 and y2 may correlate with each other.

Suppose we want to test the change in self-esteem after the intervention. In that case, y₁ and y₂ would be the averaged self-esteem scores from the self-esteem items in Q1 and Q2, respectively.
Since most students will complete both Q1 and Q2, I can assume y₁ and y₂ are correlated.

Originally posted by Marcos Almeida:

d) whether y1 and y2 may correlate with each other.

Hazarding a guess, a SEM - or GSEM - model or a hierarchic "mixed" model may potentially accomplish the wishful task.

I will think about using SEM models. Regarding the "mixed" model, should I set it up in a similar way to my posted regression model (by subtracting the mean of Q1)?
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#5

19 Feb 2019, 10:40

I tend to think that "big" questionnaires without dimensions as well as an overall measure can become a source of headaches...

That being said, it seems you may use GSEM so as to check covariates and Yvars, as well as to adjust for the correlation between them.

Item response theory is a strategy worth taking into consideration as well.

Best regards,

Marcos
PedroAntonio delaRosa

Join Date: Feb 2019

Posts: 5
#6

20 Feb 2019, 08:30

Originally posted by Carlo Lazzaro:

Pedro Antonio:
I share Marcos' opinion about the (surely inadvertent) fogginess of your query.
It is not clear whether pupils, nested within classes that were in turn nested within schools, filled in the questionnaires on two measurement occasions (if that were the case, I would second Marcos' advice about a -mixed- model) or something else.
In sum, providing more details could help.

Pupils are nested only within schools. They filled in the questionnaires in January and April, using computer software. Match codes were not used, so I cannot match data from Q1 and Q2 at the individual level.
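
For this structure (two unlinked waves, pupils nested only within schools), a -mixed- model on stacked data might look like the sketch below. It is only an illustration under stated assumptions: the layout with one row per anonymous response and the variable names y, time, intervention, x_a, x_b and x_c are hypothetical and do not come from the actual dataset. Because responses cannot be paired, the model treats Q1 and Q2 answers as independent given the school.

```stata
* Assumed (hypothetical) stacked layout, one row per anonymous response:
*   y             outcome scale score (e.g., self-esteem)
*   time          0 = Q1 (January), 1 = Q2 (April)
*   intervention  0 = control school, 1 = intervention school
*   x_a x_b x_c   covariates taken from the same questionnaire
*   cluster       school identifier

* Random intercept for school. The time-by-intervention interaction is the
* adjusted difference between the two within-group Q1-to-Q2 changes.
mixed y i.time##i.intervention x_a x_b x_c || cluster:

* Estimate and confidence interval for that interaction
lincom 1.time#1.intervention

* Q1-to-Q2 change within each arm, from the fixed part of the model
margins intervention, dydx(time)
```

With only 8 schools, the standard errors from any cluster-level model should be read with caution.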

Originally posted by Marcos Almeida:

I tend to think that "big" questionnaires without dimensions as well as an overall measure can become a source of headaches...

That being said, it seems you may use GSEM so as to check covariates and Yvars, as well as to adjust for the correlation between them.

Item response theory is a strategy worth taking into consideration as well.

Following your suggested GSEM route, I think it would be interesting to fit a baseline model using the data from Q1.

Then, I could replicate the model with the Q2 data for both the intervention group and the control group. I could compare the differences between the two groups using invariance analysis. The control group coefficients should not change much.