Here’s a clearer, more methodologically precise version of your question that you can use for a methods forum, CrossValidated, or when asking a collaborator:
Revised Question
Hello,
I am working with data from a two-wave survey in which a subset of participants were observed at both time points, while others were only observed in one wave due to attrition. In both Wave 1 and Wave 2, respondents were asked a binary question: “Have you ever used AI?” (Yes/No).
The number of respondents reporting AI use increased from 27 in Wave 1 to 71 in Wave 2. I am interested in identifying demographic and socioeconomic covariates (e.g., age, race/ethnicity, education) associated with AI use, as well as evaluating whether the association between these covariates and AI use differs across waves.
Because some individuals are observed at both time points (i.e., repeated measurements), while others are only observed in one wave, I am unsure whether a repeated-measures logistic regression model (e.g., using generalized estimating equations or a mixed-effects logistic model) would be appropriate for this analysis.
Specifically:
- Is it methodologically appropriate to use a repeated-measures logistic regression model when the panel is unbalanced due to attrition (i.e., not all Wave 1 participants are present in Wave 2)?
- Would a population-averaged model (e.g., GEE) with AI use as the binary outcome and survey wave as a predictor be suitable for evaluating covariate associations with AI use over time in this context?
- If I restrict the analysis to participants observed in both Wave 1 and Wave 2, the Wave 2 sample size drops from 1,384 to 720. Is that loss of sample size an acceptable price for analyzing only exactly matched pairs?
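
To make the trade-off in the last point concrete, the following is a minimal Python sketch of the two candidate analysis samples in long (person-wave) format. The toy records and the names `records`, `full_sample`, `paired_sample`, and `wave_n` are invented for illustration and are not the survey data. Only the counts 1,384 and 720 come from the question itself.

```python
from collections import Counter

# Toy person-wave records in long format: (respondent_id, wave, used_ai).
# All values are invented for illustration; they are NOT the survey data.
records = [
    (1, 1, 0), (1, 2, 1),   # observed in both waves
    (2, 1, 0), (2, 2, 0),   # observed in both waves
    (3, 1, 1),              # Wave 1 only
    (4, 2, 1),              # Wave 2 only
    (5, 2, 0),              # Wave 2 only
]

# Observations per respondent: each cluster has size 1 or 2.
cluster_sizes = Counter(rid for rid, _, _ in records)

# Option A: keep every person-wave row (unbalanced panel);
# respondent_id would serve as the cluster identifier in GEE or a mixed model.
full_sample = records

# Option B: keep only respondents observed in both waves (matched pairs).
paired_ids = {rid for rid, n in cluster_sizes.items() if n == 2}
paired_sample = [row for row in records if row[0] in paired_ids]


def wave_n(rows, wave):
    """Count the rows that belong to the given wave."""
    return sum(1 for _, w, _ in rows if w == wave)


# Share of Wave 2 retained under Option B, using the counts from the question.
wave2_total, wave2_paired = 1384, 720
share_kept = wave2_paired / wave2_total

print(wave_n(full_sample, 2), wave_n(paired_sample, 2))
print(f"{share_kept:.1%} of Wave 2 retained; {wave2_total - wave2_paired} dropped")
```

Under Option A, respondents seen in only one wave simply contribute a cluster of size one, so no rows are discarded. Under Option B, roughly 52% of the Wave 2 sample is kept and 664 Wave 2 respondents are excluded.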
