Dealing with Missing Data in Observations of Longitudinal Data

Rana Madani Civi

15 Nov 2023, 16:44

Hello everyone,

I'm currently working with a longitudinal dataset of over 100 variables and around 25,000 observations, measured at three time points (baseline, FU1, FU2). The dataset contains several types of missingness, each with its own code:

"Don't know" is coded as 98.
"Refused" is coded as 99.
"Missing" is coded as -88888.
"Skipped pattern" is coded as -99999.

Some missing values come from participants who were lost to follow-up, whether through death or for other reasons; this loss may have occurred at follow-up 1 or follow-up 2. In addition, certain sex-specific variables, such as menopause, were skipped for male participants. I do not consider this true missingness, because these variables are not applicable to male respondents.

As I prepare to run an MCAR (missing completely at random) test in Stata, I'm considering how to handle these different types of missing values. Specifically, I'm unsure whether to recode all of them to "." or to retain some of the existing codes. In particular, how should I handle codes 98 ("Don't know") and 99 ("Refused")?
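
To make this question concrete, here is a minimal sketch of the kind of recoding I have in mind. It uses Stata's extended missing values (.a, .b, ...) so that the reason for missingness is not lost. The variable names (smoke_bl, smoke_fu1, smoke_fu2) are hypothetical placeholders, and the mapping of codes to .a through .d is only my assumption about one possible scheme, not something I know to be recommended:

```stata
* Hypothetical variable names; replace with the real varlist.
* A named varlist is used rather than _all because 98 or 99 could be
* legitimate values for some variables (for example, age in years).
mvdecode smoke_bl smoke_fu1 smoke_fu2, ///
    mv(98=.a \ 99=.b \ -88888=.c \ -99999=.d)

* Report counts of system-missing (.) and extended-missing (>.) values.
misstable summarize smoke_bl smoke_fu1 smoke_fu2
```

As far as I understand, estimation commands treat .a through .d as missing in the same way as ".", so what I am really asking is whether "Don't know" and "Refused" should count as missing for the purposes of the test.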

My second question concerns the three time points. When I run the test, should I include my main exposure and outcome variables at all three time points, with the covariates at baseline only? Or should I include the covariates at all three time points as well?
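
The two specifications I am weighing could be sketched as follows. This assumes the data are in wide format and that I use the community-contributed mcartest command for Little's MCAR test; my understanding is that variables listed after the "=" are treated as covariates in a covariate-dependent version of the test, but please correct me if that is wrong. All variable names are hypothetical placeholders:

```stata
* Assumes wide-format data and the community-contributed mcartest command.
* All variable names below are hypothetical placeholders.

* Option 1: exposure and outcome at all three time points,
* covariates (after the "=") at baseline only.
mcartest expo_bl expo_fu1 expo_fu2 out_bl out_fu1 out_fu2 = bmi_bl sex

* Option 2: the time-varying covariate is also included at all
* three time points, alongside the exposure and outcome.
mcartest expo_bl expo_fu1 expo_fu2 out_bl out_fu1 out_fu2 ///
    bmi_bl bmi_fu1 bmi_fu2 = sex
```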

I would appreciate any insights or recommendations on these matters.

Thank you.