  • Is this odds ratio time-dependent?

    Hello.

    I would like some methodological advice on how to test the impact of COVID-19 regulations on a nationwide human cohort. It would be very kind if you could advise me on which method would be appropriate in this context.

    Let M be the index month in which a disease of interest is detected. M is a function M(i) of the individual human i. For each index month M(i), ranging from January 2011 to December 2022, my cohort contains about 10,000 new case subjects on average. Each human case subject is matched to a human control subject by age and gender.

    Hence there are nearly 3 million individual humans in the whole cohort. I hope calling people "individual humans", "human case subjects" and "human control subjects" is not offensive. I apologize if it is.

    The follow-up for each individual human starts in January of the year before the year of the case subject's index month and stops in December 2022. For instance, if the index month is May 2013, the follow-up starts in January 2012. We observe that nearly half of the case subjects and a third of the control subjects die during this follow-up.

    For each individual human i, whether case or control subject, we have monthly observations of 9 binary variables X1 to X9. Let x1(t,i) to x9(t,i) be the values observed for X1 to X9 at month t for the individual human i, provided that individual is still alive at month t. Previous studies have shown that the values taken by X1 to X9 at month t have a non-linear relation with the time elapsed between the index month M and the current month t.

    Here are the statistical hypotheses we would like to test. Null hypothesis H0: the odds ratio OR(X1|M,X2,...,X9) of X1 for case subjects relative to their matched controls, given the index month M and the observed values of the binary variables X2 to X9, is independent of the current month t. Alternative hypothesis H1: the odds ratio OR(X1|M,X2,...,X9) of X1 for case subjects relative to their matched controls, given the index month M and the observed values of the binary variables X2 to X9, is significantly larger, at the 95% confidence level, when the current month t is March 2020 or later than when the current month t is strictly before March 2020.

    We don't know which test statistic we should use. It would be very kind if you could advise us on which method would be appropriate for this problem. I was thinking of a Cox model with shared frailty on the index month M(i), treated as a qualitative variable, including in the model a binary time-dependent variable indicating whether the current month is March 2020 or later. Would you do the same? Do you know of a better solution?

    Best regards.

    Axel Renoux, biostatistician, Toulouse university hospital
    Last edited by Axel Renoux; 25 Aug 2023, 06:33. Reason: Edit=minor language corrections.

  • #2
    I wonder if I am misunderstanding your description of the research question, as you are a biostatistician, yet the analysis you propose strikes me as only distantly related to the research question.

    I don't see the role for a Cox model here because survival is not mentioned in the research question; survival is just an attrition factor in your data. The research question you state concerns, instead, the odds ratio of X1 between cases and controls, and whether it is time dependent, specifically dependent on which of two eras, before March 2020 or from March 2020 onward, applies to the observation. So I would model this as a multi-level logistic regression, with X1 as the outcome variable and with the key explanatory variables being case vs control, an indicator for the era beginning in March 2020, and their interaction. I would include the X2-X9 variables and M as covariates. The results for the interaction term will tell you whether there is this kind of time dependence in the case vs control odds ratio of X1. The second level of the model would be the person level, not M, with a random intercept.
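
    Written out, the model described above could look roughly like the following. This is only a sketch: the symbols case_i, post_t, the coefficient names, the choice to enter M as a categorical effect \delta_{M(i)}, and the normal distribution of the person-level random intercept u_i are all assumptions made for illustration, not something fixed by the description above.

```latex
\[
\begin{aligned}
\operatorname{logit}\Pr\bigl(x_1(t,i)=1\bigr)
  &= \beta_0 + \beta_1\,\mathrm{case}_i + \beta_2\,\mathrm{post}_t
     + \beta_3\,\mathrm{case}_i\,\mathrm{post}_t \\
  &\quad + \sum_{k=2}^{9}\gamma_k\,x_k(t,i) + \delta_{M(i)} + u_i,
  \qquad u_i \sim \mathcal{N}(0,\sigma_u^2)
\end{aligned}
\]
% case_i = 1 for a case subject, 0 for a matched control
% post_t = 1 if month t is March 2020 or later, 0 otherwise
% exp(beta_1)          : case vs control odds ratio of X1 before March 2020
% exp(beta_1 + beta_3) : case vs control odds ratio of X1 from March 2020 onward
% exp(beta_3)          : ratio of those two odds ratios
% H_0: beta_3 = 0   versus   H_1: beta_3 > 0
```

    Under this parameterization the interaction coefficient \beta_3 carries the whole question: the case vs control odds ratio is the same in both eras exactly when \beta_3 = 0.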

    Added: What I have proposed here is a simplified model, and it does not take into consideration the possibility that the time-course of variable X1 may itself be associated with survival. In that case, survival-related attrition from the data set can severely bias the estimates it produces. But I think, if that is the case, you will need a more complicated approach. The simplest one that comes to my mind quickly would be a model of survival as a function of X1 (and perhaps M and X2-X9) and then weight the logistic model described above by inverse probability of survival. I am agnostic about what survival modeling approach would be most suitable, but a random effect or frailty at the person level, again, would be warranted.
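
    To make the weighting idea concrete, here is a minimal sketch in plain Python. It is a deliberately simplified stand-in: instead of a weighted multi-level logistic model it computes a weighted 2x2 odds ratio, and it assumes the survival probabilities have already been estimated by some separate survival model. The function name ipw_odds_ratio and the record layout (is_case, x1, p_survival) are hypothetical.

```python
from collections import defaultdict


def ipw_odds_ratio(records):
    """Inverse-probability-of-survival weighted odds ratio of X1, case vs control.

    Each record is a tuple (is_case, x1, p_survival):
      is_case    -- 1 for a case subject, 0 for a control (assumed coding)
      x1         -- observed binary value of X1 in that person-month
      p_survival -- hypothetical, pre-estimated probability that the person
                    is still alive (and hence observed) in that month
    """
    cells = defaultdict(float)
    for is_case, x1, p_survival in records:
        if not 0.0 < p_survival <= 1.0:
            raise ValueError("p_survival must be in (0, 1]")
        # Observations that were unlikely to survive this long count for more.
        cells[(is_case, x1)] += 1.0 / p_survival

    # Raises ZeroDivisionError if an x1 = 0 cell is empty; a real analysis
    # would have to handle sparse cells explicitly.
    odds_case = cells[(1, 1)] / cells[(1, 0)]
    odds_control = cells[(0, 1)] / cells[(0, 0)]
    return odds_case / odds_control


# Made-up toy records, only to show the call.
toy = [
    (1, 1, 0.5),  # case with X1 = 1, weight 2
    (1, 0, 1.0),  # case with X1 = 0, weight 1
    (0, 1, 1.0),  # control with X1 = 1, weight 1
    (0, 0, 1.0),  # control with X1 = 0, weight 1
]
print(ipw_odds_ratio(toy))
```

    In a real analysis the same weights would be passed to the logistic model rather than to a 2x2 table, and the era comparison would come from fitting the interaction term on the weighted data.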

    I imagine that there are other approaches as well, and that other Forum members will contribute their ideas.
    Last edited by Clyde Schechter; 25 Aug 2023, 09:41.


    • #3
      Thank you very much Clyde Schechter! Your explanation is very clear and helpful.

      A multi-level logistic regression seems a good idea. But wouldn't the fact that many more case subjects than controls die during the follow-up cause bias in a logistic regression?


      • #4
        Your #3 crossed with my edits to #2. Yes, that is indeed a problem, and in the edited version of #2 I propose a solution to that. Take a look at what I added to my original #2.
