  • Baseline Balance

    Dear Statalist,
    I am planning to analyze data from an RCT. However, when I check the balance table, the baseline values of some outcome variables are not balanced (the treatment-control differences are significantly different from 0). Is there anything I can do to tackle this issue? And with unbalanced baseline characteristics, does it mean that my regression results cannot be interpreted properly? Thank you!

  • #2
    Based on what you have written, all it means is that, as expected, there is some random variation - nothing in your planned analysis needs to change because of this. I assume you used a p-value of 0.05 as your cutoff; in that case, even with purely random variation, you would expect some "statistically significant" results if you did enough tests (you don't say how many you did).
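    To put a number on that point: with perfect randomization and a 0.05 cutoff, the chance of at least one "significant" imbalance grows quickly with the number of balance tests. A quick sketch of the arithmetic (in Python, assuming independent tests for simplicity):

```python
alpha = 0.05
for k in (5, 10, 20):
    # chance of at least one "significant" imbalance among k independent
    # balance tests when randomization is perfect (pure Type 1 errors)
    p_any = 1 - (1 - alpha) ** k
    print(f"{k} tests: P(at least one p < 0.05) = {p_any:.2f}")
```

    With 20 baseline variables, seeing at least one "significant" difference is more likely than not (about 0.64), so a couple of flagged variables is entirely unremarkable.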

    • #3
      Statistical significance is not an appropriate criterion for determining balance between groups in an RCT. In fact, specifically in the context of an RCT, unless you can identify some way in which the randomization protocol was incorrect or not correctly implemented, any "statistically significant" p-value is, by definition, a type 1 error.

      The relevant criterion is: is the difference between the groups large enough to materially affect the outcome variable in the two groups? Depending on the variable and the strength of its association with the outcome, you can have non-statistically significant differences that are nevertheless problems, and you can also have statistically significant differences that are of no importance at all in this regard. So you need to look at the differences between groups from that perspective: are they large enough to materially affect the outcome variable in the groups? And you need to do this for all of the variables, whether the difference is "statistically significant" or not.

      There are a few approaches that can be used to deal with imbalances (assuming you really do have imbalances). Probably the simplest and most common is to represent the offending variable(s) in the analysis by including them (or transforms of them) as covariates. This approach has the advantage that it can be easily applied, provided your sample size is large enough, even if a large number of variables are implicated. The major limitation is that, while it may reduce the confounding effect of the offending variables, it may not eliminate that effect entirely if the way you represent those variables in the analysis does not reflect the true data generating process.
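      A toy simulation of the covariate-adjustment idea (sketched in Python with made-up data, so the numbers are only illustrative, not from any real trial): an imbalanced prognostic covariate biases the crude difference in means, and regressing it out recovers something close to the true effect.

```python
import random

random.seed(1)
n = 2000

# randomized treatment, but covariate x ends up imbalanced by ~0.3 SD
treat = [i % 2 for i in range(n)]
x = [random.gauss(0.3 if t else 0.0, 1.0) for t in treat]
# true treatment effect is 1.0; x also raises the outcome (coefficient 2.0)
y = [1.0 * t + 2.0 * xi + random.gauss(0.0, 1.0) for t, xi in zip(treat, x)]

def mean(v):
    return sum(v) / len(v)

# crude estimate: difference in means, contaminated by the x imbalance
crude = mean([yi for yi, t in zip(y, treat) if t]) - \
        mean([yi for yi, t in zip(y, treat) if not t])

# adjusted estimate via Frisch-Waugh: residualize y and treat on x,
# then regress the y-residuals on the treat-residuals
def residualize(v, on):
    slope = (mean([vi * oi for vi, oi in zip(v, on)]) - mean(v) * mean(on)) / \
            (mean([oi * oi for oi in on]) - mean(on) ** 2)
    intercept = mean(v) - slope * mean(on)
    return [vi - (intercept + slope * oi) for vi, oi in zip(v, on)]

ry, rt = residualize(y, x), residualize(treat, x)
adjusted = sum(p * q for p, q in zip(ry, rt)) / sum(q * q for q in rt)

print(f"crude estimate:    {crude:.2f}")     # biased upward by roughly 2.0 * 0.3
print(f"adjusted estimate: {adjusted:.2f}")  # close to the true effect of 1.0
```

      In Stata the adjusted estimate is just the usual regression of the outcome on treatment plus the covariate; the sketch above only unpacks why that removes the imbalance-induced bias.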

      Another approach that is simple to implement if there is only one, or a very small number, of offending variables is to stratify the analysis on that (those) variable(s). If the findings are essentially the same in all of the strata, then you have dealt with the problem. If they vary, then you have actually gained information: you have uncovered an interaction.
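      Continuing the same toy setup (Python sketch, simulated data), stratifying on an imbalanced binary covariate gives a roughly unbiased effect estimate within each stratum, while the crude pooled comparison stays biased:

```python
import random

random.seed(2)
n = 1000

# randomized treatment; binary covariate z ends up imbalanced (60% vs 40%)
treat = [random.random() < 0.5 for _ in range(n)]
z = [random.random() < (0.6 if t else 0.4) for t in treat]
# true treatment effect is 1.0; z adds 2.0 to the outcome
y = [1.0 * t + 2.0 * zi + random.gauss(0.0, 1.0) for t, zi in zip(treat, z)]

def diff_in_means(rows):
    t1 = [yi for yi, t in rows if t]
    t0 = [yi for yi, t in rows if not t]
    return sum(t1) / len(t1) - sum(t0) / len(t0)

# crude comparison is contaminated by the z imbalance (~1.0 + 2.0 * 0.2)
crude = diff_in_means(list(zip(y, treat)))
print(f"crude estimate: {crude:.2f}")

# stratify on z: estimate the effect separately within z == 0 and z == 1
effects = {}
for stratum in (False, True):
    rows = [(yi, t) for yi, t, zi in zip(y, treat, z) if zi == stratum]
    effects[stratum] = diff_in_means(rows)
    print(f"z = {int(stratum)} (n = {len(rows)}): effect = {effects[stratum]:.2f}")
```

      Here the stratum-specific estimates agree (no interaction), so the problem is dealt with; if they differed materially, that difference would itself be a finding.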

      Another approach is to match on these variables. The problem with that is that this is likely to require dropping some observations that cannot be matched with any in the other group. This may introduce a bias that is just as bad as the confounding effect of the variable that you were trying to eliminate. If, however, you can come up with an adequate matching scheme that does not lead to excluding any of the data, it is a highly effective way to eliminate confounding bias.
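      A minimal sketch of the matching approach and its drawback (Python, toy data): greedy 1:1 exact matching on the imbalanced covariate, which forcibly discards every unit that has no available match in the other arm.

```python
import random

random.seed(4)

# toy data: (unit id, treated?, z) where z is an imbalanced binary covariate
units = [(i, i < 60, random.random() < (0.7 if i < 60 else 0.3))
         for i in range(100)]
treated = [(i, zi) for i, t, zi in units if t]
controls = [(i, zi) for i, t, zi in units if not t]

# greedy 1:1 exact matching on z; anything left unmatched gets dropped
pairs, pool = [], list(controls)
for i, zi in treated:
    match = next((c for c in pool if c[1] == zi), None)
    if match is not None:
        pairs.append((i, match[0]))
        pool.remove(match)

print(f"matched pairs: {len(pairs)} of {len(treated)} treated units")
print(f"dropped: {len(treated) - len(pairs)} treated, {len(pool)} controls")
```

      The matched pairs are balanced on z by construction, but a sizable share of the sample is gone, and if the dropped units differ systematically from the kept ones, that is a new selection bias.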

      Added: Crossed with #2.

      • #4
        Well, although what Clyde Schechter says is sensible, in a sense, I do not agree that analysis plan decisions should be based on whether there is an imbalance (no matter how defined - though I agree that I would not use a p-value to make such a determination). All those decisions should be made in advance. If the imbalances are so severe that they call into question the adequacy of the randomization, then you might have to re-think your analysis, because you might no longer believe that you have a randomized study.

        • #5
          all those decisions should be made in advance - if the imbalances are so severe that they call into question the adequacy of the randomization, then you might have to re-think your analysis because you might no longer believe that you have a randomized study
          When doing an observational study, I agree that decisions about what covariates to include should be made in advance. But in a randomized study, the advance plan would normally be to include no covariates at all. The expectation is that the sample size has been chosen large enough that there probably will be no imbalances large enough to matter. But probably is not certainly, and occasionally an unexpected imbalance arises in a randomized study. What to do then? I see two different approaches which depend on the meta-statistical underpinnings of your analysis.

          If you are doing Neyman-Pearson style null hypothesis significance testing (NHST), you just ignore the imbalance and continue to do the crude analysis of outcome vs treatment assignment. The logic behind this is that you are going to either reject the null hypothesis or not, based on the p-value. The logic underlying null hypothesis rejection is based on the sampling distribution of the test statistic conditional on the null hypothesis. That distribution has already "priced in" the possibility of imbalance on other variables. That is, part of the critical region of the sampling distribution is, in fact, supported by these unbalanced samples. So you will still have a .05 (or whatever alpha you pre-selected) Type 1 error rate, and power is similarly unaffected for the same reason. True, your effect estimate may be biased, but parameter estimation is not the purpose of NHST; staying within the Type 1 and Type 2 error bounds is. So the imbalance is irrelevant in this setting.
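          A small simulation makes the "priced in" claim concrete (a Python sketch using a normal approximation to the two-sample test): under a true null, the crude comparison rejects at about the nominal 5% rate, even though a fair fraction of replications contain noticeable imbalance on a strongly prognostic covariate.

```python
import math
import random

random.seed(3)

def crude_z(y1, y0):
    """Normal-approximation two-sample test statistic for the crude comparison."""
    m1, m0 = sum(y1) / len(y1), sum(y0) / len(y0)
    v1 = sum((v - m1) ** 2 for v in y1) / (len(y1) - 1)
    v0 = sum((v - m0) ** 2 for v in y0) / (len(y0) - 1)
    return (m1 - m0) / math.sqrt(v1 / len(y1) + v0 / len(y0))

reps, n, rejections = 2000, 100, 0
for _ in range(reps):
    # null is true (zero treatment effect); x is strongly prognostic,
    # and in some replications it will be noticeably imbalanced by chance
    x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
    x0 = [random.gauss(0.0, 1.0) for _ in range(n)]
    y1 = [2.0 * xi + random.gauss(0.0, 1.0) for xi in x1]
    y0 = [2.0 * xi + random.gauss(0.0, 1.0) for xi in x0]
    if abs(crude_z(y1, y0)) > 1.96:
        rejections += 1

print(f"Type 1 error rate of the crude test: {rejections / reps:.3f}")  # about 0.05
```

          The unlucky imbalanced replications are exactly the ones supplying part of the critical region, which is why the overall error rate stays at alpha.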

          If, on the other hand, your goal is not to test null hypotheses but to estimate effects, that is an entirely different kettle of fish, because the omission of a sample-level confounder will lead to biased effect estimates. For that reason, you must do something to adjust for the effects of the confounder in this setting. True, this introduces "investigator degrees of freedom" into the analysis and this process can be easily abused to manipulate findings. But your choice is between that and proceeding with a plan that you know, a posteriori, will not achieve your research goal.

          Finally, notwithstanding my earlier remark that in a randomized study, any "statistically significant" imbalance is necessarily a Type I error, I do think that this is one of the situations where p-values and null hypothesis significance testing are valid and useful: the hypothesis that the study was properly randomized is a legitimate, non-straw-man null hypothesis, and it is either true or false. There is no parameter to be estimated. If the number of imbalances is statistically significantly more than expected given the number of variables observed, then even in the absence of prior evidence of flawed randomization, I would take that as a strong signal that something (unrecognized up to this point) has gone wrong with the randomization and would do an in-depth investigation of what happened. If, after that, I remained truly confident that the randomization was properly implemented and I just got a really unlucky sample, then I would proceed with analysis, adjusting for the confounders. If I discovered that the randomization was broken, then I would see if there is some part of the data that were not affected and could be salvaged for analysis. (E.g., in a multi-center study, perhaps one center messed up the randomization, but the others were fine.) And if nothing can be saved, I would abandon the study.
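          The "more imbalances than expected" check is just a binomial tail calculation; a sketch (Python, and note the independence assumption, which real balance tests only approximate):

```python
from math import comb

def p_at_least(k, n, alpha=0.05):
    """P(at least k 'significant' results out of n independent balance tests)
    when randomization is sound, i.e. each test is a Type 1 error with
    probability alpha. Baseline tests are rarely fully independent, so
    treat this as a rough guide only."""
    return sum(comb(n, j) * alpha ** j * (1 - alpha) ** (n - j)
               for j in range(k, n + 1))

print(f"P(>= 1 of 20 tests significant): {p_at_least(1, 20):.3f}")  # unremarkable
print(f"P(>= 4 of 20 tests significant): {p_at_least(4, 20):.4f}")  # a red flag
```

          One "significant" imbalance out of 20 variables is expected more often than not; four or more would be the kind of signal that warrants investigating the randomization itself.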

          • #6
            on some things we're clearly just going to disagree - while I agree that an unadjusted analysis should be done for a randomized study, I also think that the analysis plan will generally call for the use of covariates, if only in an attempt to improve power

            • #7
              Thank you Clyde and Rich for the extensive answer.
              My next question concerns Clyde's point: "The relevant criterion is: is the difference between the groups large enough to materially affect the outcome variable in the two groups. Depending on the variable and the strength of its association with the outcome, you can have non-statistically significant differences that are nevertheless problems, and you can also have statistically significant differences that are of no importance at all in this regard. So you need to look at the differences between groups from that perspective: are they large enough to materially affect the outcome variable in the groups. And you need to do this for all of the variables, whether the difference is "statistically significant" or not."
              So how does one determine whether the difference between the groups is large enough to materially affect the outcome variable? Just to give a brief overview: this RCT was designed to study the long-term impact of a cash transfer program, so the researchers only focused on balance in the related outcomes (health and education). Now I am using the data to study the impact of the cash transfer on bargaining power. Hence, it is possible that the baseline characteristics relevant to bargaining power are not balanced between the treatment and control groups.

              • #8
                Just to give a brief overview, this special RCT was designed to see the long-term impact of a cash transfer program. Hence, the researchers only focused on the balance of related outcomes (health and education). Now, I am using the data to see the impact of the cash transfer on bargaining power. Hence, it is expected that the baseline characteristics of Treatment and control group for bargaining power may not be balanced.

                As I understand it, you say that a previous trial randomized people into recipients and non-recipients of a cash transfer program, and the outcome variables studied were health and education. Presumably other variables were measured as well, though you don't say what they are. Let's just call them X1, X2, etc. Possibly during the original analyses somebody looked into the balance of X1, X2, etc. between the recipients and non-recipients. Now you are proposing to use this same trial but examine a new outcome, namely bargaining power. But the same people are in the two groups as before, and the X1, X2, variables haven't changed. So the group differences in these X's are already known. Let's call them d1, d2, etc.

                The question that must now be asked is: is a difference of d1 in X1 large enough to correspond to a material difference in bargaining power? To answer that question, you need to know how strongly each of the X variables is associated with bargaining power. Measures of association include regression coefficients (for continuous variables) or risk ratios (for discrete variables). So you estimate those things in your sample, and from that you can get an estimate of how much a difference of d1 in X1 amounts to in terms of a difference in bargaining power. You then apply some judgment as to how large that estimated difference in bargaining power is in the context of your study. In particular, if you are trying to estimate a very small effect of cash transfer on bargaining power, then the estimated difference in bargaining power due to a difference of d1 in X1 had better be much smaller still to let it pass: otherwise you need to adjust for the effects of X1. But if the effect of cash transfer on bargaining power is a large one, then you can be more lenient in not adjusting for some imbalances.

                You don't necessarily need to do calculations for every variable. In some instances it may already be well known that the association of X with bargaining power is very weak or close to non-existent. In that case, anything but a gigantic difference in X across treatment groups can be ignored. But when little is known, calculating a quick estimate of the association of X with bargaining power in your sample can support a back of the envelope calculation of how much difference the observed difference in X is likely to make in bargaining power and how large or small that is compared to the effect of cash transfer on bargaining power that you are trying to estimate.
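                The back-of-envelope calculation described above amounts to a couple of multiplications; a toy version (Python, every number hypothetical, just to show the arithmetic):

```python
# Back-of-envelope check; all numbers are hypothetical placeholders
d1 = 0.4              # observed treatment-control difference in X1
beta_x1 = 0.05        # estimated association of X1 with bargaining power
target_effect = 0.10  # effect of the cash transfer you hope to detect

# how much of a bargaining-power difference the X1 imbalance alone
# could produce if left unadjusted
implied_bias = d1 * beta_x1
print(f"implied bias from the X1 imbalance: {implied_bias:.3f}")

# one possible rule of thumb: tolerate bias up to 10% of the target effect
if implied_bias > 0.1 * target_effect:
    print("not negligible relative to the target effect -> adjust for X1")
```

                The 10% tolerance is an arbitrary illustration; the point is only to compare the implied bias against the size of the effect you are trying to estimate.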

                • #9
                  This is perfect, thank you @Clyde for the answer!
