  • Adjusting a distribution

    Assume we have observational data with a continuous outcome variable and a grouping variable. I would like to plot the distribution of the outcome by group, but adjusted for potential confounders, so that the resulting distributions are net of these other variables. In other words, I want to plot the distributions of the three groups as they would look if the adjusting variables were distributed similarly in all groups. What is the best way to do this? My first idea was to fit an OLS regression, predict values, and plot them:

    Code:
    reg outcome i.group conf1 conf2 conf3
    predict adjusted
    histogram adjusted if group == 1 ...
    Is this sufficient, or would I need interaction terms in the regression model? I also thought about balancing approaches, but those only work for two groups, right?

    Best wishes

    Stata 18.0 MP | ORCID | Google Scholar

  • #2
    I believe your approach will not give you the results you are looking for. Because the model includes i.group, the mean of the predicted values within each group will be exactly the same as the observed group mean.
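
    To see why, here is a one-line sketch. The notation is only an assumption for illustration and does not come from the question: g_i is the group of observation i, n_k the size of group k, and \hat{y}_i the fitted value. OLS residuals are orthogonal to every regressor, and with a constant and i.group in the model this covers the indicator of each group (the base group's indicator is the constant minus the other indicators), so:

    ```latex
    \sum_{i} \mathbf{1}\{g_i = k\}\,\bigl(y_i - \hat{y}_i\bigr) = 0
    \quad\Longrightarrow\quad
    \frac{1}{n_k}\sum_{i:\,g_i = k} \hat{y}_i
      \;=\; \frac{1}{n_k}\sum_{i:\,g_i = k} y_i
      \;=\; \bar{y}_k .
    ```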

    You might want to go in the direction of potential outcomes, in the sense of the teffects ra estimator. That is, fit a separate regression model for each group, then plot the predicted values from those different models:

    Code:
    * model for group 1; predict fills in all observations, not just group 1
    regress outcome conf1 conf2 conf3 if group == 1
    predict adjusted1
    * model for group 2
    regress outcome conf1 conf2 conf3 if group == 2
    predict adjusted2
    ...
    This approach might get you closer to adjusted means. I am not quite sure about the second (and higher) moments, i.e., the variance of the predicted values.
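
    A minimal sketch of that idea, assuming the nhanes2 example data as a stand-in: bpsystol plays the outcome, region the group, and bmi and age the confounders. None of these choices come from the question. It fits one model per group, predicts for all observations, and then checks the means of those predictions against the potential-outcome means reported by teffects ra, which accepts a multivalued treatment.

    ```stata
    * Sketch only: dataset and variable names are illustrative assumptions
    webuse nhanes2, clear

    * One outcome model per group; predict has no -if-, so every
    * observation gets a prediction under each group's model
    forvalues g = 1/4 {
        quietly regress bpsystol bmi age if region == `g'
        predict double adjusted`g', xb
    }

    * Full-sample means of the predictions: regression-adjusted group means
    summarize adjusted1 adjusted2 adjusted3 adjusted4

    * Potential-outcome means from teffects ra, with standard errors
    teffects ra (bpsystol bmi age) (region), pomeans
    ```

    Note that the linear predictions carry no residual variation, so a histogram of adjusted1 will be narrower than the distribution of the outcome itself. The sketch targets the adjusted means only.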


    • #3
      Thanks Daniel (hope to see you tomorrow!). At first I thought this might be a silly question, but apparently not. What I take from your approach is that it is basically a model with all interaction terms included. I have set up a toy example to test this:
      Code:
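      webuse nhanes2, clear

      * Pooled model with the group fully interacted with the covariates
      reg bpsystol i.region##(c.bmi c.age)
      predict p_all

      * Separate models by group; predict fills in all observations
      reg bpsystol c.bmi c.age if region == 1
      predict p_g1

      reg bpsystol c.bmi c.age if region == 2
      predict p_g2

      * Within each group, the two sets of predictions should agree up to rounding
      compare p_all p_g1 if region == 1
      compare p_all p_g2 if region == 2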
      So I guess your approach is very similar to mine in the end. I really wonder, though, how legitimate it is. Is anyone aware of how to balance the second and higher moments? I know, for example, that with kmatch and entropy balancing you can balance the first three moments and the covariances. The only problem is that this only lets me compare two groups (or all pairwise contrasts), which is awkward with more than two groups. Any other advice is highly welcome!


      Best wishes

      Stata 18.0 MP | ORCID | Google Scholar
