Adjusting a distribution

Felix Bittmann

Join Date: Aug 2018

Posts: 838
#1

08 Jun 2022, 02:26

Assume we have observational data with a continuous outcome variable and a grouping variable. I would like to plot the distribution of the outcome variable by group, but adjusted for potential confounders, so that the final distributions are net of these other variables. In other words, I want to plot the distributions of the groups (three, say) as they would look if the adjusting variables were highly similar in all groups. What is the best way to do this? My first idea was to fit an OLS regression, predict values, and plot them, like:

Code:

reg outcome i.group conf1 conf2 conf3
predict adjusted
histogram adjusted if group == 1
...

Is this sufficient? Would I need interaction terms in the regression model? I thought about balancing approaches as well but this will only work for 2 groups, right?

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
daniel klein

Join Date: Mar 2014

Posts: 3911
#2

08 Jun 2022, 08:15

I believe your approach will not provide the results you are looking for. The mean of the predicted values for each group will be exactly the same as the observed group means.
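A quick way to check this claim is sketched below. It assumes the nhanes2 example dataset, with bpsystol standing in for the outcome, region for the group, and bmi and age for the confounders; these variables are illustrative stand-ins, not the original poster's data.

```stata
* Sketch: compare observed and predicted group means (assumed example data)
webuse nhanes2, clear

* one pooled model with group indicators, as in post #1
regress bpsystol i.region c.bmi c.age
predict double adjusted

* because the group indicators are in the model, the residuals average
* to zero within each region, so both columns of means should agree
tabstat bpsystol adjusted, by(region) statistics(mean sd)
```

The sd column is included because the predictions vary only through the covariates and the group indicators, so their spread is usually much narrower than that of the observed outcome.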

You might want to go in the direction of potential outcomes, in the sense of the teffects ra estimator. That is, fit a separate regression model for each group, then plot the predicted values from those different models:

Code:

* fit the model within group 1, then predict for all observations
regress outcome conf1 conf2 conf3 if group == 1
predict adjusted1
* same for group 2
regress outcome conf1 conf2 conf3 if group == 2
predict adjusted2
...

This approach might get you closer to adjusted means. I am not quite sure about the second (and higher) moments, i.e., the variance of the predicted values.
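The group-wise approach can be sketched as a loop and compared with teffects ra, which accepts a multivalued treatment. This again assumes the nhanes2 example data (bpsystol, region coded 1 to 4, bmi, age) as stand-ins; the variable names po1 to po4 are made up for illustration.

```stata
* Sketch: regression-adjusted predictions per group (assumed example data)
webuse nhanes2, clear

levelsof region, local(levels)
foreach g of local levels {
    * fit the outcome model using group `g' only ...
    quietly regress bpsystol c.bmi c.age if region == `g'
    * ... but predict for every observation in the data
    predict double po`g'
}

* full-sample means of the predictions, one per group-specific model
summarize po1 po2 po3 po4

* potential-outcome means from the regression-adjustment estimator
teffects ra (bpsystol bmi age) (region), pomeans

* overlay the distributions of the predictions
twoway (kdensity po1) (kdensity po2) (kdensity po3) (kdensity po4)
```

The means from summarize should correspond to the potential-outcome means reported by teffects ra. The overlaid densities only reflect variation in the covariates, which is the concern about the second and higher moments.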
Felix Bittmann

Join Date: Aug 2018

Posts: 838
#3

09 Jun 2022, 00:04

Thanks Daniel (hope to see you tomorrow!). At first I thought this was maybe a silly question, but apparently not. What I read from your approach is that it is basically a model with all interaction terms included. I have set up a toy example to test this:

Code:

webuse nhanes2, clear
reg bpsystol i.region##(c.bmi c.age)
predict p_all
reg bpsystol c.bmi c.age if region == 1
predict p_g1
reg bpsystol c.bmi c.age if region == 2
predict p_g2
compare p_all p_g1 if region == 1
compare p_all p_g2 if region == 2

So I guess your approach is very similar to mine in the end. I really wonder, though, how legitimate it is. Is anyone aware of how to balance the second and higher moments? I know, for example, that with kmatch and entropy balancing you can balance the first three moments and the covariances. The only problem is that this only allows me to compare two groups (or all pairwise contrasts), which is not nice for more than 2 groups. Any other advice is highly welcome!
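One option that is not limited to two groups, and that nobody in this thread has proposed so far, is inverse probability weighting based on a multinomial logit for group membership. The sketch below is only an illustration under assumptions: it uses the nhanes2 variables from the toy example, and the names pr1 to pr4 and ipw are made up. It reweights each group towards the covariate distribution of the full sample, and the result depends on the mlogit model being adequate.

```stata
* Sketch: multi-group reweighting via mlogit (an assumption, not from the thread)
webuse nhanes2, clear

* model group membership as a function of the confounders
mlogit region c.bmi c.age

* predicted probability of belonging to each group
forvalues g = 1/4 {
    predict double pr`g', pr outcome(`g')
}

* weight = 1 / probability of the group actually observed
generate double ipw = .
forvalues g = 1/4 {
    replace ipw = 1/pr`g' if region == `g'
}

* weighted densities of the observed outcome, one per group
twoway (kdensity bpsystol if region == 1 [aweight = ipw]) ///
       (kdensity bpsystol if region == 2 [aweight = ipw]) ///
       (kdensity bpsystol if region == 3 [aweight = ipw]) ///
       (kdensity bpsystol if region == 4 [aweight = ipw])
```

Because the weights are applied to the observed outcome rather than to predictions, the plotted distributions keep their full spread. Covariate balance after weighting should still be checked, for example with weighted group means of bmi and age.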

Best wishes

Stata 18.0 MP | ORCID | Google Scholar