Dear Stata users,
I am analysing the association between exposure to heavy metals in pregnancy and childhood outcomes. The analysis has one continuous outcome and three exposures (three different toxic metals), and I am interested in whether there are any interactions between these three metals. However, the metal variables are somewhat correlated (0.10-0.50), and when interaction terms are included in the model, some of the coefficients exhibit correlations > .70. PCA is not an option in this analysis, since I am interested in the effect of the interactions between the variables.

Code:
regress outcome c.logAs##c.logCd##c.logPb
vif

which gives:

Code:
    Variable |       VIF       1/VIF
-------------+----------------------
       logAs |    143.41    0.006973
       logCd |     17.56    0.056961
     c.logAs#|
     c.logCd |    122.00    0.008197
       logPb |      4.84    0.206595
     c.logAs#|
     c.logPb |    135.32    0.007390
     c.logCd#|
     c.logPb |     17.51    0.057119
     c.logAs#|
     c.logCd#|
     c.logPb |    114.29    0.008750
-------------+----------------------
    Mean VIF |     79.27

I played around with Bayesian regression a bit, and ended up mean centering the exposure variables and using the following code:

Code:
bayes, gibbs: regress outcome c.c_logAs##c.c_logCd##c.c_logPb
bayesstats ess
bayesgraph diagnostics _all
bayesgraph matrix _all

which gave me the following outputs:
Code:
Model summary
------------------------------------------------------------------------------------------------
Likelihood:
  outcome ~ normal(xb_outcome,{sigma2})

Priors:
                        {outcome:c_logAs} ~ normal(0,10000)  (1)
                        {outcome:c_logCd} ~ normal(0,10000)  (1)
            {outcome:c.c_logAs#c.c_logCd} ~ normal(0,10000)  (1)
                        {outcome:c_logPb} ~ normal(0,10000)  (1)
            {outcome:c.c_logAs#c.c_logPb} ~ normal(0,10000)  (1)
            {outcome:c.c_logCd#c.c_logPb} ~ normal(0,10000)  (1)
  {outcome:c.c_logAs#c.c_logCd#c.c_logPb} ~ normal(0,10000)  (1)
                          {outcome:_cons} ~ normal(0,10000)  (1)
                                 {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_outcome.

Bayesian linear regression                       MCMC iterations  =     12,500
Gibbs sampling                                   Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        784
                                                 Acceptance rate  =          1
                                                 Efficiency:  min =      .9792
                                                              avg =      .9977
Log marginal likelihood = -1166.9975                          max =          1

------------------------------------------------------------------------------------------------
                              |                                                Equal-tailed
                              |       Mean  Std. Dev.      MCSE     Median  [95% Cred. Interval]
------------------------------+-----------------------------------------------------------------
outcome                       |
                      c_logAs |  -.0134398   .0435736   .000436   -.013836  -.1000772   .0706683
                      c_logCd |   .0280181   .0536855   .000537   .0278446  -.0759983   .1332477
                              |
          c.c_logAs#c.c_logCd |   .0042652   .0625478   .000625   .0043268  -.1183597   .1272451
                              |
                      c_logPb |  -.0868145   .0832914   .000818  -.0873019  -.2495841   .0755957
                              |
          c.c_logAs#c.c_logPb |   .0670613   .1036493   .001036   .0664074  -.1356799   .2720652
                              |
          c.c_logCd#c.c_logPb |  -.1586534   .0944886   .000955  -.1576177  -.3435851     .02337
                              |
c.c_logAs#c.c_logCd#c.c_logPb |  -.0640544   .1351075   .001351  -.0640332  -.3267847   .2029934
                              |
                        _cons |  -.0330334   .0366403   .000366  -.0332321  -.1041869   .0400286
------------------------------+-----------------------------------------------------------------
                       sigma2 |    .985651   .0500573   .000501   .9842162   .8922606   1.088576
------------------------------------------------------------------------------------------------
Note: Default priors are used for model parameters.

Efficiency summaries    MCMC sample size =    10,000

---------------------------------------------------------------------
                              |        ESS   Corr. time    Efficiency
------------------------------+--------------------------------------
outcome                       |
                      c_logAs |   10000.00         1.00        1.0000
                      c_logCd |   10000.00         1.00        1.0000
                              |
          c.c_logAs#c.c_logCd |   10000.00         1.00        1.0000
                              |
                      c_logPb |   10000.00         1.00        1.0000
                              |
          c.c_logAs#c.c_logPb |   10000.00         1.00        1.0000
                              |
          c.c_logCd#c.c_logPb |    9792.00         1.02        0.9792
                              |
c.c_logAs#c.c_logCd#c.c_logPb |   10000.00         1.00        1.0000
                              |
                        _cons |   10000.00         1.00        1.0000
------------------------------+--------------------------------------
                       sigma2 |   10000.00         1.00        1.0000
---------------------------------------------------------------------
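
The mean centering step referred to above is not shown in the code. A minimal sketch of how the centered variables could have been created follows. The c_ prefix matches the variable names used in the Bayesian model, but the exact commands are an assumption, as is the choice to compute the means over complete cases only.

```stata
* Sketch only (not the original commands): center each log-transformed
* exposure at its sample mean.
* Assumes outcome, logAs, logCd and logPb already exist in memory.
* The means are taken over observations with no missing values on any
* model variable, on the assumption that this is the estimation sample.
foreach v of varlist logAs logCd logPb {
    summarize `v' if !missing(outcome, logAs, logCd, logPb), meanonly
    generate double c_`v' = `v' - r(mean)
}
```

The resulting c_logAs, c_logCd and c_logPb can then be used in place of the original variables in the interaction model.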
(The diagnostic plots all look similar to the ones above)
According to the outputs, does it seem that I have (mostly) got rid of the collinearity problem and obtained potentially useful results?
Best,
Kjell Weyde