  • Multiple Regression - comparing adjusted means

    Hello all,

    I have a dataset from the medical industry. An example of the data is shown in the attached table (image).

    [Attached image: stata table 1.JPG]


    I have 3 dependent variables (DVs), all continuous; some were transformed due to non-normality, some were not. In addition, I have two independent variables (IVs), which are the variables to be tested (the main research variables). Both came from a factor analysis and were categorized: the first two factor scores were categorized into quartiles (Q1, Q2, Q3 or Q4).

    In addition, I have several covariates, which I need for adjustment. These include age, BMI, smoking habits, etc.

    The main goal of the analysis is to look for differences in the means of all DVs across the groups of both IVs; that is, for each DV, to see whether the means differ between the quartiles of each IV.

    I have several questions that I hope you can assist me with; some are more technical, some methodological. I am using Stata 14.2.

    1) I need a regression model that accepts both continuous and categorical IVs and covariates and yields the adjusted means. I am not sure which Stata procedure to use here.

    2) In comparing the quartiles (for each IV), I am interested in checking whether there is a trend (for example, whether the adjusted mean of Q4 > mean of Q3 > mean of Q2 > mean of Q1). Is there a test for that in Stata? In addition, I am interested in simply comparing a pair of quartiles (mainly Q4 vs Q1). How do I get Stata to make pairwise comparisons while correcting the significance level?

    3) Methodologically, I need to adjust for covariates such as age and BMI. Since I have quite a few of them, is there a risk that putting too many variables in the model will make the effects of the IVs disappear completely, that is, a risk of "over-adjusting"? What is the best way to handle this? Can Stata choose the best subset of covariates (including two-way interactions)?


    Thank you in advance for any tips you can give me.

  • #2
    I gather you will need the - manova - as well as the - mvreg - command.

    You may find helpful examples in the Stata Manual.
    Best regards,

    Marcos


    • #3
      You'll have a greater chance of a useful answer if you follow the FAQ on asking questions -- provide Stata code in code delimiters, Stata output, and sample data using dataex. You're asking a bunch of vague questions, which makes precise answers difficult.

      To extend Marcos's comment, I didn't see where you wanted to integrate the estimation of the three DVs. Do you want three separate models or one integrated model? manova and mvreg deal with integrated models with multiple DVs. These models make specific assumptions about how the x's influence the y's (e.g., all x's may influence y's through one unobserved variable).
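
      As a rough sketch of what the integrated route could look like, the block below fits one mvreg model for three outcomes. The data are simulated and every variable name (f1, f1q, age, y1, y2, y3) is hypothetical, standing in for your factor score, its quartile version, a covariate, and your DVs.

```stata
* Simulated stand-in data; all names below are hypothetical
clear
set seed 2017
set obs 400
generate f1  = rnormal()                     // a factor score
xtile    f1q = f1, nquantiles(4)             // quartiles: 1 = Q1, ..., 4 = Q4
generate age = 40 + 10*rnormal()
generate u   = rnormal()                     // error component shared by the DVs
generate y1  = 0.5*f1 + 0.02*age + u + rnormal()
generate y2  = 0.3*f1 + 0.01*age + u + rnormal()
generate y3  = 0.02*age + u + rnormal()

* One integrated model: the same regressors for all three DVs.
* The corr option reports the correlation matrix of the residuals.
mvreg y1 y2 y3 = i.f1q c.age, corr

* Cross-equation test: is the Q4-vs-Q1 difference the same for y1 and y2?
test [y1]4.f1q = [y2]4.f1q
```

      If you never need a cross-equation test like the last line, separate regressions are probably simpler to report.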

      If you run separate regressions, you can fit each one with regress and then use the margins command to get your adjusted means. Seemingly unrelated regression (sureg) lets the errors covary across the three equations. [You may want margins even if you go with manova or mvreg.] You can test for differences in predicted values with tests after margins.
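
      A minimal sketch of the separate-regression route for one DV, covering adjusted means, pairwise comparisons, and a trend test. The data are simulated and the names (f1q, age, bmi, y1) are hypothetical; Bonferroni is just one of the adjustments mcompare() accepts, not a recommendation.

```stata
* Simulated stand-in data; all names below are hypothetical
clear
set seed 2017
set obs 400
generate f1  = rnormal()
xtile    f1q = f1, nquantiles(4)             // quartiles: 1 = Q1, ..., 4 = Q4
generate age = 40 + 10*rnormal()
generate bmi = 25 + 4*rnormal()
generate y1  = 0.5*f1 + 0.02*age + rnormal()

* One DV, one categorical IV, continuous covariates
regress y1 i.f1q c.age c.bmi

* Adjusted mean of y1 in each quartile
margins f1q

* All pairwise differences between quartiles, Bonferroni-adjusted
margins f1q, pwcompare(effects) mcompare(bonferroni)

* Q4 vs Q1 only (Q1 is the base level, so this coefficient is the difference)
lincom 4.f1q

* Orthogonal polynomial contrasts: linear, quadratic, cubic trend across quartiles
contrast p.f1q
```

      The linear term of the contrast output is the usual test for trend across ordered categories.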

      Factor-variable notation (the i., c., # and ## operators) is used to indicate categorical variables and their interactions. It makes both specification and interpretation (via the margins and marginsplot commands) much easier.
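
      To illustrate the notation, here is a sketch with two quartile IVs and their interaction. Again the data are simulated and the names (f1q, f2q, age, y1) are hypothetical.

```stata
* Simulated stand-in data; all names below are hypothetical
clear
set seed 2017
set obs 400
generate f1  = rnormal()
generate f2  = rnormal()
xtile    f1q = f1, nquantiles(4)
xtile    f2q = f2, nquantiles(4)
generate age = 40 + 10*rnormal()
generate y1  = 0.5*f1 + 0.3*f2 + 0.02*age + rnormal()

* i. marks a categorical variable, c. a continuous one;
* ## adds both main effects and their interaction
regress y1 i.f1q##i.f2q c.age

* Joint test of all interaction terms
testparm i.f1q#i.f2q

* Adjusted means for every combination of quartiles, then a plot of them
margins f1q#f2q
marginsplot
```

      If the joint test gives no sign of an interaction, dropping ## in favor of plain i.f1q i.f2q keeps the adjusted means easier to read.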

      There are many things written about what variables to include as controls. There are also model selection criteria like BIC and AIC. Stata does have a stepwise procedure, but most of the respondents on this list are strongly opposed to stepwise procedures (you're running piles of regressions to pick some, resulting in both overfitting and questions about all your statistics). It is possible that including too many controls (relative to your sample size) may simply make it hard to get precise estimates of any of the parameters. I'd look at the norms in your area to see what should be included in the controls.
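
      One way to compare a small and a large set of controls on AIC and BIC, as a sketch only. The data are simulated, the names (f1q, age, bmi, smoker, y1) are hypothetical, and which covariates belong in the model is still a subject-matter decision.

```stata
* Simulated stand-in data; all names below are hypothetical
clear
set seed 2017
set obs 400
generate f1     = rnormal()
xtile    f1q    = f1, nquantiles(4)
generate age    = 40 + 10*rnormal()
generate bmi    = 25 + 4*rnormal()
generate smoker = runiform() < 0.3
generate y1     = 0.5*f1 + 0.02*age + rnormal()

* Candidate 1: few controls
quietly regress y1 i.f1q c.age
estimates store small

* Candidate 2: more controls
quietly regress y1 i.f1q c.age c.bmi i.smoker
estimates store large

* Side-by-side AIC and BIC (smaller is better); both models use the same sample
estimates stats small large
```

      Both candidates have to be fit on the same observations for the comparison to mean anything, so watch for covariates with missing values.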
