
  • confusion about margins vs. predict

    I've been using margins for a while, and my coauthor asked if we could replicate the same results using predict. We tried using toy data and got confused. Here is an example:

    Code:
    //method 1
    qui regress sbp i.sex c.age c.weight
    margins i.sex
    Predictive margins                                       Number of obs = 1,265
    Model VCE: OLS

    Expression: Linear prediction, predict()

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             sex |
         Female  |   130.6924   .7847317   166.54   0.000     129.1529     132.232
           Male  |   130.6197   .8434945   154.86   0.000     128.9649    132.2745
    ------------------------------------------------------------------------------



    Code:
    //method 2
    qui regress sbp i.sex c.age c.weight
    predict yhat3, xb
    mean yhat3, over(sex)
    Mean estimation                          Number of obs = 1,265

    --------------------------------------------------------------
                 |       Mean   Std. err.     [95% conf. interval]
    -------------+------------------------------------------------
     c.yhat3@sex |
         Female  |   128.3052   .5653357      127.1961    129.4143
           Male  |   133.3508   .5569872      132.2581    134.4436
    --------------------------------------------------------------



    Code:
    //method 3
    qui regress sbp i.sex c.age c.weight
    replace sex = 0
    predict yhat0, xb
    replace sex = 1
    predict yhat1, xb
    mean yhat0 yhat1
    Mean estimation                          Number of obs = 1,265

    --------------------------------------------------------------
                 |       Mean   Std. err.     [95% conf. interval]
    -------------+------------------------------------------------
           yhat0 |   130.6924   .4043758      129.8991    131.4858
           yhat1 |   130.6197   .4043758      129.8263     131.413
    --------------------------------------------------------------



    We thought margins (method 1) should match the results of method 2, but in fact, it's method 3 that matches margins.

    I wonder what it means to predict outcomes by assuming first that everyone is female and then that everyone is male (method 3). Can someone explain what margins does here?

    Thanks!



  • #2
    What you did in method 2 is average the fitted values within each sex, so each mean is evaluated over that sex's own distribution of age and weight. (In an OLS fit that includes i.sex, these averages coincide with the raw mean sbp in each sex.) However, in a real-world sample of adults, women would typically be both older and lighter than men, and this calculation does not adjust for those differences in the distributions of age and weight. The two means are therefore not fully comparable to each other: they are apples and oranges.
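
    As a minimal sketch, assuming the same sbp, sex, age, and weight variables as in post #1: method 2 can also be requested from -margins- itself. The over() option averages the linear predictions within each sex group's own observations, so the point estimates should agree with the -mean yhat3, over(sex)- results, although the standard errors need not.

    Code:
    qui regress sbp i.sex c.age c.weight
    // average the predictions separately within each sex,
    // using each group's own observed age and weight values
    margins, over(sex)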

    What -margins- does, and what you emulated in method 3, is use the entire sample's distribution of age and weight to adjust the estimated mean sbp for both men and women. Both means are adjusted for the effects of age and weight in the same way and standardized to the same distribution, so they are apples and apples and fully comparable. No unadjusted difference in the distributions of age and weight contaminates these results.
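
    The method 3 emulation can be sketched without permanently overwriting sex, again assuming the variables from post #1 and that sex is coded 0 = female, 1 = male (which is what the matching estimates in post #1 suggest). The point estimates should match -margins i.sex-, but the standard errors will not: -mean- treats the predictions as fixed data and ignores the sampling uncertainty in the regression coefficients, whereas -margins- uses the delta method.

    Code:
    qui regress sbp i.sex c.age c.weight
    preserve                 // keep a copy of the original data
    replace sex = 0          // treat everyone as female (assumed coding)
    predict yhat0, xb
    replace sex = 1          // treat everyone as male (assumed coding)
    predict yhat1, xb
    mean yhat0 yhat1         // same point estimates as margins i.sex
    restore                  // bring back the observed values of sex

    Since yhat0 and yhat1 differ only by a constant in this model, -mean- reports the same standard error for both, as seen in post #1.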



    • #3
      Thank you so much!

      This makes so much sense now!
