  • Different predictions of margins depending on whether # is used or the option over

    Hello,

    Does anybody know why margins produces slightly different results depending on whether the over() option is used (and on the order of the variables) or not? See the example below:
    • first and second syntax produce identical results
    • third and fourth syntax differ from each other and from syntax 1 and 2
    How are the predictions constructed in these examples?

    Code:
    .
    . webuse lbw
    . reg bwt age lwt i.race i.smoke // Standard OLS regression
    
          Source |       SS           df       MS      Number of obs   =       189
    -------------+----------------------------------   F(5, 183)       =      6.38
           Model |  14831671.2         5  2966334.24   Prob > F        =    0.0000
        Residual |  85083627.4       183  464937.854   R-squared       =    0.1484
    -------------+----------------------------------   Adj R-squared   =    0.1252
           Total |  99915298.6       188  531464.354   Root MSE        =    681.86
    
    ------------------------------------------------------------------------------
             bwt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -2.081952   9.816483    -0.21   0.832    -21.44999    17.28609
             lwt |   4.000682   1.737547     2.30   0.022     .5724807    7.428883
                 |
            race |
          black  |  -511.0851   157.0206    -3.25   0.001    -820.8886   -201.2816
          other  |  -400.3064   119.5332    -3.35   0.001    -636.1469    -164.466
                 |
           smoke |
         smoker  |  -401.1584   109.2028    -3.67   0.000    -616.6168      -185.7
           _cons |    2842.58   321.3459     8.85   0.000     2208.561      3476.6
    ------------------------------------------------------------------------------
    
    
    
    . margins i.smoke#i.race
    
    Predictive margins                              Number of obs     =        189
    Model VCE    : OLS
    
    Expression   : Linear prediction, predict()
    
    ----------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
          smoke#race |
    nonsmoker#white  |   3313.569   92.75794    35.72   0.000     3130.556    3496.581
    nonsmoker#black  |   2802.484   145.2183    19.30   0.000     2515.966    3089.001
    nonsmoker#other  |   2913.262   86.80869    33.56   0.000     2741.988    3084.537
       smoker#white  |    2912.41    85.5863    34.03   0.000     2743.547    3081.273
       smoker#black  |   2401.325   153.4352    15.65   0.000     2098.596    2704.055
       smoker#other  |   2512.104   125.3539    20.04   0.000     2264.779    2759.429
    ----------------------------------------------------------------------------------
    
    . margins i.race#i.smoke
    
    Predictive margins                              Number of obs     =        189
    Model VCE    : OLS
    
    Expression   : Linear prediction, predict()
    
    ----------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
          race#smoke |
    white#nonsmoker  |   3313.569   92.75794    35.72   0.000     3130.556    3496.581
       white#smoker  |    2912.41    85.5863    34.03   0.000     2743.547    3081.273
    black#nonsmoker  |   2802.484   145.2183    19.30   0.000     2515.966    3089.001
       black#smoker  |   2401.325   153.4352    15.65   0.000     2098.596    2704.055
    other#nonsmoker  |   2913.262   86.80869    33.56   0.000     2741.988    3084.537
       other#smoker  |   2512.104   125.3539    20.04   0.000     2264.779    2759.429
    ----------------------------------------------------------------------------------
    
    . margins i.smoke, over(i.race)
    
    Predictive margins                              Number of obs     =        189
    Model VCE    : OLS
    
    Expression   : Linear prediction, predict()
    over         : race
    
    ----------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
          race#smoke |
    white#nonsmoker  |   3320.305   91.33457    36.35   0.000       3140.1    3500.509
       white#smoker  |   2919.146   85.72184    34.05   0.000     2750.016    3088.276
    black#nonsmoker  |   2873.984   140.1653    20.50   0.000     2597.436    3150.532
       black#smoker  |   2472.826   149.6606    16.52   0.000     2177.543    2768.108
    other#nonsmoker  |   2875.864   85.56818    33.61   0.000     2707.037    3044.691
       other#smoker  |   2474.706   122.3741    20.22   0.000      2233.26    2716.151
    ----------------------------------------------------------------------------------
    
    . margins i.race, over(i.smoke)
    
    Predictive margins                              Number of obs     =        189
    Model VCE    : OLS
    
    Expression   : Linear prediction, predict()
    over         : smoke
    
    ----------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
          smoke#race |
    nonsmoker#white  |   3317.515   92.33265    35.93   0.000     3135.342    3499.688
    nonsmoker#black  |    2806.43   144.9946    19.36   0.000     2520.354    3092.506
    nonsmoker#other  |   2917.208    87.2106    33.45   0.000     2745.141    3089.276
       smoker#white  |   2906.277   85.74155    33.90   0.000     2737.108    3075.447
       smoker#black  |   2395.192    153.452    15.61   0.000      2092.43    2697.955
       smoker#other  |   2505.971   124.5359    20.12   0.000      2260.26    2751.682
    ----------------------------------------------------------------------------------

    Thank you,
    Mike

  • #2
    That the results of #3 and #4 are somewhat similar to those of #1 and #2 is almost coincidental (I'm exaggerating). Those commands ask for very different things.

    1 and 2 are asking for the expected values of bwt in each combination of smoke and race, averaged over the entire estimation sample with age and lwt preserved at their observed values.

    3 asks for something rather different. First of all, the results are not computed over the entire sample. The results for nonsmoker#white, for example, are computed using only the observations for white non-smokers. Since the distributions of age and lwt differ across the various combinations of smoke and race, the adjustment when -over()- is used is only a partial adjustment. In the first two -margins- commands you get an adjustment for the complete joint distribution of lwt and age.

    • #3
      Thank you Clyde Schechter ... but I'm still a bit puzzled: if the predictions for nonsmoker#white in #3 and #4 are calculated using only the observations for white non-smokers, why aren't the results identical? And shouldn't t-stats for example be worse because of the smaller sample size in these subgroups?

      If I predict the outcome and ask for the average of each combination, I get other results:

      Code:
      . qui reg bwt age lwt i.race i.smoke // Standard OLS regression
      . predict pred
      (option xb assumed; fitted values)
      
      . mean pred, over(smoke race)
      
      Mean estimation                        Number of obs   =        189
      
      -------------------------------------------------------------------
                        |       Mean   Std. Err.     [95% Conf. Interval]
      ------------------+------------------------------------------------
      c.pred@smoke#race |
       nonsmoker#white  |   3343.861   15.21561      3313.845    3373.876
       nonsmoker#black  |   2887.838   42.99267      2803.028    2972.648
       nonsmoker#other  |    2872.45   10.07794      2852.569     2892.33
          smoker#white  |   2899.214   17.13817      2865.406    2933.022
          smoker#black  |   2450.659   42.70376      2366.419    2534.899
          smoker#other  |   2490.356   50.04146      2391.641    2589.071
      -------------------------------------------------------------------

      • #4
        I think my explanation about 3 and 4 was not clear. In 3, each of the calculations is done only over whites, only over blacks, or only over other race, but including both smokers and non-smokers (with the smoke variable alternately set to smoker in all such observations, or to nonsmoker in all such observations). In 4, each is done only over smokers or only over non-smokers (with the race variable alternately set to white, black, or other in all such observations).
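
        As a sketch of the three calculations in symbols (the notation is introduced here for illustration and does not appear in the thread): let $\hat f(a, w, s, r)$ be the fitted linear prediction at age $a$, lwt $w$, smoke $s$ and race $r$, let $N$ be the estimation sample size, and let $N_r$ and $N_s$ be the numbers of observations with race $r$ and smoke $s$.

        ```latex
        \begin{aligned}
        \text{margins i.smoke\#i.race:}\quad
          & \hat m^{\#}_{s,r} = \frac{1}{N}\sum_{i=1}^{N}
            \hat f(\mathrm{age}_i, \mathrm{lwt}_i, s, r) \\
        \text{margins i.smoke, over(i.race):}\quad
          & \hat m^{\mathrm{over(race)}}_{s,r} = \frac{1}{N_r}\sum_{i:\,\mathrm{race}_i = r}
            \hat f(\mathrm{age}_i, \mathrm{lwt}_i, s, r) \\
        \text{margins i.race, over(i.smoke):}\quad
          & \hat m^{\mathrm{over(smoke)}}_{s,r} = \frac{1}{N_s}\sum_{i:\,\mathrm{smoke}_i = s}
            \hat f(\mathrm{age}_i, \mathrm{lwt}_i, s, r)
        \end{aligned}
        ```

        Because the model is linear with no interactions, the three quantities differ only through the means of age and lwt in the set of observations being averaged over: the whole sample, one race group, or one smoking group.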

        Your conclusion about smaller sample size leading to smaller t-statistics is incorrect because in these different subsets of the data, the outcome variance may differ enough to make the standard errors rank differently from the sample sizes.
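
        The difference between # and over() described in this post can be checked by hand for a single cell. The following is a minimal sketch, assuming the same lbw data and regression as in #1 and the usual coding of that dataset (smoke: 0 = nonsmoker, race: 1 = white); the variable and scalar names (xb_hash, xb_over, m_hash, m_over) are made up for illustration.

        ```stata
        * Sketch only: reproduce one cell of two of the -margins- calls by hand.
        webuse lbw, clear
        quietly regress bwt age lwt i.race i.smoke

        * (a) margins i.smoke#i.race, cell nonsmoker#white:
        *     set smoke and race in EVERY observation, average over the full sample
        preserve
        quietly replace smoke = 0
        quietly replace race  = 1
        predict double xb_hash, xb
        summarize xb_hash, meanonly
        scalar m_hash = r(mean)
        restore

        * (b) margins i.smoke, over(i.race), cell white#nonsmoker:
        *     keep only whites (smokers and nonsmokers), set smoke, then average
        preserve
        quietly keep if race == 1
        quietly replace smoke = 0
        predict double xb_over, xb
        summarize xb_over, meanonly
        scalar m_over = r(mean)
        restore

        display "full sample: " m_hash "   whites only: " m_over
        ```

        If the description above is right, the first number should line up with the nonsmoker#white margin from the first command in #1 (3313.569) and the second with the white#nonsmoker margin from the third command (3320.305). The standard errors are not reproduced by this sketch.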

        • #5
          Thank you!

          • #6
            Hello, Clyde Schechter

            I am running an MNL model of son's occupation on father's occupation.
            Son's and father's occupation each have 5 categories.
            Son's education has six distinct categories, from 'no education' to 'graduate & post-graduate'.
            My questions of interest are: 1) whether identical educational attainment results in identical occupational outcomes across religions;
            2) whether the impact of parental occupation differs across religions.


            quietly mlogit son_occ son_age i.son_education hh_size ib(last).father_occupation i.religion religion#ib(last).father_occupation i.income_quartiles i.urban_dummy i.land_holding i.regions ,base(5)

            For question 1), I am calculating
            eststo margin: margins (son_education#religion) if religion==1 , predict(p outcome(1)) post
            and the same for religion==2, 3, 4, 5, and 6.

            Code:
            eststo margin: margins (son_education#religion) if religion==1 , predict(p outcome(1)) post
            Predictive margins                              Number of obs     =      2,398
            Model VCE: OIM
            
            Expression   : Pr(son_occ==professional), predict(p outcome(1))
            
            ----------------------------------------------------------------------------------------
                                   |            Delta-method
                                   |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -----------------------+----------------------------------------------------------------
            son_education#religion |
                              1 1  |   .0095672   .0015834     6.04   0.000     .0064639    .0126705
                              2 1  |   .0135712   .0020035     6.77   0.000     .0096445     .017498
                              3 1  |   .0231374   .0032032     7.22   0.000     .0168593    .0294155
                              4 1  |   .0411354   .0055372     7.43   0.000     .0302826    .0519882
                              5 1  |   .0718266   .0095199     7.54   0.000     .0531679    .0904854
                              6 1  |   .1795261   .0230031     7.80   0.000     .1344409    .2246113
            ----------------------------------------------------------------------------------------
            On the other hand, if I proceed the following way:

            Code:
            .  eststo margin: margins (off_gen_ed_1#category4)  , predict(p outcome(1)) post
            
            Predictive margins                              Number of obs     =     34,521
            Model VCE: OIM
            
            Expression   : Pr(off_occ==professional), predict(p outcome(1))
            
            ----------------------------------------------------------------------------------------
                                   |            Delta-method
                                   |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -----------------------+----------------------------------------------------------------
            son_education#religion |
                              1 1  |   .0275742   .0058968     4.68   0.000     .0160167    .0391318
                              1 2  |    .034824   .0041757     8.34   0.000     .0266398    .0430081
                              1 3  |   .0374419   .0037135    10.08   0.000     .0301636    .0447202
                              1 4  |   .0415213   .0040181    10.33   0.000      .033646    .0493965
                              1 5  |   .0444965   .0043931    10.13   0.000     .0358862    .0531067
                              1 6  |   .0335284   .0038382     8.74   0.000     .0260056    .0410511
                              2 1  |   .0356416   .0066882     5.33   0.000      .022533    .0487502
                              2 2  |   .0424971   .0039232    10.83   0.000     .0348079    .0501864
                              2 3  |   .0432661   .0030025    14.41   0.000     .0373813    .0491509
                              2 4  |   .0485333   .0032213    15.07   0.000     .0422197    .0548469
                              2 5  |   .0519902   .0037965    13.69   0.000     .0445493    .0594311
                              2 6  |   .0396933   .0034906    11.37   0.000     .0328519    .0465346
                              3 1  |   .0539305   .0087631     6.15   0.000     .0367551    .0711058
                              3 2  |   .0629097   .0048352    13.01   0.000     .0534329    .0723865
                              3 3  |   .0615132   .0032202    19.10   0.000     .0552017    .0678247
                              3 4  |    .069464   .0034321    20.24   0.000     .0627373    .0761908
                              3 5  |   .0741169   .0044761    16.56   0.000      .065344    .0828898
                              3 6  |   .0570878   .0040911    13.95   0.000     .0490693    .0651062
                              4 1  |     .08462   .0116782     7.25   0.000     .0617311    .1075088
                              4 2  |   .0989708   .0068054    14.54   0.000     .0856326    .1123091
                              4 3  |   .0934767   .0041862    22.33   0.000     .0852718    .1016815
                              4 4  |   .1061655   .0045913    23.12   0.000     .0971667    .1151643
                              4 5  |   .1123571   .0062883    17.87   0.000     .1000322     .124682
                              4 6  |   .0871644   .0054822    15.90   0.000     .0764195    .0979093
                              5 1  |   .1323411    .015664     8.45   0.000     .1016402    .1630419
                              5 2  |   .1576041   .0104095    15.14   0.000     .1372018    .1780064
                              5 3  |    .144538   .0065804    21.96   0.000     .1316406    .1574354
                              5 4  |   .1652934   .0075461    21.90   0.000     .1505034    .1800835
                              5 5  |     .17225   .0100163    17.20   0.000     .1526184    .1918816
                              5 6  |   .1344296    .008418    15.97   0.000     .1179305    .1509286
                              6 1  |   .2643057   .0258818    10.21   0.000     .2135783    .3150331
                              6 2  |   .3175695   .0178742    17.77   0.000     .2825366    .3526023
                              6 3  |   .2834013   .0116941    24.23   0.000     .2604813    .3063213
                              6 4  |   .3253775   .0125874    25.85   0.000     .3007067    .3500484
                              6 5  |   .3283856   .0176025    18.66   0.000     .2938854    .3628858
                              6 6  |   .2630969   .0159846    16.46   0.000     .2317677     .294426
            ----------------------------------------------------------------------------------------

            These margins are different from those of method #1.

            I am getting more desirable results with method #1.

            But I would like to hear from the experts which is the best way to answer question 1),
            and why the two methods produce two different sets of margins.

            Thank You
