I was surprised to find that, with weight (or aweight), margins does not seem to compute the average partial (marginal) effect using the WLS estimates, and it's not obvious to me what it is doing. I don't think it can be correct. If it's basing the APEs on the weighted equation, or somehow using a weighted average, this is incorrect when the weights are used to account for heteroskedasticity. I can obtain the correct APEs by centering by hand, and they are often very different from what Stata reports. The "by hand" APEs are, in the few examples I've tried, much closer to the APEs produced using OLS. The following uses the data set 401kpart, which can be found here: https://www3.nd.edu/~rwilliam/statafiles/
Below is my code and the Stata output, which implements the WLS estimator suggested in Chapter 8 of my introductory econometrics book. The APE of mrate using OLS is 0.1597905, the same as when I use centering (not shown). My calculation using WLS is 0.1591003 but the Stata margins command returns 0.1059637. I can't figure out where the last number comes from and I'm convinced it can't have much justification.
As a general comment, the formula for the APEs is free of the method of estimation. In the equation
prate = b0 + b1*mrate + b2*age + b3*mrate^2 + b4*mrate*age + u
the partial effect of mrate is b1 + 2*b3*mrate + b4*age, so the APE of mrate is b1 + 2*b3*mrate_bar + b4*age_bar. The averages are just the usual (unweighted) sample averages of mrate and age. Then plug in the estimates of the bj coefficients. The only difference should be that OLS is used in one case and WLS in the other to obtain the bj^. When this is done, the APEs are very close. This also gives the same answer as centering. The one I can't explain is what is reported by margins after using weights.
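The algebra behind the formula and the centering trick can be written out as follows. This is a sketch that assumes E(u | mrate, age) = 0; the symbols d_0, d_1, d_2 for the coefficients in the centered equation are labels introduced here for illustration.

```latex
\begin{align*}
\frac{\partial\, \mathrm{E}(\mathit{prate} \mid \mathit{mrate}, \mathit{age})}{\partial\, \mathit{mrate}}
  &= b_1 + 2 b_3\, \mathit{mrate} + b_4\, \mathit{age}, \\[4pt]
\mathrm{APE}_{\mathit{mrate}}
  &= \frac{1}{n} \sum_{i=1}^{n} \bigl( b_1 + 2 b_3\, \mathit{mrate}_i + b_4\, \mathit{age}_i \bigr)
   = b_1 + 2 b_3\, \overline{\mathit{mrate}} + b_4\, \overline{\mathit{age}}. \\[8pt]
\intertext{Centered equation (same fitted values, different parameterization):}
\mathit{prate}
  &= d_0 + d_1\, \mathit{mrate} + d_2\, \mathit{age}
   + b_3 \bigl(\mathit{mrate} - \overline{\mathit{mrate}}\bigr)^2
   + b_4 \bigl(\mathit{mrate} - \overline{\mathit{mrate}}\bigr)\bigl(\mathit{age} - \overline{\mathit{age}}\bigr) + u. \\[4pt]
\intertext{Expanding and matching the coefficient on $\mathit{mrate}$ with the original equation:}
b_1 &= d_1 - 2 b_3\, \overline{\mathit{mrate}} - b_4\, \overline{\mathit{age}}
\quad\Longrightarrow\quad
d_1 = b_1 + 2 b_3\, \overline{\mathit{mrate}} + b_4\, \overline{\mathit{age}} = \mathrm{APE}_{\mathit{mrate}}.
\end{align*}
```

Because the centered regression is only a reparameterization, this holds whether the bj are estimated by OLS or by WLS, provided the same unweighted sample means are used for centering.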
Final comment: This weighting here has nothing to do with a nonrandom sampling scheme, where one could debate how to properly compute the APEs.
Code:
. qui sum mrate

. scalar mrate_bar = r(mean)

. qui sum age

. scalar age_bar = r(mean)

. reg prate c.mrate c.age c.mrate#c.mrate c.mrate#c.age, vce(r)

Linear regression                               Number of obs     =      4,075
                                                F(4, 4070)        =     176.40
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1314
                                                Root MSE          =     .17482

---------------------------------------------------------------------------------
                |               Robust
          prate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
          mrate |   .2441939   .0198724    12.29   0.000     .2052332    .2831546
            age |   .0053887   .0004146    13.00   0.000     .0045759    .0062015
                |
c.mrate#c.mrate |  -.0513863   .0104353    -4.92   0.000    -.0718452   -.0309275
                |
  c.mrate#c.age |  -.0044911   .0004759    -9.44   0.000    -.0054241   -.0035581
                |
          _cons |   .7239956   .0076064    95.18   0.000     .7090829    .7389084
---------------------------------------------------------------------------------

. di _b[mrate] + 2*_b[c.mrate#c.mrate]*mrate_bar + _b[c.mrate#c.age]*age_bar
.15979052

. margins, dydx(mrate)

Average marginal effects                                 Number of obs = 4,075
Model VCE: Robust

Expression: Linear prediction, predict()
dy/dx wrt:  mrate

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       mrate |   .1597905   .0101427    15.75   0.000     .1399053    .1796757
------------------------------------------------------------------------------

. predict uh, resid

. gen luhsq = log(uh^2)

. reg luhsq c.mrate c.age c.mrate#c.mrate c.mrate#c.age

      Source |       SS           df       MS      Number of obs   =     4,075
-------------+----------------------------------   F(4, 4070)      =    129.65
       Model |  2078.54764         4   519.63691   Prob > F        =    0.0000
    Residual |  16312.7746     4,070  4.00805272   R-squared       =    0.1130
-------------+----------------------------------   Adj R-squared   =    0.1121
       Total |  18391.3222     4,074  4.51431572   Root MSE        =     2.002

---------------------------------------------------------------------------------
          luhsq | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
          mrate |  -1.886746   .2149019    -8.78   0.000    -2.308071   -1.465421
            age |  -.0551837   .0055427    -9.96   0.000    -.0660505   -.0443168
                |
c.mrate#c.mrate |  -.0586135   .1333664    -0.44   0.660    -.3200846    .2028576
                |
  c.mrate#c.age |   .0576433   .0076463     7.54   0.000     .0426524    .0726342
                |
          _cons |  -3.727866   .0693438   -53.76   0.000    -3.863817   -3.591914
---------------------------------------------------------------------------------

. predict gh
(option xb assumed; fitted values)

. gen hh = exp(gh)

. reg prate c.mrate c.age c.mrate#c.mrate c.mrate#c.age [w = 1/hh], vce(r)
(analytic weights assumed)
(sum of wgt is 684,708.534450744)

Linear regression                               Number of obs     =      4,075
                                                F(4, 4070)        =     158.57
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1562
                                                Root MSE          =     .14373

---------------------------------------------------------------------------------
                |               Robust
          prate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
          mrate |   .2386863   .0190309    12.54   0.000     .2013753    .2759974
            age |   .0039402   .0004194     9.39   0.000     .0031178    .0047625
                |
c.mrate#c.mrate |  -.0611719    .010162    -6.02   0.000    -.0810951   -.0412488
                |
  c.mrate#c.age |  -.0027945    .000486    -5.75   0.000    -.0037474   -.0018417
                |
          _cons |   .7368633   .0072664   101.41   0.000     .7226171    .7511094
---------------------------------------------------------------------------------

. di _b[mrate] + 2*_b[c.mrate#c.mrate]*mrate_bar + _b[c.mrate#c.age]*age_bar
.15910026

. margins, dydx(mrate)

Average marginal effects                                 Number of obs = 4,075
Model VCE: Robust

Expression: Linear prediction, predict()
dy/dx wrt:  mrate

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       mrate |   .1059637   .0050276    21.08   0.000     .0961069    .1158206
------------------------------------------------------------------------------

. gen mrate_dm = mrate - mrate_bar

. gen age_dm = age - age_bar

. reg prate c.mrate c.age c.mrate_dm#c.mrate_dm c.mrate_dm#c.age_dm [w = 1/hh], vce(r)
(analytic weights assumed)
(sum of wgt is 684,708.534450744)

Linear regression                               Number of obs     =      4,075
                                                F(4, 4070)        =     158.57
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1562
                                                Root MSE          =     .14373

---------------------------------------------------------------------------------------
                      |               Robust
                prate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
                mrate |   .1591003    .009931    16.02   0.000     .1396301    .1785704
                  age |   .0026449    .000268     9.87   0.000     .0021195    .0031703
                      |
c.mrate_dm#c.mrate_dm |  -.0611719    .010162    -6.02   0.000    -.0810951   -.0412488
                      |
  c.mrate_dm#c.age_dm |  -.0027945    .000486    -5.75   0.000    -.0037474   -.0018417
                      |
                _cons |   .7606102   .0054198   140.34   0.000     .7499845    .7712359
---------------------------------------------------------------------------------------
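One possible way to track down the 0.1059637 is sketched below. It rests on an assumption that the output above does not confirm: that margins, by default, averages the partial effects using the weights from the estimation command, so that weighted means of mrate and age replace mrate_bar and age_bar. The scalar names mrate_wbar and age_wbar are hypothetical and introduced only for this check; hh is the variance function estimate generated above.

```stata
* Weighted means of the regressors, using the same analytic weights as the WLS fit
* (mrate_wbar and age_wbar are illustrative names, not from the original code)
qui sum mrate [aw = 1/hh]
scalar mrate_wbar = r(mean)
qui sum age [aw = 1/hh]
scalar age_wbar = r(mean)

* Refit the uncentered WLS regression so that _b[] refers to it
qui reg prate c.mrate c.age c.mrate#c.mrate c.mrate#c.age [aw = 1/hh], vce(r)

* APE formula evaluated at the weighted means; compare with the default margins output
di _b[mrate] + 2*_b[c.mrate#c.mrate]*mrate_wbar + _b[c.mrate#c.age]*age_wbar

* Ask margins to ignore the estimation weights when averaging; compare with the by-hand APE
margins, dydx(mrate) noweights
```

If the first displayed number matches the default margins result and the noweights result matches the by-hand value, that would point to the weights entering the averaging step rather than anything in the WLS coefficient estimates themselves.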