  • Box-Cox regression interpretation

    Hi StataList,

    Could you please help me interpret the coefficients of the following model?

    Code:
    DAP^0.2564817 = -2.993313 + 0.9320813*Weight^0.2737106 + 0.2308375*DAFrames^0.2737106 + 0.0999711*FluoroFrames^0.2737106

    I would like to be able to say something like:
    "For any a% increase/decrease in Weight, the expected ratio of DAP will be B."
    or
    "We expect about a c% increase/decrease in DAP when Weight increases/decreases by d%."



    I did the following Box-Cox regression:
    Code:
    boxcox DAP Weight DAFrames FluoroFrames , model(theta) lrtest
    The results were:
    Code:
                                                      Number of obs   =      2,811
                                                      LR chi2(4)      =    2775.64
    Log likelihood = -10360.897                       Prob > chi2     =      0.000
     
    ------------------------------------------------------------------------------
             DAP |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         /lambda |   .2737106   .0475931     5.75   0.000     .1804299    .3669914
          /theta |   .2564817   .0169948    15.09   0.000     .2231725    .2897908
    ------------------------------------------------------------------------------
     
    Estimates of scale-variant parameters
    -------------------------------------------------------------
                 |      Coef.  chi2(df)  P>chi2(df)    df of chi2
    -------------+-----------------------------------------------
    Notrans      |
           _cons |  -10.64568
    -------------+-----------------------------------------------
    Trans        |
          Weight |   .9946927  1273.050   0.000           1
        DAFrames |   .2463437   508.567   0.000           1
    FluoroFrames |   .1066865   876.712   0.000           1
    -------------+-----------------------------------------------
          /sigma |   .9905126
    -------------------------------------------------------------
     
    ---------------------------------------------------------------
       Test               Restricted    
        H0:             log likelihood       chi2       Prob > chi2
    ---------------------------------------------------------------
    theta=lambda = -1     -13074.245      5426.70           0.000
    theta=lambda =  0     -10475.138       228.48           0.000
    theta=lambda =  1     -11315.795      1909.80           0.000
    ---------------------------------------------------------------
    Then I generated the power-transformed variables and ran OLS on them:
    Code:
    gen DAP1=DAP^.2564817
    gen Weight1=Weight^.2737106
    gen DAFrames1=DAFrames^.2737106
    gen FluoroFrames1=FluoroFrames^.2737106
    
    regress DAP1 Weight1 DAFrames1 FluoroFrames1
    The results were:
    Code:
          Source |       SS           df       MS      Number of obs   =     2,811
    -------------+----------------------------------   F(3, 2807)      =   1589.88
           Model |  308.273108         3  102.757703   Prob > F        =    0.0000
        Residual |  181.423564     2,807  .064632549   R-squared       =    0.6295
    -------------+----------------------------------   Adj R-squared   =    0.6291
           Total |  489.696672     2,810  .174269278   Root MSE        =    .25423
    
    -------------------------------------------------------------------------------
             DAP1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
          Weight1 |   .9320813   .0232091    40.16   0.000     .8865728    .9775899
        DAFrames1 |   .2308375   .0097467    23.68   0.000      .211726    .2499489
    FluoroFrames1 |   .0999711     .00311    32.15   0.000      .093873    .1060692
            _cons |  -2.993313   .0857925   -34.89   0.000    -3.161536    -2.82509
    -------------------------------------------------------------------------------
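
    For comparison: -boxcox- applies the transform (x^lambda - 1)/lambda to the regressors and (y^theta - 1)/theta to the outcome, not a plain power, which is why the two sets of coefficients differ. A minimal sketch that should closely reproduce the "Trans" coefficients and _cons reported by -boxcox-, assuming all four variables are strictly positive (the *_bc variable names are hypothetical):
    Code:
    * Box-Cox transforms evaluated at the estimated theta and lambda
    gen DAP_bc          = (DAP^.2564817 - 1)/.2564817
    gen Weight_bc       = (Weight^.2737106 - 1)/.2737106
    gen DAFrames_bc     = (DAFrames^.2737106 - 1)/.2737106
    gen FluoroFrames_bc = (FluoroFrames^.2737106 - 1)/.2737106

    * OLS on the Box-Cox scale
    regress DAP_bc Weight_bc DAFrames_bc FluoroFrames_bc
    The plain-power coefficients are these multiplied by theta/lambda; for Weight, .9946927 * .2564817/.2737106 is about .93208. The standard errors from this regression treat theta and lambda as known, so they should not be relied on.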


    Regards,

  • #2
    There's no simple way to interpret these estimates. In 1992 I published a paper in the International Economic Review, "Some Alternatives to the Box-Cox Regression Model," where I pointed out that there is no simple way to recover E(y|x) from the Box-Cox estimates. Even if you make strong assumptions, the resulting estimates and standard errors will be nonrobust. I suggested modeling E(y|x) directly using an inverse Box-Cox form, possibly transforming some covariates, too. One gets direct estimates of the partial effects, and my paper discusses how to compute, say, an elasticity. Estimation by nonlinear or weighted nonlinear least squares is fairly easy. The approach didn't seem to catch on, even though I still don't know how people interpret the Box-Cox estimates. Most, I suspect, violate Jensen's inequality and just act as if the mean passes through the nonlinear function.
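
    To make the Jensen's inequality point concrete, here is a small simulation sketch. It uses made-up data, not the poster's, and the mean 3, standard deviation 0.5, and power 4 are arbitrary assumptions. The transformed outcome u has mean 3, yet the mean of y = u^4 is not 3^4:
    Code:
    clear
    set seed 12345
    set obs 100000
    * u plays the role of the transformed outcome, y^0.25
    gen u = rnormal(3, 0.5)
    * back-transform each observation to the original scale
    gen y = u^4
    * sample mean of y on the original scale
    summarize y
    * naive back-transformation of the mean of u
    display 3^4
    For a normal u, E(u^4) = mu^4 + 6*mu^2*sigma^2 + 3*sigma^4, which is 94.6875 here, so the sample mean of y should come out well above the naive value of 81.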

    Currently, my view is that an exponential model with flexible functions of the covariates -- squares, interactions -- is often sufficient. The coefficients are easy to interpret; in fact, you'd get the elasticity directly by using the logs of the explanatory variables. The estimates are invariant to rescaling, and you can use the Poisson or Gamma QMLEs (available using the Stata -glm- command) along with fully robust inference. The Box-Cox model maintains lots of assumptions (homoskedasticity, normality). Those Box-Cox standard errors are likely very misleading, as they're computed under the assumption that the entire distribution is correct.
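
    A minimal sketch of that exponential-mean approach for the variables in this thread, assuming they are all strictly positive so that the logs are defined (the ln* variable names are hypothetical):
    Code:
    gen lnWeight       = ln(Weight)
    gen lnDAFrames     = ln(DAFrames)
    gen lnFluoroFrames = ln(FluoroFrames)

    * Poisson QMLE of E(DAP|x) = exp(b0 + b1*lnWeight + ...), with robust standard errors
    glm DAP lnWeight lnDAFrames lnFluoroFrames, family(poisson) link(log) vce(robust)

    * Gamma QMLE of the same mean function
    glm DAP lnWeight lnDAFrames lnFluoroFrames, family(gamma) link(log) vce(robust)
    With this specification the coefficient on lnWeight is the elasticity of E(DAP|x) with respect to Weight: a 1% increase in Weight changes expected DAP by roughly that many percent, holding the other covariates fixed.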

    Incidentally, coding my suggested conditional mean function using the -nl- command in Stata should not be too hard. And Stata will give the robust standard errors.
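
    A rough sketch of what that might look like, assuming the inverse Box-Cox mean takes the form E(y|x) = (1 + lambda*xb)^(1/lambda), which tends to exp(xb) as lambda goes to 0. The exact specification should be checked against the paper; the untransformed covariates, the parameter names, and the starting values are all assumptions for illustration:
    Code:
    * parameters go in braces; {name=value} sets a starting value
    nl (DAP = (1 + {lambda=0.25}*({b0=1} + {b1=0.01}*Weight + {b2=0.01}*DAFrames + {b3=0.01}*FluoroFrames))^(1/{lambda})), vce(robust)
    The term raised to the power 1/lambda must stay positive for the expression to be defined, so convergence can be sensitive to the starting values.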
