  • Logged Margins: what is "correct"?

    Hey,
    a little question that has been bothering me for a while. I often work with logged dependent variables, do not re-transform them, and mostly interpret signs only. I learned that margins can re-transform predictions, and now I wonder which method is best or correct. I prepared an example:

    Code:
    version 15
    sysuse nlsw88, clear
    gen lwage = log(wage)
    
    reg wage i.union c.ttl_exp
    margins union, at(ttl_exp=(0(4)24))
    marginsplot, name(normal, replace) yscale(range(2 14))
    
    reg lwage i.union c.ttl_exp
    margins union, at(ttl_exp=(0(4)24)) expression(exp(predict(xb)))
    marginsplot, name(exp, replace) yscale(range(2 14))
    
    graph combine normal exp
    The last graph shows that the results are somewhat different. Also the nonlinearity in the second model is interesting. What do you think, which is better and why?

  • #2
    The results are expected to differ: when you fit the model to 'wage', the prediction targets the arithmetic mean, and when you fit it to logged wage and exponentiate, the prediction targets the geometric mean. The predictions will therefore not be the same. See the results below:


    Code:
    ameans wage
    
        Variable |    Type             Obs        Mean       [95% Conf. Interval]
    -------------+---------------------------------------------------------------
            wage | Arithmetic        2,246    7.766949        7.528793   8.005105
                 |  Geometric        2,246    6.479511        6.327275    6.63541
                 |   Harmonic        2,246    5.556362        5.429275   5.689541
    -----------------------------------------------------------------------------
    
    reg wage //Not transformed
    ------------------------------------------------------------------------------
            wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   7.766949   .1214451    63.95   0.000     7.528793    8.005105
    ------------------------------------------------------------------------------
    
    reg lwage //Log transformed
    
    ------------------------------------------------------------------------------
           lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   1.868645    .012124   154.13   0.000      1.84487     1.89242
    ------------------------------------------------------------------------------
    
    di exp(_b[_cons])
    6.4795107
    Whether or not to apply a log (or any other) transformation should be guided by justifications backed by theory and practice in the relevant field, plus several checks. This wage variable looks quite skewed to the right, and the log transformation makes the distribution more symmetric. Again, that is not a conclusive check. If the residuals are symmetrically distributed, I personally do not bother to transform.
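
    A minimal sketch of the skewness and residual-symmetry checks, assuming the same nlsw88 data and the covariates from the first post; the variable name r_wage is made up for illustration:

    Code:
    sysuse nlsw88, clear
    summarize wage, detail              // skewness is reported in the detailed output
    quietly reg wage i.union c.ttl_exp
    predict double r_wage, residuals    // residuals from the untransformed model
    summarize r_wage, detail
    histogram r_wage, normal            // overlay a normal density for a visual check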

    Roman



    • #3
      Thanks a lot! I forgot to mention that the whole point is that the dependent variable is skewed and the transformation is meant to reduce the resulting problems (try estat hettest; the difference between the models is quite large). If I understand you correctly, all I have to mention is that the results from the second model, with the exponentiated margins, are based on the geometric mean, right? Then this is clear to the reader.
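
      For anyone who wants to reproduce the heteroskedasticity comparison, a minimal sketch, assuming the two models from the first post:

      Code:
      sysuse nlsw88, clear
      gen lwage = log(wage)
      
      quietly reg wage i.union c.ttl_exp
      estat hettest                     // test after the untransformed model
      
      quietly reg lwage i.union c.ttl_exp
      estat hettest                     // same test after the logged model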



      • #4
        See also https://blog.stata.com/2011/08/22/us...tell-a-friend/ and the references therein. There is a related literature in health economics: search on "Duan smearing estimator"
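
        A rough sketch of how a smearing retransformation could look for the logged model from the first post. This is an illustration under assumptions, not a definitive implementation: it uses a single smearing factor (the mean of the exponentiated residuals), which presumes the errors are homoskedastic on the log scale. The names xb, res, expres, smear and wage_hat are made up for this example.

        Code:
        sysuse nlsw88, clear
        gen lwage = log(wage)
        reg lwage i.union c.ttl_exp
        
        predict double xb if e(sample), xb
        predict double res if e(sample), residuals
        gen double expres = exp(res)
        summarize expres, meanonly
        scalar smear = r(mean)            // smearing factor
        
        gen double wage_hat = exp(xb) * smear
        summarize wage wage_hat if e(sample)
        
        // margins on the retransformed scale; smear is treated as a fixed constant here
        margins union, at(ttl_exp=(0(4)24)) expression(exp(predict(xb))*`=scalar(smear)')

        Because the smearing factor enters margins as a constant, the reported standard errors ignore the uncertainty in estimating it, so they should be read with caution.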
