  • margins and log-transformed dependent variable

    Hello,

    My dependent variable is expressed as log(y + 1). This is done because some of my y values are equal to zero. My independent variables are raw scores. It is a panel dataset. I estimate the effects using fixed-effects OLS.

    When only the dependent variable is log-transformed, we exponentiate the coefficient to obtain the multiplicative factor for every 1-unit increase in the independent variable x. This is clear.

    However, what if I want to understand the effect at a given value of x in the original units of y? Would the following procedure be correct? I am using one of Stata's example datasets since I cannot share my own data.

    My understanding is that specifying expression(exp(xb()) - 1) in margins converts the predictions back to the original units of y, taking into account that 1 was added.

    Code:
    . use https://www.stata-press.com/data/r18/nlswork.dta, clear
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . * Converting log-transformed wage to its original units
    
    . gen wage = exp(ln_wage)
    
    . * Taking the natural log of wage plus 1
    
    . gen ln_wage_plus1 = log(wage + 1)
    
    . xtset idcode year
    
    Panel variable: idcode (unbalanced)
     Time variable: year, 68 to 88, but with gaps
             Delta: 1 unit
    
    . xtreg ln_wage_plus1 i.year tenure union wks_work, fe robust
    
    Fixed-effects (within) regression               Number of obs     =     18,637
    Group variable: idcode                          Number of groups  =      4,112
    
    R-squared:                                      Obs per group:
         Within  = 0.1477                                         min =          1
         Between = 0.2128                                         avg =        4.5
         Overall = 0.1622                                         max =         12
    
                                                    F(14, 4111)       =      99.93
    corr(u_i, Xb) = 0.1698                          Prob > F          =     0.0000
    
                                 (Std. err. adjusted for 4,112 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
    ln_wage_pl~1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            year |
             71  |   .0267133   .0085263     3.13   0.002     .0099971    .0434296
             72  |   .0299803   .0098336     3.05   0.002      .010701    .0492595
             73  |   .0290363   .0107442     2.70   0.007     .0079719    .0501007
             77  |   .0623787    .011784     5.29   0.000     .0392757    .0854816
             78  |   .0832801   .0122998     6.77   0.000     .0591659    .1073943
             80  |   .0207636   .0132531     1.57   0.117    -.0052197    .0467468
             82  |   .0334551   .0133158     2.51   0.012      .007349    .0595612
             83  |   .1083472   .0133449     8.12   0.000      .082184    .1345104
             85  |   .0721576   .0142641     5.06   0.000     .0441922     .100123
             87  |    .086793   .0151642     5.72   0.000     .0570629     .116523
             88  |   .1494352   .0155955     9.58   0.000     .1188596    .1800109
                 |
          tenure |    .012772    .000996    12.82   0.000     .0108192    .0147248
           union |   .0782881   .0082139     9.53   0.000     .0621845    .0943917
        wks_work |   .0015437   .0001197    12.89   0.000      .001309    .0017784
           _cons |   1.699111   .0113788   149.32   0.000     1.676802    1.721419
    -------------+----------------------------------------------------------------
         sigma_u |  .33375129
         sigma_e |  .21308098
             rho |   .7104247   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . summarize wks_work if e(sample)
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
        wks_work |     18,637     63.2597    28.42125          0        104
    
    . margins, at(wks_work = (0(10)100)) expression(exp(xb()) - 1)
    
    Predictive margins                                      Number of obs = 18,637
    Model VCE: Robust
    
    Expression: exp(xb()) - 1
    1._at:  wks_work =   0
    2._at:  wks_work =  10
    3._at:  wks_work =  20
    4._at:  wks_work =  30
    5._at:  wks_work =  40
    6._at:  wks_work =  50
    7._at:  wks_work =  60
    8._at:  wks_work =  70
    9._at:  wks_work =  80
    10._at: wks_work =  90
    11._at: wks_work = 100
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   5.282635    .047974   110.11   0.000     5.188608    5.376663
              2  |   5.380371   .0410873   130.95   0.000     5.299841    5.460901
              3  |   5.479627   .0339773   161.27   0.000     5.413033    5.546221
              4  |   5.580427    .026641   209.47   0.000     5.528212    5.632642
              5  |   5.682795   .0190793   297.85   0.000     5.645401     5.72019
              6  |   5.786756   .0113115   511.58   0.000     5.764586    5.808926
              7  |   5.892334   .0035904  1641.14   0.000     5.885297    5.899371
              8  |   5.999554   .0055614  1078.78   0.000     5.988654    6.010454
              9  |   6.108443   .0139617   437.51   0.000     6.081078    6.135807
             10  |   6.219025   .0227721   273.10   0.000     6.174392    6.263657
             11  |   6.331327   .0318807   198.59   0.000     6.268842    6.393812
    ------------------------------------------------------------------------------

  • #2
    If you estimate the specification
    Code:
    log(y+1) = xb + u
    you are perhaps implicitly assuming that E[u|x]=0.

    Retransformation will give
    Code:
    y = exp(xb)*exp(u) - 1
    so that
    Code:
    E[y|x] = exp(xb)*E[exp(u)|x] - 1
    However, E[u|x]=0 does not imply E[exp(u)|x]=1. So in general, using exp(xb)-1 as you've specified will not correctly describe the conditional mean of y.

    Edward Norton and I discuss such issues in a recent paper https://onlinelibrary.wiley.com/doi/10.1111/obes.12583 in which we also raise concerns about the use of log(y+1)-type transformations of dependent variables.
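John's point can be illustrated with a small simulation (a Python sketch, not part of the original exchange; the coefficients and error variance are made up): with homoskedastic normal errors, exp(xb) - 1 understates E[y|x], while multiplying exp(xb) by Duan's smearing factor (the sample mean of exp(residuals)) recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
sigma = 0.5
x = rng.uniform(0, 1, n)
u = rng.normal(0, sigma, n)             # homoskedastic error, E[u|x] = 0

# True model on the transformed scale: log(y + 1) = 1 + 2*x + u
y = np.exp(1 + 2 * x + u) - 1

xb = 1 + 2 * 0.5                        # linear index at x = 0.5

# Naive retransformation ignores E[exp(u)] != 1
naive = np.exp(xb) - 1

# Duan smearing: scale exp(xb) by the sample mean of exp(residuals)
smear = np.exp(u).mean()                # residuals = u here (true betas used)
corrected = np.exp(xb) * smear - 1

# Exact conditional mean: E[exp(u)] = exp(sigma^2 / 2) for normal u
truth = np.exp(xb) * np.exp(sigma**2 / 2) - 1

print(naive, corrected, truth)
```

Here the smearing factor converges to exp(sigma^2/2) ≈ 1.13, so the naive retransformation understates exp(xb)*E[exp(u)] by roughly 13 percent.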



    • #3
      Originally posted by John Mullahy View Post
      If you estimate the specification
      Code:
      log(y+1) = xb + u
      you are perhaps implicitly assuming that E[u|x]=0.

      Retransformation will give
      Code:
      y = exp(xb)*exp(u) - 1
      so that
      Code:
      E[y|x] = exp(xb)*E[exp(u)|x] - 1
      However, E[u|x]=0 does not imply E[exp(u)|x]=1. So in general, using exp(xb)-1 as you've specified will not correctly describe the conditional mean of y.

      Edward Norton and I discuss such issues in a recent paper https://onlinelibrary.wiley.com/doi/10.1111/obes.12583 in which we also raise concerns about the use of log(y+1)-type transformations of dependent variables.
      Dear John,

      Thank you for your reply and sharing your paper. I will try the alternative estimation methods your study suggests.

      Still, if one were to use the xtreg, fe robust specification with log(y + 1) as the dependent variable (as described in my initial message), would it be possible to say that expression(exp(xb()) - 1) gives the most accurate conversion to the original values of y? Following my initial example, it seems that the point estimates do not deviate much if a simple log(y) is used. This is also the case if the retransformation for the log specification uses the standard Duan homoskedastic smearing estimate (as in your paper).
      Code:
      . use https://www.stata-press.com/data/r18/nlswork.dta, clear
      (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
      
      . xtset idcode year
      
      Panel variable: idcode (unbalanced)
       Time variable: year, 68 to 88, but with gaps
               Delta: 1 unit
      
      . xtreg ln_wage i.year tenure union wks_work, fe robust
      
      Fixed-effects (within) regression               Number of obs     =     18,637
      Group variable: idcode                          Number of groups  =      4,112
      
      R-squared:                                      Obs per group:
           Within  = 0.1435                                         min =          1
           Between = 0.2147                                         avg =        4.5
           Overall = 0.1633                                         max =         12
      
                                                      F(14, 4111)       =      99.61
      corr(u_i, Xb) = 0.1738                          Prob > F          =     0.0000
      
                                   (Std. err. adjusted for 4,112 clusters in idcode)
      ------------------------------------------------------------------------------
                   |               Robust
           ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
              year |
               71  |    .032752   .0104792     3.13   0.002     .0122071    .0532969
               72  |   .0356803   .0120706     2.96   0.003     .0120153    .0593453
               73  |   .0342248   .0131951     2.59   0.010     .0083553    .0600944
               77  |   .0745899   .0144043     5.18   0.000     .0463497    .1028301
               78  |   .1014739   .0149435     6.79   0.000     .0721766    .1307712
               80  |   .0238239   .0160676     1.48   0.138    -.0076773    .0553251
               82  |   .0367923   .0161919     2.27   0.023     .0050474    .0685371
               83  |   .1264629   .0161343     7.84   0.000      .094831    .1580948
               85  |   .0790787   .0172319     4.59   0.000     .0452949    .1128625
               87  |   .0947541   .0183261     5.17   0.000      .058825    .1306832
               88  |   .1711229   .0186361     9.18   0.000     .1345859    .2076598
                   |
            tenure |   .0147531   .0011576    12.74   0.000     .0124835    .0170226
             union |   .0957137   .0096782     9.89   0.000     .0767392    .1146882
          wks_work |   .0018948   .0001449    13.07   0.000     .0016106    .0021789
             _cons |    1.48285    .013827   107.24   0.000     1.455742    1.509958
      -------------+----------------------------------------------------------------
           sigma_u |  .39694574
           sigma_e |  .25262166
               rho |  .71173251   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      
      . margins, at(wks_work = (0(10)100)) expression(exp(xb()))
      
      Predictive margins                                      Number of obs = 18,637
      Model VCE: Robust
      
      Expression: exp(xb())
      1._at:  wks_work =   0
      2._at:  wks_work =  10
      3._at:  wks_work =  20
      4._at:  wks_work =  30
      5._at:  wks_work =  40
      6._at:  wks_work =  50
      7._at:  wks_work =  60
      8._at:  wks_work =  70
      9._at:  wks_work =  80
      10._at: wks_work =  90
      11._at: wks_work = 100
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               _at |
                1  |   5.178712     .04788   108.16   0.000     5.084869    5.272555
                2  |   5.277771   .0411533   128.25   0.000     5.197112     5.35843
                3  |   5.378726    .034155   157.48   0.000     5.311783    5.445668
                4  |   5.481611   .0268803   203.93   0.000     5.428927    5.534295
                5  |   5.586464   .0193292   289.02   0.000      5.54858    5.624349
                6  |   5.693323   .0115257   493.97   0.000     5.670733    5.715913
                7  |   5.802226   .0037896  1531.08   0.000     5.794799    5.809654
                8  |   5.913212   .0057828  1022.54   0.000     5.901878    5.924546
                9  |   6.026321   .0143722   419.30   0.000     5.998152     6.05449
               10  |   6.141594   .0234836   261.53   0.000     6.095567    6.187621
               11  |   6.259072   .0329747   189.81   0.000     6.194442    6.323701
      ------------------------------------------------------------------------------



      • #4
        would it be possible to say that expression(exp(xb()) - 1) gives the most accurate conversion to the original values of y?
        I suppose I'm conservative when it comes to making such statements, Marco. So to me an assertion of "most accurate" would be hard to support.

        Instead, is there any reason you couldn't use expression(exp(xb()) - 1) and then simply report something like what you wrote:

        the point estimates do not deviate much if simple log(y) is used. This is also the case if the retransformation for the log function using the standard Duan homoskedastic smearing estimate...is used.
        without advancing a claim of "most accurate"?



        • #5
          Originally posted by John Mullahy View Post

          I suppose I'm conservative when it comes to making such statements, Marco. So to me an assertion of "most accurate" would be hard to support.

          Instead, is there any reason you couldn't use expression(exp(xb()) - 1) and then simply report something like what you wrote:

          without advancing a claim of "most accurate"?
          Thank you, John. Duly noted.



          • #6
            You could also use an exponential mean and the Poisson fixed effects estimator to directly get the semi-elasticities. This requires no assumptions of the type John mentioned. If you get similar results, it's another robustness check.

            A word of caution about your example: The variable wage never takes the value zero, and so adding one before taking the log is going to be less harmful. If you have lots of zeros in your application then it can have a huge effect -- see Mullahy and Norton!

            Also, the estimated effects you obtain are not invariant to how you measure wage. If you change from dollars to cents, say, the estimated percentage effects will change. That's a bad thing. This won't happen with the Poisson FE estimator and an exponential mean.
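Jeff's unit-invariance point can be checked numerically (an illustrative Python sketch with made-up coefficients, not from the thread): rescaling y changes the slope on log(y + 1), whereas under a plain log, rescaling only shifts the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.uniform(0, 1, n)
# Exponential conditional mean with multiplicative error; units of y are arbitrary
y = np.exp(0.5 + 1.0 * x) * rng.lognormal(0, 0.3, n)

def slope(dep):
    """OLS slope of dep on x."""
    return np.polyfit(x, dep, 1)[0]

b_dollars = slope(np.log(y + 1))        # y measured in "dollars"
b_cents   = slope(np.log(100 * y + 1))  # same y measured in "cents"
b_log     = slope(np.log(100 * y))      # plain log: slope unaffected by rescaling

print(b_dollars, b_cents, b_log)
```

The +1 is negligible relative to y measured in cents but not in dollars, which is what moves the estimated slope; the plain-log slope stays at its true value regardless of units.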



            • #7
              Originally posted by Jeff Wooldridge View Post
              You could also use an exponential mean and the Poisson fixed effects estimator to directly get the semi-elasticities. This requires no assumptions of the type John mentioned. If you get similar results, it's another robustness check.

              A word of caution about your example: The variable wage never takes the value zero, and so adding one before taking the log is going to be less harmful. If you have lots of zeros in your application then it can have a huge effect -- see Mullahy and Norton!

              Also, the estimated effects you obtain are not invariant to how you measure wage. If you change from dollars to cents, say, the estimated percentage effects will change. That's a bad thing. This won't happen with the Poisson FE estimator and an exponential mean.
              Dear Jeff,

              Thank you for this additional clarification.

              In my raw data (again, sorry for not being able to show the results), the standard deviation of the dependent variable is 3 times higher than the mean. In this case, using the raw data with zeroes, would it be appropriate to use a conditional (xtnbreg y i.year x, fe) or unconditional (nbreg y i.id i.year x) fixed-effects negative binomial instead of the fixed-effects Poisson estimator (xtpoisson y i.year x, fe)?

              My fixed-effects OLS results with the log-plus-1 dependent variable (xtreg ln_y_plus1 i.year x, fe) are more closely aligned with the conditional and unconditional negative binomial results than with the Poisson estimates.

              Would be grateful for your help.



              • #8
                the standard deviation of the dependent variable is 3 times higher than the mean
                Here, you are examining the marginal distribution of the outcome, whereas overdispersion/underdispersion is a property of the conditional distribution. In any case, the Poisson estimator with -vce(robust)- allows any kind of variance-mean relationship.
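That robustness can be illustrated with a small simulation (a hypothetical Python sketch, not from the thread): a Poisson quasi-MLE, fit here by hand with Newton-Raphson, still recovers the slope of the exponential mean even when the counts are deliberately overdispersed negative binomial draws.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
x = rng.uniform(0, 1, n)
mu = np.exp(0.2 + 1.0 * x)              # exponential conditional mean

# Overdispersed counts: negative binomial with Var = mu + mu^2/theta > mu
theta = 1.0
y = rng.negative_binomial(theta, theta / (theta + mu))

# Poisson quasi-MLE via Newton-Raphson on beta = (intercept, slope)
X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(40):
    m = np.exp(X @ b)
    grad = X.T @ (y - m)                # Poisson score
    hess = (X * m[:, None]).T @ X       # X' diag(m) X
    b = b + np.linalg.solve(hess, grad)

print(b)  # slope close to the true value 1.0 despite overdispersion
```

Only the conditional mean exp(xb) needs to be correctly specified for consistency; robust (sandwich) standard errors then account for the misspecified variance.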

                In this case, using the raw data with zeroes, would it be appropriate to use a conditional (xtnbreg y i.year x, fe) or unconditional (nbreg y i.id i.year x) fixed-effects negative binomial instead of the fixed-effects Poisson estimator (xtpoisson y i.year x, fe)?
                Jeff has compiled a list of reasons why you should almost never use the FE NegBin estimator. See #3 https://www.statalist.org/forums/for...-poisson-model
                Last edited by Andrew Musau; 08 Jan 2024, 11:29.

