
  • #16
    I initially used a linear probability model, and the coefficient on ln(income) is 0.00875. I have interpreted this as: the increase in the probability of y=1 associated with a 1% increase in income is 0.0000875 percentage points (basically no effect).
    This is, for practical purposes, correct. It is actually a commonly used approximation, and the actual value is 0.00875*log(1.01) = 0.00008707 increase in the outcome variable. Assuming the outcome variable is, itself, a percentage, then that is 0.00008707 percentage points. Evidently the difference between 0.0000875 and 0.00008707 is negligible in almost any real world context, validating the use of the common approximation.
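The arithmetic above can be checked in a few lines. This is a sketch in Python rather than Stata, using only the numbers quoted in the post:

```python
import math

# Coefficient on ln(income) from the linear probability model, as quoted above
b = 0.00875

# Common approximation: a 1% increase in income shifts the probability by b/100
approx = b / 100            # 0.0000875

# Exact change: a 1% increase multiplies income by 1.01, so ln(income)
# rises by ln(1.01), and the probability changes by b * ln(1.01)
exact = b * math.log(1.01)  # approximately 0.00008707

print(approx, exact)
```

The two values differ by about half a percent of each other, which is why the approximation is considered safe in practice.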

    The marginal effect at means on the probit model on ln(income) is 0.00907. I have interpreted this as: the probability of y=1 associated with a 172% increase in income is a 0.00907% point increase.
    Therefore, the probability of y=1 associated with a 1% increase in income is a 0.00907/172= 0.000053% point increase (basically no effect).
    This is not correct. ln(income) is, presumably, a continuous variable in your model. The marginal effect is therefore a (partial) derivative of the outcome with respect to ln(income). So the instantaneous rate at which the probability of your outcome increases is 0.00907 per unit increase in ln(income). But due to the non-linearity of the probit model, the conclusions you have drawn do not follow from that.

    The marginal effect is like the reading of a speedometer in a car. Just because the speedometer says that I'm driving at 50 km/hr, it does not follow that I will go 50 km in the next hour, because my speed may vary. And in these non-linear models, the "speed" actually does vary. A 172% increase in income is a large change, and the marginal effect at the mean will not be representative of the marginal effects at values between the mean and the mean + 172%. Depending on lots of things, it could be close, or way off in either direction.

    So it is better to apply the marginal effect only over very small changes (which takes the nature of the marginal effect as a partial derivative seriously). Consider a 1% increase in income. Income goes up by a factor of 1.01, which means ln(income) increases by ln(1.01) = 0.00995033. So the expected outcome probability will go up by 0.00907*ln(1.01) = 0.00009025, which is 0.009025 percentage points.
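The speedometer point can be made concrete numerically. Below is a Python sketch, not Stata output: the first part reproduces the small-change calculation from the post, and the second part uses made-up probit coefficients (b and xb are hypothetical) to show that extrapolating a marginal effect over a full unit change in ln(income) need not match the exact change in probability:

```python
import math

def norm_pdf(z):
    # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    # standard normal distribution function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Small change: marginal effect at means of 0.00907, as quoted in the post.
# A 1% increase in income raises ln(income) by ln(1.01).
me_at_mean = 0.00907
small_change = me_at_mean * math.log(1.01)   # approximately 0.00009025

# Large change, with hypothetical values (b and xb are made up):
b = 0.05     # probit coefficient on ln(income)
xb = -2.0    # linear index at the means
exact_big = norm_cdf(xb + b) - norm_cdf(xb)  # exact effect of a unit rise in ln(income)
linear_big = norm_pdf(xb) * b                # marginal-effect-at-means extrapolation

print(small_change, exact_big, linear_big)
```

For the small 1% change the derivative approximation is excellent; for the unit change the two numbers visibly diverge, because the probit derivative varies along the way.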

    As for whether these differences in predicted probability are small enough to say that the effect is negligible, that is a judgment that must be made taking into account the context and the meaning of the variables. It is a judgment call to be made by somebody familiar with the content area in your field; it is not a statistical question. In any case, even if the difference is small enough to be called negligible, or unimportant, or "for all practical purposes, no effect," I would never say it is "no effect."



    • #17
      Originally posted by Clyde Schechter:

      Suppose your outcome variable is y, and your predictor variable is x, but, for whatever reason, you choose to use log x as the predictor in the model and run this:

      Code:
      gen log_x = log(x)
      probit y log_x other_variables
      margins, dydx(*) atmeans
      And suppose the margin for log_x is 0.0729.

      This means that a difference of 1 in log x (not 1%, nor 1 percentage point: logarithms are dimensionless) is associated with an increase of 0.0729 in the probability of y = 1. So, if the "baseline" probability is, say, 0.05, an increase of 1 in log x is associated with an expected probability of 0.1229. Note that a difference of 1 in log x, when viewed from the perspective of x itself, means x being multiplied by 2.71828..., which is roughly a 172% increase in x.
      Dear Clyde,

      Would that be the interpretation for Marginal Effects at Means (MEM) or also for Average Marginal Effects (AME)? For example if I run:
      Code:
      margins, dydx(*) post predict(pr)
      I am a bit confused. Could you please help me clear it up? Thanks a lot!
      Last edited by Javier Gutierrez; 18 Aug 2019, 11:20.



      • #18
        This post is only very tangentially related to the topic of this thread. It is important not to change subjects during a thread. These are not dialogs between two people. People come and search the forum for answers based on the thread topics. Somebody with a problem similar to the original topic of this thread will waste his or her time reading your post and my response. Somebody with a problem similar to yours will be unable to find it. So in the future, please do not add to an existing thread unless your question is closely related to the topic of that thread. If in doubt, start a new thread.

        -margins, dydx(*)- (the other options are irrelevant for the present discussion) calculates average marginal effects. For the sake of discussion, let's assume that the outcome variable is y and there is only one predictor, x. Stata proceeds to calculate the marginal effect of x on y at the observed value of x in each observation. If x is continuous, this means calculating the first derivative of y with respect to x at each value of x. If x is discrete, it means calculating E(y | x = observed x + 1) - E(y | x = observed x). In either case, all of these individual marginal effects are then averaged, and the average is reported.

        -margins, dydx(*) atmeans- calculates marginal effects at means. In this case, the actual observed values of x are ignored and, in every observation, they are replaced by the average value of x. Now the marginal effect of x on y at that mean value is calculated, and those results (which in the simple case presented here are all exactly the same) are averaged and the average is reported.

        With linear models, the results are the same for AME and MEM. But with non-linear models they will differ.
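The AME/MEM distinction described above can be illustrated with a toy computation. This is a Python sketch with made-up coefficients and data, not output from -margins-:

```python
import math

def norm_pdf(z):
    # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Hypothetical probit with index b0 + b1*x; coefficients and data are made up
b0, b1 = -1.0, 0.8
xs = [0.0, 0.5, 1.0, 2.0, 4.0]

# AME: evaluate the derivative phi(b0 + b1*x) * b1 at each observed x, then average
ame = sum(norm_pdf(b0 + b1 * x) * b1 for x in xs) / len(xs)

# MEM: evaluate the same derivative once, at the sample mean of x
xbar = sum(xs) / len(xs)
mem = norm_pdf(b0 + b1 * xbar) * b1

print(ame, mem)  # the two differ because the probit is non-linear
```

In a linear model the derivative would be constant, so averaging derivatives (AME) and taking the derivative at the average (MEM) would coincide; the normal density inside the probit breaks that equivalence.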



        • #19
          Dear Clyde,

          Thanks for your answer. Sorry for the confusion, I should have explained my question better.
          My question, I believe, was related to the original question of the thread about how to interpret Marginal Effects in Probit model for a Log-Transformed Variable for Marginal Effects at Means (MEM) and for Average Marginal Effects (AME).

          As I understood it, you explained how the margin for log_x would be interpreted for Marginal Effects at Means (roughly a 172% increase in x). I was wondering how the margin for log_x would be interpreted for Average Marginal Effects. I know that AME and MEM are different and yield different results; I was just wondering about their interpretation for a log-transformed variable. For example, if I run:

          Code:
          gen log_x = log(x)
          probit y log_x other_variables
          margins, dydx(*) post predict(pr)
          *** post predict(pr) instead of atmeans
          how would the margin for log_x be interpreted? Thanks a lot again!



          • #20
            So, whether or not the predictor variable is the logarithm of something else makes no difference. The process is the same: in each observation, dy/d(log_x) is calculated (the discrete case presumably does not apply here), and the average of these values across all observations is computed and reported. So what does this mean? It means that the rate of change in y per unit change in log_x is, on average across the sample, the reported result. Now, if you wish to translate this into x-effects instead of log_x effects, just note that a unit change in log_x is equivalent to a multiplicative change of x by a factor of e (= 2.71828...). So one could also say that the average (over the sample) rate of change in y per e-fold multiplicative change in x is the reported value.
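A short Python sketch of this translation, reusing the hypothetical 0.0729 figure from earlier in the thread:

```python
import math

# Hypothetical AME on log_x, reusing the 0.0729 figure from earlier posts
ame_logx = 0.0729

# A unit change in log_x multiplies x by e, i.e. roughly a 172% increase
pct_increase = (math.e - 1) * 100        # approximately 171.8

# For small proportional changes the derivative scales linearly:
# a 10% increase in x raises log_x by ln(1.1)
effect_10pct = ame_logx * math.log(1.1)  # approximately 0.00695

print(pct_increase, effect_10pct)
```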



            • #21
              Dear Clyde, thanks for your answer. I was just confused because of the #2 answer from Prof. Wooldridge here https://www.statalist.org/forums/for...ogged-variable. But it's clear to me now.



              • #22
                Originally posted by Clyde Schechter:
                Not sure why #1 never got an answer, at least not publicly.

                Suppose your outcome variable is y, and your predictor variable is x, but, for whatever reason, you choose to use log x as the predictor in the model and run this:

                Code:
                gen log_x = log(x)
                probit y log_x other_variables
                margins, dydx(*) atmeans
                And suppose the margin for log_x is 0.0729.

                This means that a difference of 1 in log x (not 1%, nor 1 percentage point: logarithms are dimensionless) is associated with an increase of 0.0729 in the probability of y = 1. So, if the "baseline" probability is, say, 0.05, an increase of 1 in log x is associated with an expected probability of 0.1229. Note that a difference of 1 in log x, when viewed from the perspective of x itself, means x being multiplied by 2.71828..., which is roughly a 172% increase in x.

                Hi, I have a similar situation here. My coefficient on the log variable in the probit model is .2662541, and when I used the command margins, dydx(log_x) atmeans I got .0661809. Can you please help explain? I am quite confused about the interpretation.



                • #23
                  So, holding all other variables constant at their sample mean values, a unit difference in log_x (which corresponds to x differing by a factor of e = 2.718...) is associated with a difference of 0.066... (6.6 percentage points, if you prefer percentages) in the probability of y being true.
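One way to see how the two numbers in #22 relate: for a continuous regressor, the probit marginal effect at means equals the coefficient times the standard normal density evaluated at the mean linear index. A Python sketch with the numbers from the post:

```python
# Numbers quoted in #22: probit coefficient on log_x and its
# marginal effect at means
b = 0.2662541
me_at_means = 0.0661809

# me_at_means = phi(mean index) * b, so the implied value of the
# standard normal density at the mean linear index is:
implied_phi = me_at_means / b   # approximately 0.2486

print(implied_phi)
```

Since the standard normal density never exceeds about 0.3989, the marginal effect at means is always smaller in absolute value than 0.3989 times the coefficient, which is consistent with 0.066 being about a quarter of 0.266.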



                  • #24
                    Hey,

                    I am running a double-hurdle model (specifically, the -churdle- command) on complex survey data, with a probit model as the first step and a truncated regression as the second step.
                    I then follow by calculating the margins:

                    Code:
                    gen ln_s2bq19i = ln(s2bq19i)
                    replace ln_s2bq19i=0 if ln_s2bq19i==.
                    
                    global ylist ln_s2bq19i
                    
                    global xlist male ln_hhsize father_pri father_sec father_tri father
                    
                    global slist male ln_hhsize ln_percapitaexp father_pri father_sec father_tri father_deg
                    
                    svy, subpop(if (inrange(age,5,9))) : churdle linear $ylist $xlist, select($slist) ll(0)
                    margins, dydx(_all) post
                    In my case, the dependent variable is log-transformed. If the average marginal effect of one variable is 0.149 (s.e. 0.0420), how should that be interpreted? And how would the interpretation differ between dummy variables and continuous variables?
                    Any guidance would be appreciated.



                    • #25
                      Hi, I have a similar issue:

                      I am using the log of income as an independent variable in a probit regression. I am unsure how to interpret the AME coefficient of -0.39208 for the log transformed variable.

                      In OLS, the coefficient on a logged variable is interpreted in terms of a percentage change in x. Does a percentage interpretation still apply to the logged variable in a probit regression, and how exactly can the coefficient be interpreted? Thank you!



                      • #26
                        If you let p(x) denote the estimate of Prob(y=1|x) where x includes log-income, then for a single observation the marginal effect is

                        dp(x)/d(log-income) = (income)*dp(x)/d(income)

                        since d ln(z) = dz/z (a standard calculus result).

                        The AME is the average of this quantity over the sample. One way to interpret this is the average estimated change in Prob(y=1|x) due to a percent change ("proportional change" is another way to say it) in income holding constant all the other x's.
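The identity above can be verified numerically. A Python sketch with hypothetical probit parameters (a, b, and the income value are made up for the check):

```python
import math

def norm_cdf(z):
    # standard normal distribution function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical probit: p = Phi(a + b * log(income)); a, b, income are made up
a, b = -1.5, 0.4
income = 50.0
h = 1e-6

def p_from_log(l):
    # probability as a function of log(income)
    return norm_cdf(a + b * l)

# derivative of p with respect to log(income), by central differences
d_log = (p_from_log(math.log(income) + h) - p_from_log(math.log(income) - h)) / (2 * h)

# income times the derivative of p with respect to income
d_inc = income * (p_from_log(math.log(income + h)) - p_from_log(math.log(income - h))) / (2 * h)

print(d_log, d_inc)  # the two agree, as the chain rule says they must
```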
                        Last edited by John Mullahy; 06 Jan 2021, 09:29.



                        • #27
                          To follow up on John's post, you can always take the AME for a continuous variable and multiply it by a "small" change in the variable. It shouldn't be too large because then the calculus approximation does not work well. In the case of log(x), I like to increase it by 0.10, as that corresponds to about a 10% increase in x. When x = income, that means a 10% increase in income. So, with AME = -0.392, the change in the probability is -0.0392. So, on average, a 10% increase in income reduces the probability that y = 1 by about .039. Since I don't know what y is in Lucy's case I can't comment on whether it seems sensible.
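The rule of thumb above in a few lines of Python, using the AME quoted in the thread; the exact version with ln(1.1) is shown for comparison:

```python
import math

ame = -0.392   # AME on log(income) quoted in the thread

# Rule of thumb: a 0.10 increase in log(income) is roughly a 10% increase
# in income, so multiply the AME by 0.10
rule_of_thumb = ame * 0.10          # -0.0392

# Slightly more exact: a 10% increase raises log(income) by ln(1.1)
exact = ame * math.log(1.1)         # approximately -0.0374

print(rule_of_thumb, exact)
```

The gap between the two reflects ln(1.1) being about 0.0953 rather than 0.10; for changes this small, either version supports the same substantive conclusion.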



                          • #28
                            Thanks so much for the help, that makes much more sense. Jeff, my dependent variable is a total difficulties score for socio-emotional behaviour in adolescents (from the Strengths and Difficulties Questionnaire), so I think what you have recommended could be applied in this case?



                            • #29
                              Lucy: Is that actually a binary outcome? Or is it a fraction (or proportion)?



                              • #30
                                Hi Jeff, yes. I dichotomized the continuous variable at the 75th percentile, so the binary outcome is 1 for the 75th percentile and above, and 0 for the rest.

