How to solve missing value when I gen log variable?

Peter Mirror

Join Date: May 2015

Posts: 5
#1

08 May 2015, 07:39

Y = SET; yhat = linear prediction.
In the graph, yhat is missing for some observations because I used logged variables in the regression, and some of my data are negative.
How can I solve this?
Thanks in advance.
FernandoRios

Join Date: Apr 2014

Posts: 2469
#2

08 May 2015, 07:50

Hi Peter.
Two pieces of advice. First, you can't really solve the missing value problem when logging a variable if the variable has negative values. Even if it's zero, you can't properly do so: as the original value goes to zero, the log goes to minus infinity. So some suggest just adding a very small number in place of zero.
You can try adding a big enough constant to your original variable, but that also creates issues for the lower tail of the distribution. As I have seen in some cases, you can use a different transformation to deal with that (the cube root, for example).
Second, unless there is a very particular reason why you want to take logs of growth, I wouldn't do it. My personal rule is to take logs of highly dispersed data, and of data that are in "levels". In this case it's growth, so it's not in levels but in percentage change.
Hope this helps.
Fernando
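
To illustrate the first point above, here is a minimal sketch in Python (chosen only for illustration; the thread itself is about Stata). It contrasts the logarithm, which is undefined at zero and below, with a sign-preserving cube root, which is defined everywhere. The data values and the helper names `safe_log` and `cube_root` are hypothetical.

```python
import math

def safe_log(x):
    """Return ln(x), or None where the log is undefined (x <= 0)."""
    return math.log(x) if x > 0 else None

def cube_root(x):
    """Sign-preserving cube root, defined for every real x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

# Hypothetical values spanning zero, e.g. a net flow or a growth rate.
values = [-8.0, -1.0, 0.0, 1.0, 8.0]

logs = [safe_log(v) for v in values]
roots = [cube_root(v) for v in values]

# Count how many observations the log transformation cannot handle.
n_lost = sum(1 for v in logs if v is None)
print("observations lost to the log:", n_lost)
print("cube roots:", [round(r, 6) for r in roots])
```

The cube root keeps every observation and the ordering of the data, but its coefficients no longer have the percentage-change interpretation that makes logs attractive, so it is a different model rather than a drop-in fix.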
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

08 May 2015, 08:00

A generalised linear model with logarithmic link might help as the assumption there is only that the mean response remains positive, not that all response values are positive.
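
A small sketch of why the log link is less demanding than logging the response, written in Python for illustration only. It assumes the simplest possible case, a Gaussian model with a log link and one binary predictor, where the least-squares fit reproduces the two group means, so the coefficients can be read off directly. The data and variable names are made up. In Stata, the corresponding model would be fitted with `glm` and its `link(log)` option.

```python
import math

# Hypothetical responses in two groups. Some individual values are negative,
# but each group mean is positive.
y_by_group = {
    0: [-1.0, 2.0, 3.0, 4.0],    # mean 2.0
    1: [-2.0, 5.0, 9.0, 20.0],   # mean 8.0
}

def mean(xs):
    return sum(xs) / len(xs)

# Log link: log(E[y | group]) = b0 + b1 * group.
# With a single binary predictor the model has one free mean per group,
# so the least-squares fit equals the group means (both positive here).
b0 = math.log(mean(y_by_group[0]))
b1 = math.log(mean(y_by_group[1])) - b0

fitted = {g: math.exp(b0 + b1 * g) for g in y_by_group}

# Logging the response itself would instead lose every non-positive value.
n_unloggable = sum(1 for ys in y_by_group.values() for y in ys if y <= 0)

print("b0, b1:", round(b0, 4), round(b1, 4))
print("fitted means:", {g: round(m, 4) for g, m in fitted.items()})
print("values that could not be logged:", n_unloggable)
```

Only the means enter the logarithm, so the two negative responses stay in the estimation sample.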

On the whole, however, looking at log of GDP growth rate seems to be quite wrong in principle. The implication that the difference between 0.001% and 0.01% is as big a deal as that between 1% and 10% (or even between 0.1% and 1%) seems inappropriate economically as well as statistically, and I don't even write as an economist.... As your observed values span zero, that's in effect what you are implying.
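
The arithmetic behind that remark can be checked directly. The sketch below (Python, for illustration only; the helper name `log_gap` is made up) computes the gap on the log scale for the three pairs of growth rates mentioned, alongside the gap in percentage points.

```python
import math

def log_gap(a, b):
    """Difference between b and a on the natural log scale."""
    return math.log(b) - math.log(a)

# Pairs of growth rates in percent, taken from the comparison above.
pairs = [(0.001, 0.01), (0.1, 1.0), (1.0, 10.0)]

gaps = [log_gap(a, b) for a, b in pairs]
raw = [b - a for a, b in pairs]

for (a, b), g, r in zip(pairs, gaps, raw):
    print(f"{a}% -> {b}%: log gap {g:.4f}, raw gap {r:.3f} points")
```

Every pair differs by a factor of 10, so each log gap equals ln(10), about 2.3026, even though the raw gaps range from 0.009 to 9 percentage points.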
Peter Mirror

Join Date: May 2015

Posts: 5
#4

08 May 2015, 09:30

Originally posted by FernandoRios (#2):


My model is SET [Thailand stock index] = b1 + b2 Profit [earnings of the market] + b3 GDPgrowth [%] + b4 Netbuysell [net value of foreign trading in Thailand] + b5 WTI [West Texas Intermediate price]
In the picture, I think Profit and Netbuysell are very large compared with SET, so I took logs of Profit and Netbuysell. I'm sorry, I didn't use GDPgrowth; that was just an example.

Then I ran the regression. R-squared is around 0.9, but some values are missing,
so I'm not sure about it.
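
A quick way to see where the missing values come from is to count the non-positive entries in each variable before logging it. The sketch below is in Python for illustration only, with made-up numbers standing in for Netbuysell. The helper `ln_or_missing` is hypothetical and mimics Stata's `ln()`, which returns missing for zero or negative arguments.

```python
import math

# Hypothetical net foreign buy/sell values: net selling shows up as negatives.
netbuysell = [1500.0, -320.0, 0.0, 870.0, -45.0, 2100.0]

def ln_or_missing(x):
    """Mimic a log transform that yields a missing value (None) for x <= 0."""
    return math.log(x) if x > 0 else None

log_net = [ln_or_missing(v) for v in netbuysell]

# Rows with a missing predictor drop out of the regression, and no linear
# prediction can be computed for them either.
usable = [i for i, v in enumerate(log_net) if v is not None]
dropped = [i for i, v in enumerate(log_net) if v is None]

print("rows kept:   ", usable)
print("rows dropped:", dropped)
```

If the dropped rows are exactly the periods of net selling, the fitted model describes only part of the sample, however high the R-squared.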

Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

08 May 2015, 09:35

The main argument is the same with logging predictors. If the value could ever be zero or negative, the logarithm as a transformation is somewhere between dubious and absurd. Even if all the values happen to be positive, there is still a risk of over-transformation.
Peter Mirror

Join Date: May 2015

Posts: 5
#6

08 May 2015, 11:27

Thank you very much
