  • How to solve missing values when I generate a log variable?

    Y = SET, yhat = linear prediction.
    In the graph, yhat is missing for some observations because I used logs in the regression and some of my data are negative.
    How can I solve this?
    Thanks in advance.

  • #2
    Hi Peter.
    Two pieces of advice. First, you can't really solve the missing-value problem when logging a variable if the variable is negative. Even if it is zero, you can't properly do so: as the original value approaches zero from above, the log goes to negative infinity. So some suggest adding a very small number in place of zero.
    You can try adding a large enough constant to your original variable, but that creates issues of its own in the lower tail of the distribution. As I have seen in some cases, you can instead use a different transformation that handles such values (the cube root, for example).
    Second, unless there is a very particular reason to take logs of growth, I wouldn't do it. My personal rule is to take logs of highly dispersed data, and of data that are in "levels". But this variable is growth, so it is not in levels; it is a percentage change.
    Hope this helps.
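
A minimal sketch of the three options mentioned so far (plain log, log after adding a constant, cube root), written in Python with made-up numbers rather than the poster's data. The helper names `safe_log` and `signed_cube_root` and the shift of 9 are assumptions for illustration only.

```python
import math

def safe_log(x):
    """Return ln(x), or None where the log is undefined (x <= 0)."""
    return math.log(x) if x > 0 else None

def signed_cube_root(x):
    """Cube root that keeps the sign, so it is defined for every real x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

# Hypothetical values, not the poster's data.
values = [-8.0, -1.0, 0.0, 1.0, 8.0]

# Plain log: negative and zero values become missing (None here).
logged = [safe_log(v) for v in values]

# Log after adding a constant: defined for all values here, but the
# results change with the arbitrary choice of the shift.
shift = 9.0
shifted = [math.log(v + shift) for v in values]

# Signed cube root: no missing values and no arbitrary constant.
rooted = [signed_cube_root(v) for v in values]

print(logged)
print(shifted)
print(rooted)
```

The cube root keeps every observation and preserves order and sign, but coefficients on a cube-root scale are harder to interpret than coefficients on a log scale.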
    Fernando


    • #3
      A generalised linear model with logarithmic link might help as the assumption there is only that the mean response remains positive, not that all response values are positive.
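
A minimal sketch of that distinction, using the simplest possible case: an intercept-only model with log link fitted by least squares, where the fitted mean is just the sample mean. The response values are hypothetical, and this is not a general GLM fitting routine.

```python
import math

# Hypothetical responses: one is negative, but the mean is positive.
y = [-2.0, 1.0, 3.0, 6.0]

# Logging the response first loses every non-positive observation.
n_lost = sum(1 for v in y if v <= 0)

# Intercept-only model with log link: E[y] = exp(b0).
# Least squares over exp(b0) > 0 gives exp(b0) = mean(y) whenever
# mean(y) > 0, so only the mean has to be positive.
mean_y = sum(y) / len(y)
b0 = math.log(mean_y)
fitted_mean = math.exp(b0)

print(n_lost, b0, fitted_mean)
```

With predictors in the model the fit needs iterative estimation, but the same point holds: the log is applied to the mean, not to each observed value.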

      On the whole, however, looking at log of GDP growth rate seems to be quite wrong in principle. The implication that the difference between 0.001% and 0.01% is as big a deal as that between 1% and 10% (or even between 0.1% and 1%) seems inappropriate economically as well as statistically, and I don't even write as an economist.... As your observed values span zero, that's in effect what you are implying.
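
The arithmetic behind that remark can be checked directly. In this small sketch (the function name `log_gap` is just for illustration), each tenfold step is the same distance on the log scale, however different the steps are in percentage points.

```python
import math

def log_gap(a, b):
    """Distance between a and b on the natural log scale."""
    return math.log(b) - math.log(a)

# Growth rates in percent, as in the comparison above.
pairs = [(0.001, 0.01), (0.1, 1.0), (1.0, 10.0)]

gaps = [log_gap(a, b) for a, b in pairs]   # each equals ln(10)
raw_gaps = [b - a for a, b in pairs]       # 0.009, 0.9 and 9 points

print(gaps)
print(raw_gaps)
```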


      • #4
        Originally posted by FernandoRios (post #2 above)

        My model is SET [Thailand stock index] = b1 + b2 Profit [earnings of the market] + b3 GDPgrowth [%] + b4 Netbuysell [net value of foreign trading in Thailand]
        + b5 WTI [West Texas Intermediate price]
        In the picture, I think Profit and Netbuysell have very large values compared with SET, so I took logs of Profit and Netbuysell. I'm sorry, I didn't log GDPgrowth; I only gave it as an example.

        Then I ran the regression. R-squared is around 0.9, but some values are missing,
        so I'm not sure about it.
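
One way to see where the missing values come from is to check each variable before logging it. This is a rough sketch with invented numbers; the helper `log_check` is hypothetical, and it assumes that a net trading figure goes negative whenever sales exceed purchases.

```python
def log_check(name, values):
    """Summarise whether a variable can be logged without losing data."""
    nonpositive = sum(1 for v in values if v <= 0)
    return {
        "variable": name,
        "n": len(values),
        "min": min(values),
        "nonpositive": nonpositive,
    }

# Invented data, not the poster's: profit is always positive,
# net buy/sell is sometimes zero or negative.
data = {
    "profit": [120.0, 95.0, 143.0, 110.0],
    "netbuysell": [350.0, -210.0, 0.0, 75.0],
}

report = [log_check(name, values) for name, values in data.items()]

# A regression on both logged predictors keeps only the rows where
# every logged variable is defined.
usable = sum(
    1 for p, n in zip(data["profit"], data["netbuysell"]) if p > 0 and n > 0
)

for row in report:
    print(row)
print("rows usable after logging:", usable)
```

A high R-squared computed on the remaining rows says nothing about the observations that were dropped.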


        • #5
          The main argument is the same with logging predictors. If the value could ever be zero or negative, the logarithm as a transformation is somewhere between dubious and absurd. Even if all the values happen to be positive, there is still a risk of over-transformation.


          • #6
            Thank you very much
