
  • Why does my residual plot look like this?

    Dear Statalist,

    Strangely, my residual plot looks truncated, and I can think of no reason to explain it.
    Can someone give me a hint?

  • #2
    Vito:
    It may be that some variables in your regression have values that hardly change (or are constant).
    I would check my dataset.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Mr. Lazzaro, I think I know the reason! About 30-40% of my y values are zero. When I plot the residuals against y_hat, I am plotting (y - y_hat) against y_hat. For the observations with y = 0, I am actually plotting -y_hat against y_hat!
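
The argument above can be checked with a small simulation. This is only a sketch in plain Python, not Stata, and the data are invented for illustration: a non-negative, count-like outcome with a sizeable share of zeros, fitted with a one-regressor least-squares line. The helper `ols_fit` and all variable names are hypothetical.

```python
import random

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

random.seed(1)
# Invented data (an assumption, not the poster's dataset):
# the outcome is a rounded latent value, set to 0 whenever it would be negative.
x = [random.uniform(0, 10) for _ in range(200)]
y = [max(0, round(xi - 3 + random.gauss(0, 2))) for xi in x]

intercept, slope = ols_fit(x, y)
y_hat = [intercept + slope * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, y_hat)]

# For observations with y == 0 the residual is exactly -y_hat,
# so these points fall on a line with slope -1 in the residual plot.
zero_points = [(fi, ri) for yi, fi, ri in zip(y, y_hat, resid) if yi == 0]
max_gap = max(abs(ri + fi) for fi, ri in zero_points)
print(len(zero_points), "zero outcomes; largest |resid + y_hat| =", max_gap)

# Because y is never negative, no residual can fall below -y_hat.
below_line = sum(1 for fi, ri in zip(y_hat, resid) if ri < -fi - 1e-12)
print("points below the line resid = -y_hat:", below_line)
```

Since y cannot be negative, the line resid = -y_hat is also a lower boundary for every point, which is one way a residual-versus-fitted plot can end up looking cut off along a diagonal.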



      • #4
        I'm not sure that explanation is correct, or maybe I'm misunderstanding. Just because Y = 0 doesn't mean there won't be residual error around the fitted values for those observations. You could test your theory by adding 100 to Y and seeing whether that changes anything. Also look carefully at every pairwise relationship between your dependent and independent variables using the "graph matrix" command, and make sure nothing looks censored or truncated.

        Another thing: if 30-40% of your Y values are zero, with no negative values and a meaningful zero (e.g., never arrested), you should maybe consider a model designed for that kind of outcome (tobit, Heckman selection). I'm not sure whether your data are truncated or censored, but if they are, OLS will be inappropriate.



        • #5
          Originally posted by Matt Manierre (#4)
          Hi Matt, thank you for answering my question. Since I use OLS to estimate a linear model, it is true that there will be residuals when Y = 0, but in that case the residual equals -Y_hat. Plotting -x against x gives a straight line with a slope of -1. I think my explanation is right; can you tell me more about your idea? If I add 100 to every Y_i, nothing changes.
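
Why adding 100 changes nothing can be seen in a few lines. The following is a sketch in plain Python (not Stata) on made-up numbers; `ols_fit`, `fitted_and_residuals`, and the data are hypothetical. Shifting Y by a constant moves only the intercept, so every fitted value rises by 100 and every residual stays the same.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fitted_and_residuals(x, y):
    """Fit the line and return (fitted values, residuals)."""
    intercept, slope = ols_fit(x, y)
    fitted = [intercept + slope * xi for xi in x]
    return fitted, [yi - fi for yi, fi in zip(y, fitted)]

# Made-up illustrative data, not the poster's dataset
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 3, 4, 6]
y_shifted = [yi + 100 for yi in y]

fit0, res0 = fitted_and_residuals(x, y)
fit1, res1 = fitted_and_residuals(x, y_shifted)

# Residuals are unchanged; fitted values all move up by 100.
max_resid_change = max(abs(a - b) for a, b in zip(res0, res1))
max_fit_offset_error = max(abs((b - a) - 100) for a, b in zip(fit0, fit1))
print(max_resid_change, max_fit_offset_error)

# The former zeros (now Y = 100) sit on resid = 100 - fitted: still slope -1.
line_gap = max(abs(r - (100 - f))
               for yi, f, r in zip(y, fit1, res1) if yi == 0)
print(line_gap)
```

The residual plot therefore keeps its shape and is only moved 100 units along the horizontal axis, so this check cannot tell the two explanations apart.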

          My Y variable is an event count, so a Poisson model would be preferable.



          • #6
            If your variable is a count, then you should definitely go with the Poisson model, and the plot you're showing seems to be ample evidence to motivate that.

