  • Predict residuals and calculate RMSE in Poisson

    Hi all,

    I have been trying to calculate the RMSE after poisson with several methods. Even with the glm version, the res option didn't give me the right result, so I think I'm missing something essential here.

    1) I now know that poisson doesn't support "predict resid, response"; it returns "Option residuals not allowed".
    2) After reading some other comments on the Internet, I then tried:

    Code:
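    glm flow $xlist dindep* dyear*, fam(poisson) link(log) vce (cluster $id)
    predict resid, res
    gen resid2 = resid^2
    * the mean of the squared residuals (1297812) is what enters the RMSE
    sum resid2
    * scale by N/(N - k) with N = 945873 and N - k = 945741
    di sqrt(1297812*(945873/945741))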
    1139.295
    which gives the same result as:

    Code:
    poisson flow $xlist dindep* dyear*, vce (cluster $id)
    predict predicted
    gen residual = flow - predicted
    gen residual2=residual^2
    sum residual2
    (or alternatively predict resid, score)
    The value it gives me is over 1100, whereas OLS (reg lnflow $xlist dindep* dyear*, vce (cluster $id)) reports an RMSE of 1.888. In the output I'm reproducing, this should be around 1.9, too.

    I read somewhere that glm is meant for non-panel data, but I'm using unbalanced panel data to analyze trade flows. Could this matter?
    Any thoughts on this?

    Please tell me if you need more information.
    Thank you very much!
    Karin

  • #2
    Your regression using lnflow is presumably for ln flow as response and so is necessarily associated with an RMSE on the natural logarithmic scale.

    In contrast, calculations after any flavour of Poisson regression should return an RMSE on the raw or original scale. Its use of a logarithmic link is consistent with that: the link applies to the mean, and predictions are still returned on the original scale.

    So, very different results are entirely to be expected.

    We can't check anything with your dataset as we don't have access to it.
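
    As an illustration only, the difference in scale can be seen with the auto dataset that ships with Stata. This is a minimal sketch under assumptions of my own: price stands in for flow, and mpg and weight stand in for your predictors. Note also that regress reports e(rmse) with a degrees-of-freedom correction, whereas the Poisson figure below simply divides by N.

    ```stata
    sysuse auto, clear
    generate double lnprice = ln(price)

    * Log-linear model: the reported RMSE is in log units
    regress lnprice mpg weight
    display "RMSE after regress (log scale): " e(rmse)

    * Poisson model: predictions, and hence residuals, are in units of price
    poisson price mpg weight
    predict double phat, n
    generate double sqres = (price - phat)^2
    summarize sqres, meanonly
    display "RMSE after poisson (raw scale): " sqrt(r(mean))
    ```

    The two numbers are not expected to be close, because one is measured in log units and the other in the units of the response.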

    • #3
      Thank you, Nick, for your reply. Do I understand this right: you're saying that we can't compare those RMSE results because in OLS I first take the log of flow and compare it with lnflowhat, whereas with poisson I use data that weren't transformed in the first place? Do you see any way to make the Poisson RMSE comparable to the OLS one?

      At first I thought of running "gen lnyhat = log(yhat)" after the poisson regression and comparing on that scale, but obviously this changes the data massively and makes no sense.

      Sorry if this seems obvious but I just wanted to clarify.

      • #4
        I would tend to exponentiate predictions from the model with a log transformed response and compare that way. But what you describe does not change the data and it does make sense. The only question is how useful it is for your purposes. RMSE is definable in general but just needs some care. It is entirely possible that it is dominated by the largest residuals, so always keep an eye on the entire distribution of residuals too.
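
        A rough sketch of both comparisons, again on the auto dataset rather than your data (price, mpg and weight are stand-ins I chose for illustration). It assumes a plain exp() back-transformation of the log-scale predictions, with no retransformation correction, so treat the raw-scale figure for the log-linear model as approximate.

        ```stata
        sysuse auto, clear
        generate double lnprice = ln(price)

        * Log-linear model: predict on the log scale, then exponentiate
        regress lnprice mpg weight
        predict double lnhat_ols, xb
        generate double hat_ols = exp(lnhat_ols)

        * Poisson model: predictions are already on the raw scale
        poisson price mpg weight
        predict double hat_pois, n

        * RMSE on the raw scale for both models
        generate double res_ols  = price - hat_ols
        generate double res_pois = price - hat_pois
        foreach m in ols pois {
            generate double sq_`m' = res_`m'^2
            summarize sq_`m', meanonly
            display "RMSE, raw scale, `m': " sqrt(r(mean))
        }

        * RMSE on the log scale for the Poisson predictions
        generate double sqlog_pois = (lnprice - ln(hat_pois))^2
        summarize sqlog_pois, meanonly
        display "RMSE, log scale, pois: " sqrt(r(mean))

        * Look at the whole distribution of residuals, not just the RMSE
        summarize res_ols res_pois, detail
        ```

        The final summarize, detail is there because a few very large residuals can dominate either RMSE.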
