  • Problems with normality of residuals and skewness, panel data, fixed effects (Can I trust the tests?)

    Hello!
    I'm attempting to make some predictions about the relationship between life expectancy and (mainly) access to improved water sources and sanitation facilities.
    I am using panel data: the dataset consists of 69 cross-sectional units (countries) and 8 time periods (years, 2007-2014) on lifeexpectancy, wateraccess, sanitationaccess, totalhealth and ruralpop, for a total of 552 pooled observations. The countries are lower-middle- and low-income countries.
    • Descriptive statistics:
    [Image: xtsum.PNG]

    [Image: sktest res xtreg.png]
    • Predicting the overall residual after -xtreg- (-predict res, e-)
    • -kdensity res-
    • -sktest res-
    • -qnorm res-
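
    The steps above can be sketched roughly as follows. This is only an outline under assumptions: the panel identifiers country and year and the -fe- option are not shown in the post, so they are placeholders here.

    ```stata
    * Assumed panel identifiers: country (numeric id) and year
    xtset country year

    * Fixed-effects model with standard errors clustered on country
    xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, ///
        fe vce(cluster country)

    * e_it, the overall error component
    predict res, e

    * Visual and formal checks of normality
    kdensity res, normal   // overlays a normal density for comparison
    sktest res             // skewness and kurtosis test for normality
    qnorm res              // quantile-normal plot
    ```
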
    [Image: kdensitysktestresqnorm.png]

    I've noticed that the variation in the residuals seems to decrease as the independent variables increase, for example:
    • -reg $ylist $xlist-
    • -rvfplot, mlabel($id)-
    [Image: fittedvalues.PNG]

    And so I thought perhaps transforming variables would be the right way to go, but for example:
    • -ladder wateraccess-
    [Image: ladder.PNG]

    And it's the same for -ladder sanitationaccess- etc.
    So basically what I'm wondering is if there is anything I can do, or should I just disregard the results of the -xtreg- as they're seemingly untrustworthy?
    I'm clustering the standard errors on the country level, if that matters.

    If you made it this far, thanks for your attention; any and all help is greatly appreciated. If there is anything I can provide that would make things clearer, let me know.

  • #2
    Oyvind:
    -vce(robust)- (which works the same as -vce(cluster clusterid)- under -xtreg-) takes both heteroskedasticity and autocorrelation into account.
    Hence, if -fe- is the right specification for your panel dataset, I would trust its outcome.
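
    A minimal sketch of that equivalence, assuming the panel has been declared with -xtset country year- (the names country and year are placeholders, as they are not given in the thread). Under that assumption, both calls should report the same standard errors:

    ```stata
    * Assumed: xtset country year has already been run

    * Robust VCE; with -xtreg, fe- this clusters on the panel variable
    xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, ///
        fe vce(robust)

    * Explicit clustering on the (assumed) panel variable
    xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, ///
        fe vce(cluster country)
    ```
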
    Kind regards,
    Carlo
    (Stata 18.0 SE)

    • #3
      I first apply the Breusch-Pagan LM test, and there is strong evidence of significant differences across countries, i.e., a highly significant chi-square value (p = .0000), so I reject the null and conclude that simple OLS regression is not appropriate.
      I then use the Hausman test and again reject the null (highly significant chi-square value, p = .0000).
      And that's basically how I conclude that a fixed effects model is the appropriate model.
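
      For reference, that sequence of tests might look like the sketch below. The exact commands used are not shown in this thread, so this is an assumed reconstruction; it also assumes the data are already -xtset-. The models are fitted with the default (non-robust) VCE here because -hausman- does not accept clustered or robust standard errors.

      ```stata
      * Breusch-Pagan LM test: pooled OLS versus random effects
      xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, re
      xttest0

      * Hausman test: fixed effects versus random effects
      xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, fe
      estimates store fe
      xtreg lifeexpectancy wateraccess sanitationaccess totalhealth ruralpop, re
      estimates store re
      hausman fe re
      ```
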

      Are you saying that if the model is correctly specified, the appropriate method is fixed effects, and I use robust standard errors, which as you said take both heteroskedasticity and autocorrelation into account, then I can likely trust the outcome *even though* the residuals are not normally distributed?

      Thank you so much for the response, by the way. I've lost a lot of sleep over this, and I'm having a hard time getting the right answers from my professor, though that's probably due to how I phrase my questions!

      • #4
        Oyvind:
        I meant, as you got it, that:
        - the appropriate method is fixed effects with robustified standard errors;
        - you can trust the outcome of your panel data regression.
        Kind regards,
        Carlo
        (Stata 18.0 SE)

        • #5
          Excellent, thank you very much!
