  • Test for assumptions of OLS and FE

    Hi,

    I am running a difference-in-differences regression comparing the top ESG companies with the bottom ESG companies during COVID-19, and I have some questions regarding the assumptions and how to test them. Please see the following.



    I ran the following two regressions (the first without fixed effects, which is xtreg's default random-effects estimator, and the second with fixed effects):

    xtreg RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1, vce(cluster CompanyNo)

    xtreg RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG!= 1, fe vce(cluster CompanyNo)

    I get the following output for the first model:



    For the second model (FE) I get:




    To test the OLS assumptions, I have done the following:


    Linearity, Random Sample & Zero Conditional Mean

    I run the following in Stata to test for linearity and zero conditional mean:

    reg RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1
    Why can I only do this with the reg command and not xtreg? And is that fine?

    predict pred if e(sample), xb
    predict resid if e(sample), resid
    scatter resid pred
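
    On the reg-versus-xtreg question, one likely reason is that predict after xtreg does not accept the resid option; its residual options are e (the idiosyncratic error) and ue (the combined residual). A minimal sketch of the same plot after the fixed-effects model, assuming the data are already xtset and that the variable names xb_fe and e_fe (hypothetical, chosen here for illustration) are not yet in use:

    ```stata
    * Fixed-effects model, same specification as the second xtreg above
    xtreg RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1, fe vce(cluster CompanyNo)

    * Linear prediction and idiosyncratic residual, estimation sample only
    predict double xb_fe if e(sample), xb
    predict double e_fe if e(sample), e

    * Residual-versus-fitted plot for the FE model, with a reference line at zero
    scatter e_fe xb_fe, yline(0)
    ```

    Restricting to e(sample) keeps the companies excluded by the if condition out of the plot.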





    As the figure above does not look like the examples I have seen online, how should I assess linearity and the zero conditional mean from it?

    For the zero conditional mean, could we simply argue that, according to XX, there should be no omitted variables because we have included the relevant variables recognized in the literature?

    Regarding the random sample: as we look at the S&P 500 and use all of it, I would argue that this fulfills the random sample assumption. Is that right?

    Multicollinearity
    We test this using the correlation matrix and the VIF values.
    corr RawReturn ESG_score E_score S_score G_score LN_assets Leverage Liquidity MBV ROA
    vif
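
    Note that the VIF is a postestimation statistic for regress, so it has to follow the regression rather than corr. A minimal sketch, assuming the same specification as the reg command used earlier in this post:

    ```stata
    * Re-fit the pooled OLS model quietly so the estimates are in memory
    quietly regress RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1

    * Variance inflation factors for the regressors of that model
    estat vif

    * Pairwise correlations of the continuous controls, estimation sample only
    corr LN_assets Leverage Liquidity MBV ROA if e(sample)
    ```

    Interaction terms and the variables they are built from tend to show high VIFs mechanically, so those values should probably be read differently from the VIFs of the controls.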

    (VIF model could not be uploaded due to the maximum attachments)
    (correlation matrix could not be uploaded due to the maximum attachments)

    As seen above, no correlation exceeds 0.7 (the literature argues that some correlation is expected, but below 0.7 is fine) and every VIF is below 10 (also a threshold argued in the literature). Hence, we see no problematic multicollinearity.

    Heteroskedasticity
    From the plot above we can see heteroskedasticity from the horizontal lines, correct? However, this can also be tested using the following in Stata:

    reg RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1
    estat hettest

    (Breusch-Pagan test could not be uploaded due to the maximum attachments)
    But the output was
    Ho: Constant variance
    Variables: fitted values of RawReturn
    chi2 = 2289.25
    Prob > chi2 = 0.0000

    I.e., the null hypothesis of constant variance is rejected, so there is heteroskedasticity, and we apply robust standard errors.
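
    As a sketch of what applying robust standard errors could look like for the pooled model, assuming the same specification as above: vce(robust) allows for heteroskedasticity only, while vce(cluster CompanyNo), as already used in the xtreg calls, also allows for correlation of the errors within a company.

    ```stata
    * Heteroskedasticity-robust standard errors
    regress RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1, vce(robust)

    * Standard errors clustered by company (robust to heteroskedasticity and within-company correlation)
    regress RawReturn Top20_ESG Crash Recovery 1.Top20_ESG#1.Crash 1.Top20_ESG#1.Recovery i.GICSectors LN_assets Leverage Liquidity MBV ROA if Not20_ESG != 1, vce(cluster CompanyNo)
    ```

    The coefficients are identical in both calls; only the standard errors change.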



    So the above covers the assumptions for OLS, and thereby for the random-effects model. As we also run the fixed-effects model, how do these assumptions differ? As we understand it, heteroskedasticity would not need to be tested separately, but otherwise the assumptions should be the same.



    We run this regression model for the top ESG companies (as above), but we also run it for a single component of the score, the E score (environmental score). That is, we use Top_E instead, which puts different companies in the top group (same dataset, though). In addition, we run it for abnormal returns instead of raw returns.
    In theory, we should run these tests for each model. Correct?



    Thank you so much in advance!!! It is really appreciated.

    Best,
    Guest
    Last edited by sladmin; 10 Jun 2021, 14:50. Reason: anonymize original poster

  • #2
    Dear Guest,
    I have exactly the same problem. I was wondering what is important to test for fixed effects. Should you run the regression first, do the diagnostic tests, and then check whether to use fixed or random effects? Or should you first decide between fixed and random effects and then do the diagnostic tests? I know that xtsktest is for heteroskedasticity in panel data, but I think it is only for random effects?
    Let me know if you know anything further, or maybe someone else can help us here.
    Last edited by sladmin; 10 Jun 2021, 14:50. Reason: anonymize original poster
