Good evening,
For my thesis I would like to examine whether there is a relationship between corporate environmental performance and corporate financial performance in European countries.
To do so, I have gathered panel data on 516 firms (id) over a period of 6 years across European countries (microeconomic panel data).
- I use the environmental pillar score as a proxy for environmental performance.
- Following other studies, I include several control variables (firm size, CurrentRatio, Cashflow, R&D expenses, market share, capital…). In total 8 control variables are used.
- I would also like to measure whether there are differences between European regions, so I divided the countries into 4 regions (North, East, South and West) and added this to my data as a categorical variable, entered as i.Region.
- The dependent variable is financial performance, measured by two different variables (TobinsQ and ROA).
Next I checked all variables for outliers. To deal with these outliers, I winsorized the variables at the 1st and 99th percentiles.
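For reference, the winsorizing step can be sketched roughly as follows. This is only an illustration: EnvScore and Size are placeholder names, not my actual variable names, and winsor2 is a user-written command that has to be installed first.

```stata
* Sketch only: EnvScore and Size are placeholder variable names.
* winsor2 is user-written; install it once with: ssc install winsor2
local vars TobinsQ ROA EnvScore Size CurrentRatio Cashflow

* Set values below the 1st percentile and above the 99th percentile
* to those percentile values (computed over the pooled sample)
winsor2 `vars', cuts(1 99) replace

* Inspect the new minimum and maximum of each variable
summarize `vars'
```

Note that replace overwrites the original variables; using suffix() instead would keep the raw values alongside the winsorized ones.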
- I entered the data in long form in Stata.
- I declared the panel structure with xtset id Year.
- I checked xtsum (there is more between variation than within variation; however, I want to examine whether companies that improve their environmental performance can increase their financial performance, so the within variation matters more to me).
- I checked for multicollinearity and also examined the VIFs; both indicated no multicollinearity.
- The Breusch-Pagan Lagrange multiplier test rejected the null hypothesis of no firm-level (panel) effects. Does this mean pooled OLS is no longer suitable?
- The modified Wald test indicated that the residuals are heteroskedastic.
- Next, I used the Pesaran test to check for cross-sectional dependence (contemporaneous correlation). I had to reject the null hypothesis, indicating that there is cross-sectional dependence.
- With the Wooldridge test I checked for serial correlation (first-order autocorrelation). I had to reject the null hypothesis, which indicates that serial correlation is present.
- Lastly, the Hausman test indicated that the FE model should be chosen. However, since I would like to examine the effect of European regions, FE doesn't seem suitable (the regions don't change within firms, so Region would be omitted from an FE regression).
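The checks above correspond roughly to the command sequence below. It is a sketch rather than my exact do-file: EnvScore and Size are placeholder names, only a few of the controls are shown, and xttest3, xtcsd and xtserial are user-written commands that must be installed separately.

```stata
* Sketch only: EnvScore and Size are placeholder variable names.
* xttest3, xtcsd and xtserial are user-written (locate them with: findit xtserial, etc.)
local x EnvScore Size CurrentRatio Cashflow

* Declare the panel and compare between and within variation
xtset id Year
xtsum TobinsQ `x'

* Collinearity check: variance inflation factors after pooled OLS
regress TobinsQ `x' i.Region
estat vif

* Random effects, then Breusch-Pagan LM test (H0: variance of firm effects is zero)
xtreg TobinsQ `x' i.Region, re
xttest0
estimates store re

* Fixed effects; Region is constant within firm, so it cannot be included
xtreg TobinsQ `x', fe
estimates store fe

* Modified Wald test for groupwise heteroskedasticity (after xtreg, fe)
xttest3

* Pesaran test for cross-sectional dependence (after xtreg, fe)
xtcsd, pesaran abs

* Wooldridge test for first-order serial correlation
xtserial TobinsQ `x'

* Hausman test, FE versus RE (compares the coefficients common to both models)
hausman fe re
```

The same sequence would be repeated with ROA as the dependent variable.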
In conclusion:
- No multicollinearity is present.
- The Breusch-Pagan LM test indicated panel (firm-level) effects.
- Heteroskedasticity is present in the residuals.
- The Pesaran test indicated cross-sectional dependence.
- The Wooldridge test indicated that serial correlation is present.
>> Therefore OLS, FE and RE don't seem suitable for me, since they would give biased estimates (could someone confirm this?)
My problem at the moment is deciding how to proceed from here.
I've read many Stata discussions but haven't found a clear answer on what I should do.
Since N > T (516 > 6), both FGLS (xtgls) and PCSE (xtpcse) don't seem suitable for me (could someone confirm this?)
Another, closely related study suggested adjusting the standard errors for clustering by both firm (id) and year (often referred to as Petersen's approach). Could this method be a possible solution? I would then use either reghdfe or ivreg2 (could someone confirm this?)
reghdfe Y (dependent variable) X1 (independent variable) X2…X19 (control variables) i.Region, noabsorb cluster(id Year)
ivreg2 Y (dependent variable) X1 (independent variable) X2…X19 (control variables) i.Region, cluster(id Year)
I hope someone can help me,
Kind regards!