Hello, I am working on my undergraduate dissertation, which looks at the effects of ESG on financial performance for US energy firms using pooled OLS, fixed effects, and Arellano-Bond estimators.
Currently, I am struggling to decide how many years to cover. I had originally settled on a period of 5 years after discussing it with my supervisor, who warned me that unbalanced panel data could bias dynamic panel estimators such as Arellano-Bond. However, I have since read Baltagi, B.H. and Chang, Y.-J. (1994) ‘Incomplete panels’, Journal of Econometrics, 62(2), pp. 67–89, doi:10.1016/0304-4076(94)90017-5, which finds that balancing the data by dropping observations makes the estimators perform worse than using the entire unbalanced data set. Should I therefore use the longer, unbalanced panel instead?
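To show what I mean by using the whole unbalanced panel, here is a minimal sketch of what I would run. It assumes the data are xtset on ID and year (the names that appear in my output), uses the built-in xtabond with one lag of the dependent variable purely as a placeholder choice, and leaves out RD_TR; none of this is a settled specification.

```stata
* Sketch only: ID and year are the panel and time identifiers from my output.
xtset ID year

* Show how unbalanced the panel is (participation patterns by firm).
xtdescribe

* Arellano-Bond difference GMM on the full unbalanced panel.
* lags(1) and the control set are placeholder choices, not final ones.
xtabond TOBIN ENV SOC GOV DE BETA SIZE age, lags(1) vce(robust)

* Arellano-Bond test for serial correlation in the first-differenced errors.
estat abond
```

The idea would be to run this once on the full sample and once on the shorter window, and compare how much the estimates move.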
Also, I am planning to run all my regressions under 2 specifications, one including the control variable "R&D intensity" and one without it, because it has many missing observations (RD_TR in the summary statistics table). It is, however, an important variable in past literature, and I will mention its limited coverage as a limitation of my study. Is this fine?
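Concretely, the two specifications would look roughly like the sketch below. The stored-estimate names (no_rd, with_rd, no_rd_same) are just labels I made up, and the third regression is an extra check I am assuming would be useful: it re-runs the specification without RD_TR on the smaller sample where RD_TR is observed, so that any change in coefficients is not simply due to the sample changing.

```stata
* Specification A: without R&D intensity (larger sample).
xtreg TOBIN ENV SOC GOV DE BETA SIZE age, fe vce(cluster ID)
estimates store no_rd

* Specification B: with R&D intensity (sample shrinks to non-missing RD_TR).
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe vce(cluster ID)
estimates store with_rd

* Specification A again, restricted to B's estimation sample.
xtreg TOBIN ENV SOC GOV DE BETA SIZE age if e(sample), fe vce(cluster ID)
estimates store no_rd_same

* Side-by-side coefficients, standard errors, observations and groups.
estimates table no_rd with_rd no_rd_same, b(%9.4f) se stats(N N_g)
```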
Furthermore, when I test my models for heteroskedasticity with xttest3 in Stata 17.0, the test result is exactly the same after I re-estimate with vce(robust), and I am unsure why. Here I am using data from 2018-2023 and have dropped all firm-year observations without reported CO2 emissions.
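To make the comparison easier to see, the sketch below stores both fits and tabulates them side by side (fe_default and fe_robust are labels I chose). In my output the coefficients are identical and only the standard errors differ. As far as I understand, xttest3 is computed from the residuals of the fixed-effects fit, so if only the standard errors change I would expect the same statistic, but please correct me if that is wrong.

```stata
* Default (conventional) standard errors.
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe
estimates store fe_default

* vce(robust) with xtreg, fe clusters on the panel variable (ID here).
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe vce(robust)
estimates store fe_robust

* Same point estimates, different standard errors.
estimates table fe_default fe_robust, b(%9.6f) se
```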
Code:
tabulate year has_esg
           |        has_esg
      year |         0          1 |     Total
-----------+----------------------+----------
      2004 |        87         18 |       105
      2005 |        90         27 |       117
      2006 |        96         32 |       128
      2007 |       100         33 |       133
      2008 |        99         39 |       138
      2009 |        97         44 |       141
      2010 |        89         61 |       150
      2011 |        98         67 |       165
      2012 |        98         74 |       172
      2013 |       104         77 |       181
      2014 |       108         81 |       189
      2015 |       107         86 |       193
      2016 |       111         94 |       205
      2017 |       101        116 |       217
      2018 |        71        155 |       226
      2019 |        60        170 |       230
      2020 |        51        184 |       235
      2021 |        27        216 |       243
      2022 |        10        236 |       246
      2023 |         0        247 |       247
-----------+----------------------+----------
     Total |     1,604      2,057 |     3,661
Code:
tabulate year has_RD
           |        has_RD
      year |         0          1 |     Total
-----------+----------------------+----------
      2004 |        76         29 |       105
      2005 |        85         32 |       117
      2006 |        94         34 |       128
      2007 |        98         35 |       133
      2008 |       100         38 |       138
      2009 |        96         45 |       141
      2010 |       105         45 |       150
      2011 |       111         54 |       165
      2012 |       118         54 |       172
      2013 |       125         56 |       181
      2014 |       130         59 |       189
      2015 |       132         61 |       193
      2016 |       141         64 |       205
      2017 |       152         65 |       217
      2018 |       156         70 |       226
      2019 |       160         70 |       230
      2020 |       164         71 |       235
      2021 |       166         77 |       243
      2022 |       168         78 |       246
      2023 |       169         78 |       247
-----------+----------------------+----------
     Total |     2,546      1,115 |     3,661
Here are the summary statistics before and after dropping observations based on the 2 criteria mentioned above:
Code:
summarize TOBIN ROA ESGC ENV SOC GOV lnCO2 EMIS Target Prod RU Policy RD_TR DE BETA SIZE age
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
TOBIN | 3,182 .9839965 .0682519 .2757434 2.460593
ROA | 1,844 .0270842 .1612215 -4.7646 .99519
ESGC | 2,057 36.47043 18.90526 .9054831 88.83961
ENV | 2,057 29.20133 26.18804 0 96.92313
SOC | 2,057 36.77819 22.12286 .4434122 94.85254
-------------+---------------------------------------------------------
GOV | 2,057 52.02294 23.23856 .2800454 98.42676
lnCO2 | 996 14.13439 2.451057 1.699279 18.88316
EMIS | 2,057 37.09499 31.56243 0 99.68553
Target | 1,882 20.42819 35.29809 0 95.91837
Prod | 3,661 .5605026 .4963937 0 1
-------------+---------------------------------------------------------
RU | 2,057 31.88244 31.62478 0 99.79839
Policy | 3,661 .7437859 .4366011 0 1
RD_TR | 1,115 .8232445 12.81121 -.00055 339.7368
DE | 3,104 51.49517 29.2946 3 100
BETA | 1,698 1.710485 1.005974 -3.574506 7.031454
-------------+---------------------------------------------------------
SIZE | 3,497 21.17943 2.214364 6.907755 26.63424
age | 3,601 17.14524 18.58871 0 141
. drop if has_lnCO2 == 0
(2,665 observations deleted)
. keep if year >=2018
(336 observations deleted)
. summarize TOBIN ROA ESGC ENV SOC GOV lnCO2 EMIS Target Prod RU Policy RD_TR DE BETA SIZE age
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
TOBIN | 655 .9682978 .0600428 .4543625 1.213569
ROA | 451 .0341375 .0863776 -.5233 .4139
ESGC | 660 49.2028 16.77136 9.576389 88.83961
ENV | 660 46.56343 22.21136 .2285714 96.92313
SOC | 660 49.90482 21.10299 6.186469 94.84564
-------------+---------------------------------------------------------
GOV | 660 59.54962 22.04239 .2800454 96.5379
lnCO2 | 660 13.56501 2.529294 1.699279 18.5429
EMIS | 660 59.99171 24.13653 0 99.0625
Target | 647 39.07833 39.55785 0 93.89313
Prod | 660 .3075758 .4618399 0 1
-------------+---------------------------------------------------------
RU | 660 51.82752 27.55294 0 99.79839
Policy | 660 .8651515 .3418207 0 1
RD_TR | 259 .0323892 .1052471 .0000315 1.066443
DE | 646 53.43344 25.08295 3 100
BETA | 637 1.884157 .9794677 -.8407099 6.850461
-------------+---------------------------------------------------------
SIZE | 660 22.51583 1.594792 17.68055 26.63424
age | 647 20.85781 22.66347 0 141
Code:
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe
Fixed-effects (within) regression Number of obs = 278
Group variable: ID Number of groups = 58
R-squared: Obs per group:
Within = 0.6058 min = 1
Between = 0.4386 avg = 4.8
Overall = 0.5556 max = 8
F(8,212) = 40.73
corr(u_i, Xb) = -0.4431 Prob > F = 0.0000
------------------------------------------------------------------------------
TOBIN | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
ENV | -.000227 .0001361 -1.67 0.097 -.0004954 .0000413
SOC | .0001672 .0001384 1.21 0.228 -.0001057 .0004401
GOV | .0000361 .000101 0.36 0.722 -.0001631 .0002352
RD_TR | -.0295279 .012995 -2.27 0.024 -.0551439 -.0039119
DE | -.0012643 .0000784 -16.12 0.000 -.0014189 -.0011097
BETA | -.0040603 .0016222 -2.50 0.013 -.0072581 -.0008625
SIZE | -.0009862 .0036487 -0.27 0.787 -.0081785 .0062062
age | .0004106 .0006199 0.66 0.508 -.0008112 .0016325
_cons | 1.058147 .0831861 12.72 0.000 .8941694 1.222125
-------------+----------------------------------------------------------------
sigma_u | .02791698
sigma_e | .01557575
rho | .76260957 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(57, 212) = 5.39 Prob > F = 0.0000
. xttest3
Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model
H0: sigma(i)^2 = sigma^2 for all i
chi2 (58) = 1.7e+29
Prob>chi2 = 0.0000
. xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe vce(robust)
Fixed-effects (within) regression Number of obs = 278
Group variable: ID Number of groups = 58
R-squared: Obs per group:
Within = 0.6058 min = 1
Between = 0.4386 avg = 4.8
Overall = 0.5556 max = 8
F(8,57) = 33.75
corr(u_i, Xb) = -0.4431 Prob > F = 0.0000
(Std. err. adjusted for 58 clusters in ID)
------------------------------------------------------------------------------
| Robust
TOBIN | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
ENV | -.000227 .0001707 -1.33 0.189 -.0005689 .0001148
SOC | .0001672 .0001588 1.05 0.297 -.0001507 .0004852
GOV | .0000361 .000114 0.32 0.753 -.0001922 .0002644
RD_TR | -.0295279 .0253025 -1.17 0.248 -.0801953 .0211395
DE | -.0012643 .0001046 -12.09 0.000 -.0014738 -.0010549
BETA | -.0040603 .0013741 -2.95 0.005 -.006812 -.0013087
SIZE | -.0009862 .0050542 -0.20 0.846 -.011107 .0091347
age | .0004106 .0006892 0.60 0.554 -.0009694 .0017907
_cons | 1.058147 .113507 9.32 0.000 .8308535 1.285441
-------------+----------------------------------------------------------------
sigma_u | .02791698
sigma_e | .01557575
rho | .76260957 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. xttest3
Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model
H0: sigma(i)^2 = sigma^2 for all i
chi2 (58) = 1.7e+29
Prob>chi2 = 0.0000
