Good morning and all the best wishes for 2023,
I have an unbalanced panel data set with T>N. Specifically, it contains 59 forward contracts (panel id) in the gas market, with daily data on their price movements from January 2019 until August 2022 (time id). The only difference between these forward contracts is their date of maturity, with maturity dates running monthly from February 2019, March 2019, ..., until December 2023. Since most contracts mature during my data period, the panel is unbalanced. For example, the forward maturing in April 2019 only has data from January 2019 until the end of March 2019, because after that it no longer exists. With this dataset, I am running two models based on FE / pooled OLS, one predicting forward prices and one predicting forward prices minus spot prices. However, I am struggling with the issue of autocorrelation.
It seems quite clear that my models suffer from autocorrelation, since today's price depends on yesterday's price (e.g. due to shocks not captured by the model). To address this, I found that many people use vce(cluster clusterid), with either the panelid or the timeid chosen as clusterid. However, in my context I am not sure which of the two is more suitable. If there is a shock in the gas market that is not captured by the model, all existing forward contracts are affected at the same time for a number of periods, which leads me to think that I should cluster on time. However, forwards near maturity are affected more strongly by such a shock than forwards maturing further in the future, so the effect differs per panelid, which leads me to think that I should cluster on panelid.
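To make the two options concrete, this is roughly what I have in mind. It is only a sketch: the variable names are those in the data example in this post, the regressor list is illustrative rather than my full specification, and I am assuming here that Last is the forward price (the second model would use FminS as the dependent variable instead).

```stata
* Sketch only. Variable names come from the -dataex- excerpt in this post;
* the regressor list is illustrative, not the full specification.
* Assumption: Last is the forward price, FminS is forward minus spot.
xtset ForwardIDNotchronological bcal_date

* (a) Contract fixed effects, standard errors clustered on the contract (panel id)
xtreg Last StorageDifferentialpp HDD TimeToMaturityinDays, fe ///
    vce(cluster ForwardIDNotchronological)

* (b) Same model, standard errors clustered on the trading day (time id).
* Written with regress and contract dummies, because xtreg, fe expects the
* panels to be nested within the clusters.
regress Last StorageDifferentialpp HDD TimeToMaturityinDays ///
    i.ForwardIDNotchronological, vce(cluster bcal_date)
```

Both commands give the same coefficient estimates; they differ only in how the standard errors are computed, which is exactly the choice I am unsure about.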
What would be the best approach to cluster the standard errors? I have also noticed the newey and newey2 commands for time series data, which could be applied to panel data with the force option. Would this be a better approach than choosing between panelid and timeid for clustering? Another alternative would be to include day dummies, but since my dataset has >900 days and the magnitude of the shock differs per forward, I believe this is not a suitable solution. In addition, the last days of my dataset only have 16 observations each (that is, in August 2022 only the forwards maturing between September 2022 and December 2023 remain).
Hopefully, someone can help me with this matter. I want to avoid using lagged forward prices as an explanatory variable, as I believe this would significantly bias the remaining estimates.
Thank you in advance,
Best,
Stefan
Example of data (Note that I have not included all explanatory variables here):
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long bcal_date byte ForwardIDNotchronological str6 Forward double(Last FminS StorageDifferentialpp HDD) float TimeToMaturityinDays
0 1 "Apr-19" 21.073 -1.2649116516113281 4.154918685181064 14.956843224212369 85
1 1 "Apr-19" 20.9 -.9370609521865845 3.9909131097125794 17.365878184036973 84
2 1 "Apr-19" 21.375 -.7146892547607422 3.9044374292301898 17.07259541833382 83
3 1 "Apr-19" 20.504 -.18523378670215607 4.202203443949603 14.724739683093697 80
4 1 "Apr-19" 20.958 -.8720952868461609 4.246697818940581 15.69467289307796 79
5 1 "Apr-19" 20.573 -.6182765364646912 4.0693012133582895 15.787276165430324 78
6 1 "Apr-19" 20.524 -1.4630166292190552 3.8222247533405196 16.466272011858738 77
7 1 "Apr-19" 20.94 -.49131256341934204 3.8908445034195216 17.29398776301633 76
8 1 "Apr-19" 20.525 -.8998981714248657 4.7744742013453685 12.99215511607172 73
9 1 "Apr-19" 20.898 -.34439921379089355 4.501237387327883 13.51847291901612 72
10 1 "Apr-19" 21 -.7477397918701172 4.562905460719746 14.261926277053735 71
11 1 "Apr-19" 21.919 -.8131643533706665 4.664170117632693 13.503087491300732 70
12 1 "Apr-19" 21.725 -.5850738286972046 4.681253153778364 15.081026880807627 69
13 1 "Apr-19" 21.162 -.6925017237663269 4.650215965856718 19.088969293989685 66
14 1 "Apr-19" 21.184 -.5423412322998047 4.411772354184307 18.388939906045923 65
15 1 "Apr-19" 21.505 -.49548637866973877 3.9245698934585627 18.89483980943654 64
16 1 "Apr-19" 20.709 -.5355201959609985 3.7186900750836083 17.282377100504643 63
17 1 "Apr-19" 20.436 -.4014385938644409 3.5360547008440046 18.10689224762214 62
18 1 "Apr-19" 20.135 -.4560987949371338 3.7687662984285386 14.874321544088811 59
19 1 "Apr-19" 20.206 -.6376794576644897 3.566352326421418 15.974830294731296 58
20 1 "Apr-19" 19.802 -.38830000162124634 3.29833195203042 17.493967571237057 57
21 1 "Apr-19" 19.477 -.5069153308868408 2.981014316737751 16.76584064261391 56
22 1 "Apr-19" 19.068 -.1927688866853714 2.799392523755795 14.597080048901223 55
23 1 "Apr-19" 19.149 -.2338964194059372 2.637593057319798 15.876250634486649 52
24 1 "Apr-19" 18.938 .2525656521320343 2.5209027401734807 17.5473240535079 51
25 1 "Apr-19" 18.676 -.35440123081207275 2.337756503306138 16.447075361083506 50
26 1 "Apr-19" 18.631 .023129897192120552 2.6294237438272807 15.99565273565843 49
27 1 "Apr-19" 17.987 -.3012709617614746 2.8100871063979294 13.490421687925783 48
28 1 "Apr-19" 17.897 -.39309626817703247 3.4759679360977955 12.706924209106468 45
29 1 "Apr-19" 17.635 .16231514513492584 3.497102498871546 13.527162965627497 44
30 1 "Apr-19" 17.897 -.2247193604707718 3.50327612733059 14.245016166563882 43
31 1 "Apr-19" 17.671 -.09852362424135208 3.608145819989433 14.409795158178559 42
32 1 "Apr-19" 18.039 .0692562684416771 3.8185903648961848 13.959856574551733 41
33 1 "Apr-19" 17.379 .011648467741906643 4.835302601213643 12.870984868418315 38
34 1 "Apr-19" 17.735 -.16109336912631989 5.0138404787663955 13.067572059304243 37
35 1 "Apr-19" 18.149 .03270895406603813 5.149972499649319 12.789025705567967 36
36 1 "Apr-19" 17.64 .1207510381937027 4.163221041799048 12.821580226979757 35
37 1 "Apr-19" 17.382 .4333782494068146 4.583521163617449 10.535112088816401 34
38 1 "Apr-19" 17.188 .3177798092365265 5.516217922493244 11.949117863930839 31
39 1 "Apr-19" 17.223 -.12606196105480194 5.86440717274691 10.364240168499009 30
40 1 "Apr-19" 17.686 -.11533438414335251 6.211936207064883 9.82799481215315 29
41 1 "Apr-19" 17.814 .03559957817196846 6.55425331481887 9.590275637042588 28
42 1 "Apr-19" 17.321 -.13209639489650726 6.988544178189588 10.20542306151627 27
43 1 "Apr-19" 17.4 -.1615520417690277 8.468005955620184 9.06985082134012 24
44 1 "Apr-19" 17.181 .07196260243654251 8.679538929426261 9.74850269941683 23
45 1 "Apr-19" 16.851 -.10439474135637283 8.951534139386425 9.303878171684856 22
46 1 "Apr-19" 16.868 -.07230380177497864 9.341177956424795 7.570277884266103 21
47 1 "Apr-19" 16.729 -.059389207512140274 9.523276257364127 8.7250182756284 20
48 1 "Apr-19" 16.623 -.2671271860599518 10.147491233144601 10.819773317361037 17
49 1 "Apr-19" 16.209 -.305850625038147 10.123554188792166 11.202640743987685 16
50 1 "Apr-19" 15.725 -.18612995743751526 10.129295641529223 10.754927092076983 15
51 1 "Apr-19" 15.603 .07213623821735382 10.186246922094943 10.899830951251657 14
52 1 "Apr-19" 15.252 .3901197016239166 10.384121232516453 8.913920156042902 13
53 1 "Apr-19" 14.98 -.43232014775276184 10.916931431107729 10.549083139608882 10
54 1 "Apr-19" 15.003 -.23918463289737701 10.870011381154077 11.520664442985296 9
55 1 "Apr-19" 15.213 .11791962385177612 10.90696353719413 11.119413828431366 8
56 1 "Apr-19" 14.563 .00572518166154623 11.049927434990375 10.741380673223665 7
57 1 "Apr-19" 14.454 .254000186920166 11.391536955374848 8.704985444694895 6
58 1 "Apr-19" 14.185 -.16500037908554077 12.438116395037829 8.539151775709938 3
59 1 "Apr-19" 14.384 -.31306058168411255 12.498370361280914 10.685778164916533 2
60 1 "Apr-19" 14.994 .052969496697187424 12.539270947648069 10.62525399933752 1
61 1 "Apr-19" 14.428 .23794034123420715 12.561212696076007 9.232591255840084 0
0 2 "Apr-20" 19.873 -2.464911699295044 4.154918685181064 14.956843224212369 453
1 2 "Apr-20" 19.768 -2.069061040878296 3.9909131097125794 17.365878184036973 452
2 2 "Apr-20" 20.279 -1.8106892108917236 3.9044374292301898 17.07259541833382 451
3 2 "Apr-20" 19.987 -.7022337913513184 4.202203443949603 14.724739683093697 448
4 2 "Apr-20" 20.263 -1.5670952796936035 4.246697818940581 15.69467289307796 447
5 2 "Apr-20" 20.078 -1.1132766008377075 4.0693012133582895 15.787276165430324 446
6 2 "Apr-20" 20.133 -1.854016661643982 3.8222247533405196 16.466272011858738 445
7 2 "Apr-20" 20.362 -1.069312572479248 3.8908445034195216 17.29398776301633 444
8 2 "Apr-20" 20.225 -1.19989812374115 4.7744742013453685 12.99215511607172 441
9 2 "Apr-20" 20.467 -.7753992080688477 4.501237387327883 13.51847291901612 440
10 2 "Apr-20" 20.737 -1.010739803314209 4.562905460719746 14.261926277053735 439
11 2 "Apr-20" 21.292 -1.44016432762146 4.664170117632693 13.503087491300732 438
12 2 "Apr-20" 21.495 -.8150738477706909 4.681253153778364 15.081026880807627 437
13 2 "Apr-20" 21.247 -.607501745223999 4.650215965856718 19.088969293989685 434
14 2 "Apr-20" 21.17 -.5563412308692932 4.411772354184307 18.388939906045923 433
15 2 "Apr-20" 21.369 -.6314863562583923 3.9245698934585627 18.89483980943654 432
16 2 "Apr-20" 20.846 -.39852020144462585 3.7186900750836083 17.282377100504643 431
17 2 "Apr-20" 20.859 .02156141586601734 3.5360547008440046 18.10689224762214 430
18 2 "Apr-20" 20.695 .1039012148976326 3.7687662984285386 14.874321544088811 427
19 2 "Apr-20" 20.864 .020320571959018707 3.566352326421418 15.974830294731296 426
20 2 "Apr-20" 20.77 .5796999931335449 3.29833195203042 17.493967571237057 425
21 2 "Apr-20" 20.654 .6700846552848816 2.981014316737751 16.76584064261391 424
22 2 "Apr-20" 20.498 1.2372311353683472 2.799392523755795 14.597080048901223 423
23 2 "Apr-20" 20.455 1.0721036195755005 2.637593057319798 15.876250634486649 420
24 2 "Apr-20" 20.359 1.6735656261444092 2.5209027401734807 17.5473240535079 419
25 2 "Apr-20" 20.263 1.2325987815856934 2.337756503306138 16.447075361083506 418
26 2 "Apr-20" 20.182 1.5741299390792847 2.6294237438272807 15.99565273565843 417
27 2 "Apr-20" 19.818 1.5297290086746216 2.8100871063979294 13.490421687925783 416
28 2 "Apr-20" 19.837 1.5469037294387817 3.4759679360977955 12.706924209106468 413
29 2 "Apr-20" 19.647 2.1743152141571045 3.497102498871546 13.527162965627497 412
30 2 "Apr-20" 19.932 1.810280680656433 3.50327612733059 14.245016166563882 411
31 2 "Apr-20" 19.81 2.0404763221740723 3.608145819989433 14.409795158178559 410
32 2 "Apr-20" 20.141 2.1712563037872314 3.8185903648961848 13.959856574551733 409
33 2 "Apr-20" 19.692 2.324648380279541 4.835302601213643 12.870984868418315 406
34 2 "Apr-20" 20.073 2.1769065856933594 5.0138404787663955 13.067572059304243 405
35 2 "Apr-20" 20.302 2.185708999633789 5.149972499649319 12.789025705567967 404
36 2 "Apr-20" 20.016 2.496751070022583 4.163221041799048 12.821580226979757 403
37 2 "Apr-20" 19.904 2.955378293991089 4.583521163617449 10.535112088816401 402
end
format %tbcalendar bcal_date