Hi, my name is David and this is my first time posting to this forum.
I'm actually coming from R, so I hope you are not too harsh on the code below for not being too elegant.
Say I have a panel of four banks, which are overseen by five administrative authorities ("Admin_A", ..., "Admin_E").
As the data indicate, in 2019 Banks C and D are overseen by Authorities C and D ("Admin_C", "Admin_D"), which merge in 2020 to form Authority E ("Admin_E"). Therefore, the units of observation (the banks) are not "nested within clusters", because they appear in multiple clusters over time. This reflects the "real world" and is factually correct, i.e., there is no error in the data.
I am wondering about the correct econometric approach in such a situation (assuming that clustered standard errors are the "way to go").
- Do I use the "nonest" option and force Stata to compute clustered standard errors anyway?
- Do I "fix" the clusters as they are in the first period (2019) and assume that Banks C and D continue to be overseen by Admin_C and Admin_D? This does not reflect the real data: it assumes that Admin_E does not exist and therefore ignores any effect the merger has on the supervised banks.
- Are there any other appropriate solutions that I have not yet come up with? The literature is not very clear on this issue.
Below is a minimal working example.
Best wishes,
David.
Code:
version 17 // Should work on older versions as well.
clear all

/* Example dataset: the actual numbers do not matter. */
input str20 firm year y cvar1 cvar2 str20 clust
"Bank_A" 2019 0.090 0.324 0.234 "Admin_A"
"Bank_A" 2020 0.808 0.234 0.182 "Admin_A"
"Bank_A" 2021 1.592 8.289 1.582 "Admin_A"
"Bank_B" 2019 8.294 5.283 1.534 "Admin_B"
"Bank_B" 2020 7.284 4.272 1.643 "Admin_B"
"Bank_B" 2021 5.298 2.524 -5.25 "Admin_B"
"Bank_C" 2019 8.252 2.553 1.53 "Admin_C"
"Bank_C" 2020 6.153 6.535 8.535 "Admin_E"
"Bank_C" 2021 5.255 2.645 1.564 "Admin_E"
"Bank_D" 2019 4.253 5.256 2.654 "Admin_D"
"Bank_D" 2020 5.256 0.532 5.285 "Admin_E"
"Bank_D" 2021 6.594 5.352 1.564 "Admin_E"
end

// Encode the string firm identifier so it can serve as the panel ID.
encode firm, gen(firm_enc)
summarize

// Set panel IDs.
xtset firm_enc year

// Run the TWFE regression and force clustered standard errors.
xtreg y cvar1 cvar2 i.year, fe vce(cluster clust) nonest
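For comparison, the second option from my list (fixing each bank's cluster at its 2019 authority) could be sketched roughly as follows, continuing from the example data above. The variable names clust_2019 and clust_2019_enc are just names I made up for illustration, and I am not claiming this is the correct approach. It only shows what "fixing" the clusters would look like in code.

```stata
* Sketch only: freeze each bank's cluster at its first-period (2019) authority.
* clust_2019 and clust_2019_enc are hypothetical names introduced for this example.
bysort firm (year): gen str20 clust_2019 = clust[1]

* Encode the frozen cluster label as a numeric variable.
encode clust_2019, gen(clust_2019_enc)

* Banks are now nested within clusters, so nonest is no longer needed.
xtset firm_enc year
xtreg y cvar1 cvar2 i.year, fe vce(cluster clust_2019_enc)
```

Note that in this toy dataset every bank has its own authority in 2019, so the frozen clusters coincide with the banks and this amounts to clustering at the bank level.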