Hi,
A few days ago I opened a thread asking for advice on my model specification, in which Carlo Lazzaro gave very useful feedback. The thread can be found here (with example data).
Until now I was pretty sure that -fe- would be the right approach, as 99% of the research on my research question applies -fe-.
Just to be sure, I spent the last few days running some tests to decide which model to choose.
As I have to assume heteroskedasticity and intragroup correlation, I cannot really rely on the Hausman test, so I followed the advice to use -xtoverid- first. It gave me a P-value of 0.0487 (first code block below).
As nearly every paper on my research topic does, I winsorized all variables at the 1st and 99th percentiles (even though I know that winsorizing is heavily debated). However, after doing so, the result of -xtoverid- changed substantially, to a P-value of 0.2555 (second code block below).
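A minimal sketch of what winsorizing at the 1st and 99th percentiles amounts to, using only built-in commands. The three variables in the loop and the _w suffix are illustrative assumptions, not the code actually used for the results in this post:

```stata
* Sketch only: winsorize a few continuous variables at the 1st/99th percentile.
* The variable list and the _w suffix are assumptions made for illustration.
foreach v of varlist Deal_Value Acq_MktValue Acq_TobinsQ {
    tempname lo hi
    quietly summarize `v', detail
    scalar `lo' = r(p1)                      // 1st percentile
    scalar `hi' = r(p99)                     // 99th percentile
    generate double `v'_w = `v'
    * values below the 1st percentile are set to the 1st percentile
    quietly replace `v'_w = `lo' if `v' < `lo'
    * values above the 99th percentile are set to the 99th percentile;
    * missing values compare as larger than any number, so exclude them
    quietly replace `v'_w = `hi' if `v' > `hi' & !missing(`v')
}
```

Because the original variables are left untouched, the estimation can be run once on the raw and once on the _w versions to compare the test results.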
I was even more confused after applying the Mundlak approach via -mundlak- on the winsorized data (third code block below).
Here, Prob > chi2 = 0.0002, strongly suggesting that -fe- is the way to go.
However, I am not sure whether -mundlak- as used here accounts for heteroskedasticity and intragroup correlation. As I couldn't find a -robust- option in the help file, I tried to rebuild the approach manually (fourth code block below), following this post: https://blog.stata.com/2015/10/29/fi...dlak-approach/
Doing so results in Prob > chi2 = 0.3076, which is again far from the results I got above.
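For readers unfamiliar with it, the Mundlak approach augments the random-effects regression with the group means of the regressors and tests whether the coefficients on those means are jointly zero. A sketch in generic notation (the symbols are illustrative and not taken from the linked post):

```latex
y_{it} = \alpha + x_{it}'\beta + \bar{x}_i'\gamma + u_i + \varepsilon_{it},
\qquad
\bar{x}_i = \frac{1}{T_i}\sum_{t=1}^{T_i} x_{it},
\qquad
H_0:\ \gamma = 0
```

Rejecting H_0 points towards -fe-, while failing to reject is consistent with -re-. When the regression is fitted with vce(cluster Acq_ID), the joint Wald test on \gamma is based on the cluster-robust variance estimate.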
My questions would be:
1. Is -xtoverid- very sensitive to outliers? (Without seeing my entire dataset this question may have no obvious answer.)
2. Do you have any idea why the results of -xtoverid- and -mundlak- differ so much?
3. What did I do wrong in the manual Mundlak approach that would explain the huge difference between the two Mundlak approaches?
Thanks in advance
Input:
Code:
. xtreg Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals CFO_Perc_Own_Dir CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure Deal_Value Targ_Listed Deal_S
> tructure Deal_No_Bidders Deal_Div_FF12 Acq_MktValue Acq_Leverage Acq_ROA Acq_Cash_holdings Acq_TobinsQ Acq_FCF Acq_No_Deals, re

Random-effects GLS regression                   Number of obs     =      2,521
Group variable: Acq_ID                          Number of groups  =        980

R-squared:                                      Obs per group:
     Within  = 0.0212                                         min =          1
     Between = 0.0521                                         avg =        2.6
     Overall = 0.0331                                         max =         75

                                                Wald chi2(20)     =          .
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =          .

-----------------------------------------------------------------------------------
  Acq_CAR_1_1_ES2 | Coefficient Std. err.        z  P>|z|      [95% conf. interval]
------------------+----------------------------------------------------------------
     CFO_PaySlice |   .0106131   .0198141     0.54  0.592    -.0282218     .0494481
CFO_No_Boardsitze |  -.0000494    .001944    -0.03  0.980    -.0038597     .0037609
     CFO_No_Deals |  -.0002787   .0006399    -0.44  0.663    -.0015329     .0009754
 CFO_Perc_Own_Dir |  -.0000603   .0002286    -0.26  0.792    -.0005082     .0003877
        CFO_Board |   .0060153   .0070377     0.85  0.393    -.0077783      .019809
          CFO_Age |   1.08e-06   .0002584     0.00  0.997    -.0005054     .0005076
       CFO_Gender |   .0039131    .005748     0.68  0.496    -.0073527      .015179
          CFO_MBA |  -.0054712   .0032979    -1.66  0.097    -.0119349     .0009926
          CFO_CPA |   .0008035   .0032792     0.25  0.806    -.0056236     .0072305
       CFO_Tenure |  -9.74e-07   1.48e-06    -0.66  0.511    -3.88e-06     1.93e-06
       Deal_Value |  -1.81e-12   4.01e-13    -4.51  0.000    -2.60e-12    -1.02e-12
      Targ_Listed |  -.0133341   .0037269    -3.58  0.000    -.0206386    -.0060295
   Deal_Structure |   .0000302    .000617     0.05  0.961     -.001179     .0012394
  Deal_No_Bidders |    .023375   .0123418     1.89  0.058    -.0008145     .0475645
    Deal_Div_FF12 |  -.0007084   .0032532    -0.22  0.828    -.0070846     .0056678
     Acq_MktValue |   3.92e-11   4.11e-11     0.95  0.341    -4.14e-11     1.20e-10
     Acq_Leverage |   .0058658   .0080689     0.73  0.467    -.0099489     .0216806
          Acq_ROA |   .0101078   .0252952     0.40  0.689    -.0394698     .0596855
Acq_Cash_holdings |   -.019994   .0094826    -2.11  0.035    -.0385794    -.0014085
      Acq_TobinsQ |  -.0013971   .0003883    -3.60  0.000    -.0021581     -.000636
          Acq_FCF |   .0430743   .0285144     1.51  0.131    -.0128129     .0989614
     Acq_No_Deals |  -.0004628   .0002621    -1.77  0.077    -.0009765     .0000508
            _cons |  -.0189153   .0196203    -0.96  0.335    -.0573704     .0195397
------------------+----------------------------------------------------------------
          sigma_u |  .03276371
          sigma_e |  .06290979
              rho |  .21336491   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------

. xtoverid, robust cluster(Acq_ID)
Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Acq_ID)
Sargan-Hansen statistic 31.518 Chi-sq(20) P-value = 0.0487
Code:
. xtoverid, robust cluster(Acq_ID)
Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Acq_ID)
Sargan-Hansen statistic 23.705 Chi-sq(20) P-value = 0.2555
Code:
mundlak Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals CFO_Perc_Own_Dir ///
    CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure Deal_Value Targ_Listed ///
    Deal_Structure Deal_No_Bidders Deal_Div_FF12 Acq_MktValue Acq_Leverage Acq_ROA ///
    Acq_Cash_holdings Acq_TobinsQ Acq_FCF Acq_No_Deals
estimates replay Mundlak test
Code:
// Mundlak manually
bysort Acq_ID: egen mean_x2 = mean(CFO_PaySlice)
bysort Acq_ID: egen mean_x3 = mean(CFO_No_Boardsitze)
bysort Acq_ID: egen mean_x4 = mean(CFO_No_Deals)
bysort Acq_ID: egen mean_x5 = mean(CFO_Perc_Own_Dir)
bysort Acq_ID: egen mean_x6 = mean(CFO_Board)
bysort Acq_ID: egen mean_x7 = mean(CFO_Age)
bysort Acq_ID: egen mean_x8 = mean(CFO_Gender)
bysort Acq_ID: egen mean_x9 = mean(CFO_MBA)
bysort Acq_ID: egen mean_x10 = mean(CFO_CPA)
bysort Acq_ID: egen mean_x11 = mean(CFO_Tenure)
bysort Acq_ID: egen mean_x12 = mean(Deal_Value)
bysort Acq_ID: egen mean_x13 = mean(Targ_Listed)
bysort Acq_ID: egen mean_x14 = mean(Deal_Structure)
bysort Acq_ID: egen mean_x15 = mean(Deal_No_Bidders)
bysort Acq_ID: egen mean_x16 = mean(Deal_Div_FF12)
bysort Acq_ID: egen mean_x17 = mean(Acq_MktValue)
bysort Acq_ID: egen mean_x18 = mean(Acq_Leverage)
bysort Acq_ID: egen mean_x19 = mean(Acq_ROA)
bysort Acq_ID: egen mean_x20 = mean(Acq_Cash_holdings)
bysort Acq_ID: egen mean_x21 = mean(Acq_TobinsQ)
bysort Acq_ID: egen mean_x22 = mean(Acq_FCF)
bysort Acq_ID: egen mean_x23 = mean(Acq_No_Deals)
quietly xtreg Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals CFO_Perc_Own_Dir ///
    CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure Deal_Value Targ_Listed ///
    Deal_Structure Deal_No_Bidders Deal_Div_FF12 Acq_MktValue Acq_Leverage Acq_ROA ///
    Acq_Cash_holdings Acq_TobinsQ Acq_FCF Acq_No_Deals mean_x*, vce(cluster Acq_ID)
estimates store mundlak
test mean_x2 mean_x3 mean_x4 mean_x5 mean_x6 mean_x7 mean_x8 mean_x9 mean_x10 mean_x11 ///
    mean_x12 mean_x13 mean_x14 mean_x15 mean_x16 mean_x17 mean_x18 mean_x19 mean_x20 ///
    mean_x21 mean_x22 mean_x23
