Hello! I am new to stata and am investigating the relationship between paid leave (maternal and parental) and female labour force participation rates with OECD paneldata. Literature that has also investigated this relationship uses a FE model with year dummies and country-specific time trends and find an inverse u-relationship. Unfortunately, I find different results and struggling with the fit of my data.
I have done multiple tests where I have found out that i should include year dummies, trends and clustered SE (due to heteroscedasticity and autocorrelation) and that I should rely on a FE, rather than a RE model.
When doing:
I get a weird extreme point and graphing the result also shows that it is probably not the best fit

When not including year dummies and trends, my fit seems better.
the u-test (presence of u-shape) becomes insignificant when clustering at country-level.

However the tests did say I should include those and my r-squared goes down drastically when doing so.
Model 1 is a naive OLS model, 2 a fixed effectsmodel without trends and year dummies, 3 is a fixed effects model with year dummies, 4 a fixed effects model with year dummies and trends, 5 is the same as 4 but with SE clustered at the country-level.
When including controls (GDP, female unemployment rate, childcare costs, birth rate), the problem stays the same and I lose a lot of observations as the data is not available for the same nr. of countries/years.
What do you think when seeing this? I do not know which model to go for or how to explain these results.
Thank you in advance
I have done multiple tests where I have found out that i should include year dummies, trends and clustered SE (due to heteroscedasticity and autocorrelation) and that I should rely on a FE, rather than a RE model.
When doing:
xtset cou_1 year
xtreg flfpr2554 moth_leave moth_leave_2 i.year i.cou_1#c.t, fe i(cou_1) vce(cluster cou_1)
nlcom _b[moth_leave]/(2*-_b[moth_leave_2])
adopath ++ "m:\p_arbeid\p_betaald_verlof\SSC install"
utest moth_leave moth_leave_2
(129 missing values generated)
Specification: f(x)=x^2
Extreme point: 538.5394
Test:
H1: Inverse U shape
vs. H0: Monotone or U shape
-------------------------------------------------
| Lower bound Upper bound
-----------------+-------------------------------
Interval | 0 198
Slope | .0351557 .0222303
-------------------------------------------------
Extremum outside interval - trivial failure to reject H0
xtreg flfpr2554 moth_leave moth_leave_2 i.year i.cou_1#c.t, fe i(cou_1) vce(cluster cou_1)
nlcom _b[moth_leave]/(2*-_b[moth_leave_2])
adopath ++ "m:\p_arbeid\p_betaald_verlof\SSC install"
utest moth_leave moth_leave_2
(129 missing values generated)
Specification: f(x)=x^2
Extreme point: 538.5394
Test:
H1: Inverse U shape
vs. H0: Monotone or U shape
-------------------------------------------------
| Lower bound Upper bound
-----------------+-------------------------------
Interval | 0 198
Slope | .0351557 .0222303
-------------------------------------------------
Extremum outside interval - trivial failure to reject H0
Code:
graph twoway (scatter flfpr2554 moth_leave) (function y=[44.7195]+x*[.0351557]+x^2*[-.0000326], range(0 800))
When not including year dummies and trends, my fit seems better.
Code:
use "m:\p_arbeid\p_betaald_verlof\Data\Stata files\7-1-regressions-2018", clear xtset cou_1 year gen moth_leave_2 = moth_leave*moth_leave xtreg flfpr2554 moth_leave moth_leave_2, fe i(cou_1) nlcom _b[moth_leave]/(2*-_b[moth_leave_2]) adopath ++ "m:\p_arbeid\p_betaald_verlof\SSC install\" utest moth_leave moth_leave_2 Specification: f(x)=x^2 Extreme point: 138.853 Test: H1: Inverse U shape vs. H0: Monotone or U shape ------------------------------------------------- | Lower bound Upper bound -----------------+------------------------------- Interval | 0 198 Slope | .4256187 -.1813002 t-value | 12.05258 -2.726574 P>|t| | 3.16e-31 .0032678 ------------------------------------------------- Overall test of presence of a Inverse U shape: t-value = 2.73 P>|t| = .00327
the u-test (presence of u-shape) becomes insignificant when clustering at country-level.
Code:
use "m:\p_arbeid\p_betaald_verlof\Data\Stata files\7-1-regressions-2018", clear xtset cou_1 year gen moth_leave_2 = moth_leave*moth_leave xtreg flfpr2554 moth_leave moth_leave_2, fe i(cou_1) vce(cluster cou_1) nlcom _b[moth_leave]/(2*-_b[moth_leave_2]) adopath ++ "m:\p_arbeid\p_betaald_verlof\SSC install\" utest moth_leave moth_leave_2 . utest moth_leave moth_leave_2 (129 missing values generated) Specification: f(x)=x^2 Extreme point: 138.853 Test: H1: Inverse U shape vs. H0: Monotone or U shape ------------------------------------------------- | Lower bound Upper bound -----------------+------------------------------- Interval | 0 198 Slope | .4256187 -.1813002 t-value | 3.384718 -.6482774 P>|t| | .0008667 .2604597 ------------------------------------------------- Overall test of presence of a Inverse U shape: t-value = 0.65 P>|t| = .26
However the tests did say I should include those and my r-squared goes down drastically when doing so.
Model 1 is a naive OLS model, 2 a fixed effectsmodel without trends and year dummies, 3 is a fixed effects model with year dummies, 4 a fixed effects model with year dummies and trends, 5 is the same as 4 but with SE clustered at the country-level.
Code:
(1) (2) (3) (4) (5) VARIABLES flfpr2554 flfpr2554 flfpr2554 flfpr2554 flfpr2554 moth_leave_div 16.1989*** 42.5619*** -13.2572*** 3.5156** 3.5156 (3.0404) (3.5313) (2.7618) (1.5361) (3.0979) moth_leave_div_2 -7.3050*** -15.3262*** 7.1199*** -0.3264 -0.3264 (1.4861) (2.3920) (1.6326) (0.9886) (1.7249) Constant 46.6809*** 52.0218*** 47.3923*** 44.7195*** 44.7195*** (4.8106) (1.1491) (1.9779) (0.9649) (2.0572) Observations 863 863 863 863 863 R-squared 0.3828 0.2037 0.7314 0.9618 0.9618 Specification Quadratic Quadratic Quadratic Quadratic Quadratic Method LPM FE FE and year dummies FE and year dummies FE and year dummies Controls NO NO NO NO NO Clustering NO NO NO NO YES Number of cou_1 37 37 37 37 37 Standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
What do you think when seeing this? I do not know which model to go for or how to explain these results.
Thank you in advance
Comment