Hi everybody,
I am analyzing the effect of a school construction program on education. I am using individual-level panel data (5 waves), matched on birthplace to the school construction data.
To identify individuals who were exposed to the program, I use variation in year of birth and region of birth. Individuals born between 1968 and 1972 form the treatment group, and the cohorts born 1957 to 1962 form the control group. I multiply the treatment-cohort dummy by the program intensity in each region, measured as schools built per 1,000 children; the product is youngXnin. I include region-of-birth and year-of-birth fixed effects and cluster the standard errors at the region-of-birth level. In addition, I control for the pre-program enrollment rate, the number of children, and another policy implemented during the same period, all measured at the regional level and each interacted with year of birth. Because the data are a panel, I keep one observation per individual (tag==1), the one with the highest reported years of education (yoe).
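In equation form, the specification I have in mind is roughly the following. This is only a sketch: the indices (i for individual, j for region of birth, k for year of birth) and the symbols T_k (dummy for the 1968 to 1972 cohorts) and N_j (schools built per 1,000 children) are notation introduced here for illustration, and I am assuming that en71, ch71 and wsppc are the regional enrollment rate, number of children and other-policy controls described above.

```latex
\begin{equation}
\begin{aligned}
\mathit{yoe}_{ijk} ={}& \alpha + \beta\,(T_k \times N_j) + \gamma\,\mathit{female}_i + \mu_j + \delta_k \\
 &+ \sum_{l} \mathbf{1}[k = l]\,\bigl(\theta_l\,\mathit{en71}_j + \lambda_l\,\mathit{ch71}_j + \pi_l\,\mathit{wsppc}_j\bigr) + \varepsilon_{ijk}
\end{aligned}
\end{equation}
```

Here \mu_j and \delta_k are the region-of-birth and year-of-birth fixed effects, and \beta is the coefficient reported on youngXnin.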
I have run the following regression:
Code:
areg yoe youngXnin i.yob i.yob#c.en71 i.yob#c.ch71 i.yob#c.wsppc female if tag==1, abs(birthpl) cluster(birthpl)
My result looks like this:
Code:
note: 1972.yob#c.en71 omitted because of collinearity
note: 1972.yob#c.ch71 omitted because of collinearity
note: 1972.yob#c.wsppc omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =      5,986
Absorbed variable: birthpl                      No. of categories =        242
                                                F(  42,    241)   =      35.82
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2778
                                                Adj R-squared     =     0.2420
                                                Root MSE          =     3.4042

                               (Std. Err. adjusted for 242 clusters in birthpl)
------------------------------------------------------------------------------
             |               Robust
         yoe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   youngXnin |   .1817468   .1288603     1.41   0.160    -.0720894     .435583
      female |  -1.008165   .1054618    -9.56   0.000     -1.21591   -.8004209
             |
         yob |
       1958  |  -1.454403   .6948977    -2.09   0.037    -2.823252   -.0855548
       1959  |   1.006075   .5946716     1.69   0.092    -.1653431    2.177492
       1960  |   1.720255   .6202648     2.77   0.006     .4984226    2.942088
       1961  |   .5529631   .5809999     0.95   0.342    -.5915233    1.697449
       1962  |   1.401703   .5385698     2.60   0.010     .3407979    2.462608
       1968  |   2.022092   .5760126     3.51   0.001     .8874304    3.156755
       1969  |   3.192242   .5136172     6.22   0.000      2.18049    4.203994
       1970  |    3.68983   .5406953     6.82   0.000     2.624738    4.754922
       1971  |   3.338817   .5653397     5.91   0.000     2.225179    4.452455
       1972  |   3.322119   .6062628     5.48   0.000     2.127868     4.51637
             |
  yob#c.en71 |
       1957  |  -3.188886   3.273782    -0.97   0.331    -9.637765    3.259993
       1958  |   3.326529   2.634466     1.26   0.208    -1.862991    8.516049
       1959  |   .0702168   1.809247     0.04   0.969     -3.49374    3.634174
       1960  |  -2.737319   1.668069    -1.64   0.102    -6.023176    .5485377
       1961  |  -1.312241   2.450678    -0.54   0.593    -6.139723    3.515242
       1962  |   .7871137   1.437824     0.55   0.585    -2.045193    3.619421
       1968  |   1.592497   1.799092     0.89   0.377    -1.951456     5.13645
       1969  |   .3691274   1.573141     0.23   0.815    -2.729733    3.467988
       1970  |   1.954537    1.64173     1.19   0.235    -1.279435    5.188509
       1971  |   .1603102   1.921413     0.08   0.934    -3.624596    3.945217
       1972  |          0  (omitted)
             |
  yob#c.ch71 |
       1957  |   5.17e-06   2.41e-06     2.15   0.033     4.30e-07    9.92e-06
       1958  |   4.08e-06   2.19e-06     1.86   0.064    -2.43e-07    8.40e-06
       1959  |  -7.02e-07   2.20e-06    -0.32   0.750    -5.04e-06    3.64e-06
       1960  |  -6.26e-07   2.18e-06    -0.29   0.774    -4.91e-06    3.66e-06
       1961  |   2.50e-06   2.22e-06     1.12   0.262    -1.88e-06    6.88e-06
       1962  |  -1.28e-06   2.22e-06    -0.58   0.564    -5.66e-06    3.09e-06
       1968  |   1.53e-06   2.35e-06     0.65   0.516    -3.11e-06    6.17e-06
       1969  |  -8.46e-07   2.15e-06    -0.39   0.695    -5.09e-06    3.40e-06
       1970  |  -1.76e-06   1.95e-06    -0.90   0.369    -5.61e-06    2.09e-06
       1971  |  -8.76e-07   2.00e-06    -0.44   0.662    -4.82e-06    3.06e-06
       1972  |          0  (omitted)
             |
 yob#c.wsppc |
       1957  |   1.761464   1.080673     1.63   0.104    -.3673056    3.890234
       1958  |   2.369885   .9174623     2.58   0.010      .562616    4.177154
       1959  |   .8455982    .492575     1.72   0.087    -.1247038      1.8159
       1960  |   .0375649   .4819009     0.08   0.938    -.9117106    .9868404
       1961  |   1.813205   .8031514     2.26   0.025     .2311121    3.395298
       1962  |   .4528254   .3856467     1.17   0.241    -.3068432    1.212494
       1968  |   .0566797   .4649344     0.12   0.903    -.8591742    .9725336
       1969  |   .3267577    .306535     1.07   0.288    -.2770721    .9305876
       1970  |  -.6072225   .4483307    -1.35   0.177    -1.490369    .2759245
       1971  |   .5620162   .4839056     1.16   0.247    -.3912082    1.515241
       1972  |          0  (omitted)
             |
       _cons |   5.588359   .5779029     9.67   0.000     4.449974    6.726745
------------------------------------------------------------------------------
Several studies have analyzed the effect of this program on education, e.g.:
Duflo, E. (2001). Schooling and labor market consequences of school construction in Indonesia: Evidence from an unusual policy experiment. American Economic Review 91(4): 795-813.
Mazumder, B., et al. (2019). Intergenerational human capital spillovers: Indonesia's school construction and its effects on the next generation. AEA Papers and Proceedings.
Both find significant effects of the program on education. Duflo (2001) uses different data, but Mazumder et al. (2019) use data from the same source as I do.
Question:
I am wondering where the differences in the magnitude and significance of my estimate, compared with these studies, come from. I have checked my data and code several times, but I cannot find the problem.
Also, as soon as I add the fixed effects, the estimate on the treatment variable loses its significance. Would this indicate that there is not enough variation across cohorts and regions?
Do you have any ideas?
Any advice is appreciated!
