Hello, all Statalists!
I want to measure the effect of a rail station opening in a city on the unemployment rate of that city's residents.
My dataset is a panel covering the years 2010-2019.
I study rail stations that opened in four cities (codes 240, 874, 7700, 9200) on the same day in October 2016.
The variable "code" is the zip code and "unemployed_rate" is the unemployment rate.
Thus, I constructed a simple difference-in-differences model with year and city fixed effects:
Code:
* Define treatment
gen treat = 0
replace treat = 1 if (code == 9200) | (code == 7700) | (code == 874) | (code == 240)
* Define post treatment period
gen post = (year>2016)
* Define interaction term
gen treatXpost = treat*post
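In equation form, the specification these variables are meant to support can be sketched as follows. The notation is an assumption for illustration and does not come from the post itself: y_{ct} is unemployed_rate for code c in year t, \alpha_c and \lambda_t are the city and year fixed effects, and \beta is the difference-in-differences coefficient on treatXpost.

```latex
y_{ct} = \alpha_c + \lambda_t + \beta \, (\mathrm{treat}_c \times \mathrm{post}_t) + \varepsilon_{ct}
```

Because post is defined as year>2016, the opening year 2016 is counted as pre-treatment even though the stations opened in October of that year.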
Here is a subset of my dataset:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double unemployed_rate float(treat post treatXpost) int(year code)
5.1819999999999995 1 0 0 2010 240
5.051666666666668 1 0 0 2011 240
4.669999999999999 1 0 0 2012 240
4.8933333333333335 1 0 0 2013 240
4.400833333333335 1 0 0 2014 240
3.750833333333333 1 0 0 2015 240
3.5975000000000006 1 0 0 2016 240
3.045833333333334 1 1 1 2017 240
3.2591666666666663 1 1 1 2018 240
3.4866666666666664 1 1 1 2019 240
8.174166666666666 1 0 0 2010 874
7.675 1 0 0 2011 874
7.6675 1 0 0 2012 874
7.560833333333335 1 0 0 2013 874
6.923333333333334 1 0 0 2014 874
6.148333333333333 1 0 0 2015 874
5.636666666666666 1 0 0 2016 874
5.0441666666666665 1 1 1 2017 874
5.184166666666667 1 1 1 2018 874
5.230833333333332 1 1 1 2019 874
6.3954545454545455 0 0 0 2010 2800
6.121666666666667 0 0 0 2011 2800
5.351666666666667 0 0 0 2012 2800
6.265 0 0 0 2013 2800
6.279999999999999 0 0 0 2014 2800
5.345833333333333 0 0 0 2015 2800
5.319999999999999 0 0 0 2016 2800
4.803333333333334 0 1 0 2017 2800
4.550833333333334 0 1 0 2018 2800
4.64 0 1 0 2019 2800
9.03 0 0 0 2010 6700
8.739999999999998 0 0 0 2011 6700
8.588333333333333 0 0 0 2012 6700
8.024166666666666 0 0 0 2013 6700
7.1475 0 0 0 2014 6700
6.625 0 0 0 2015 6700
6.050000000000001 0 0 0 2016 6700
5.246666666666667 0 1 0 2017 6700
5.155 0 1 0 2018 6700
5.3933333333333335 0 1 0 2019 6700
8.693 1 0 0 2010 7700
8.064166666666669 1 0 0 2011 7700
7.32 1 0 0 2012 7700
7.219166666666667 1 0 0 2013 7700
6.536666666666667 1 0 0 2014 7700
6.0841666666666665 1 0 0 2015 7700
5.583333333333334 1 0 0 2016 7700
4.894166666666666 1 1 1 2017 7700
4.870833333333334 1 1 1 2018 7700
5.370000000000002 1 1 1 2019 7700
9.121000000000002 1 0 0 2010 9200
8.931666666666667 1 0 0 2011 9200
8.7475 1 0 0 2012 9200
8.623333333333335 1 0 0 2013 9200
8.1325 1 0 0 2014 9200
7.823333333333333 1 0 0 2015 9200
7.217499999999999 1 0 0 2016 9200
6.1191666666666675 1 1 1 2017 9200
5.8725000000000005 1 1 1 2018 9200
5.758333333333335 1 1 1 2019 9200
end
Following my model specification, I ran this fixed-effects regression:
Code:
. xtset code year

Panel variable: code (strongly balanced)
 Time variable: year, 2010 to 2019
         Delta: 1 unit

. xtreg unemployed_rate treat post treatXpost i.year, fe
note: treat omitted because of collinearity.
note: 2019.year omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =         60
Group variable: code                            Number of groups  =          6

R-squared:                                      Obs per group:
     Within  = 0.8959                                         min =         10
     Between = 0.0007                                         avg =       10.0
     Overall = 0.4467                                         max =         10

                                                F(10,44)          =      37.86
corr(u_i, Xb) = 0.0002                          Prob > F          =     0.0000

------------------------------------------------------------------------------
unemployed~e | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       treat |          0  (omitted)
        post |  -2.733078   .2969419    -9.20   0.000    -3.331525   -2.134631
  treatXpost |  -.0794973   .2529775    -0.31   0.755    -.5893399    .4303453
             |
        year |
       2011  |  -.3352424   .2443995    -1.37   0.177    -.8277972    .1573123
       2012  |  -.7084369   .2443995    -2.90   0.006    -1.200992   -.2158821
       2013  |   -.668298   .2443995    -2.73   0.009    -1.160853   -.1757432
       2014  |  -1.195798   .2443995    -4.89   0.000    -1.688353   -.7032432
       2015  |   -1.80302   .2443995    -7.38   0.000    -2.295575   -1.310465
       2016  |  -2.198437   .2443995    -9.00   0.000    -2.690992   -1.705882
       2017  |  -.1209722   .2443995    -0.49   0.623     -.613527    .3715825
       2018  |  -.1644444   .2443995    -0.67   0.505    -.6569992    .3281103
       2019  |          0  (omitted)
             |
       _cons |   7.765937   .1728165    44.94   0.000     7.417648    8.114226
-------------+----------------------------------------------------------------
     sigma_u |  1.2343438
     sigma_e |  .42331229
         rho |  .89476537   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(5, 44) = 85.02                      Prob > F = 0.0000
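The two notes in this output follow from the specification itself. treat is constant within each code, so the city fixed effects absorb it, and post equals the sum of the 2017, 2018, and 2019 year dummies, so one year dummy has to be dropped. A minimal sketch of an equivalent specification, assuming the variables defined above, leaves both regressors out; the treatXpost coefficient should be unchanged because the regressors span the same space.

```stata
* Sketch: same model without the redundant regressors.
* treat is absorbed by the code fixed effects (it never varies within a code).
* post is absorbed by the year dummies (post = 2017.year + 2018.year + 2019.year).
xtset code year
xtreg unemployed_rate treatXpost i.year, fe
```

With this form the year coefficients are all measured against 2010, which makes them easier to read than the mix of post and year dummies in the output above.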
I have two questions:
- How can I estimate the treatment effect separately for each treated city, that is, the effect of the station opening on code == 9200, code == 7700, code == 874, and code == 240 individually?
- How can I estimate the treatment effect by city and over time? That is, I want to explore the effect of the station opening on code == 9200 in 2010, 2011, ..., 2019, on code == 7700 in 2010, 2011, ..., 2019, and so on.
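One possible way to approach both questions is sketched below, not as a definitive answer. It assumes the variables defined above and makes two choices that do not come from the original post: each treated city is compared only with the untreated codes, and 2016 is taken as the reference year because post is coded as year>2016.

```stata
* Sketch under the assumptions stated above; variable names are those from the post.
xtset code year

foreach c in 240 874 7700 9200 {
    display as text _newline "Treated city: `c'"

    * Question 1: one post-period effect for this city.
    * Sample = this treated city plus all untreated codes, so
    * treatXpost is switched on only for code `c' after 2016.
    xtreg unemployed_rate treatXpost i.year if code == `c' | treat == 0, fe

    * Question 2: year-by-year gaps between this city and the controls.
    * ib2016.year makes 2016 the base year; the year#treat terms for
    * 2010-2015 act as pre-trend checks, those for 2017-2019 as
    * year-specific effects. The treat main effect is absorbed by the
    * fixed effects and will be reported as omitted.
    xtreg unemployed_rate ib2016.year##i.treat if code == `c' | treat == 0, fe
}
```

Each regression here rests on a single treated city and very few control codes, so the standard errors should be read with caution.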