Hello,
I want to use the difference-in-differences estimator proposed by de Chaisemartin and D'Haultfoeuille in their paper on two-way fixed effects estimators (https://www.aeaweb.org/articles?id=10.1257/aer.20181169). Specifically, I want to use a DiD approach to analyse how the change of a league format from a non-franchise system to a franchise system in e-sports (competitive video gaming) affected the competition for players.
Below is an example of my data.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float id str12 Player float(hy franchise) str3 Region float region_n double KDA float prob_leave
1 "" 106 . "" . . .8421053
1 "" 107 . "" . . .8421053
1 "" 108 . "" . . .8421053
1 "" 109 . "" . . .8421053
1 "" 110 . "" . . .8421053
1 "" 111 . "" . . .8421053
1 "" 112 . "" . . .8421053
1 "1ntruder" 113 0 "CN" 3 4.3 .8421053
1 "1ntruder" 114 0 "CN" 3 1.8 .8421053
1 "" 115 . "" . . .8421053
1 "1ntruder" 116 1 "CN" 3 3 .8421053
1 "" 117 . "" . . .8421053
1 "" 118 . "" . . .8421053
1 "" 119 . "" . . .8421053
1 "" 120 . "" . . .8421053
1 "" 121 . "" . . .8421053
1 "" 122 . "" . . .8421053
1 "" 123 . "" . . .8421053
1 "" 124 . "" . . .8421053
2 "" 106 . "" . . .6315789
2 "" 107 . "" . . .6315789
2 "" 108 . "" . . .6315789
2 "" 109 . "" . . .6315789
2 "" 110 . "" . . .6315789
2 "" 111 . "" . . .6315789
2 "" 112 . "" . . .6315789
2 "" 113 . "" . . .6315789
2 "" 114 . "" . . .6315789
2 "" 115 . "" . . .6315789
2 "" 116 . "" . . .6315789
2 "" 117 . "" . . .6315789
2 "369" 118 1 "CN" 3 3.4 .6315789
2 "369" 119 1 "CN" 3 3.2 .6315789
2 "369" 120 1 "CN" 3 3.5 .6315789
2 "369" 121 1 "CN" 3 4 .6315789
2 "369" 122 1 "CN" 3 3.9 .6315789
2 "369" 123 1 "CN" 3 3.1 .6315789
2 "369" 124 1 "CN" 3 3.3 .6315789
3 "" 106 . "" . . .9473684
3 "" 107 . "" . . .9473684
3 "" 108 . "" . . .9473684
3 "" 109 . "" . . .9473684
3 "" 110 . "" . . .9473684
3 "" 111 . "" . . .9473684
3 "" 112 . "" . . .9473684
3 "" 113 . "" . . .9473684
3 "" 114 . "" . . .9473684
3 "" 115 . "" . . .9473684
3 "" 116 . "" . . .9473684
3 "" 117 . "" . . .9473684
3 "" 118 . "" . . .9473684
3 "" 119 . "" . . .9473684
3 "" 120 . "" . . .9473684
3 "" 121 . "" . . .9473684
3 "" 122 . "" . . .9473684
3 "5kid" 123 1 "KR" 4 5.1 .9333333
3 "" 124 . "" . . .9473684
4 "" 106 . "" . . .8947368
4 "" 107 . "" . . .8947368
4 "" 108 . "" . . .8947368
4 "" 109 . "" . . .8947368
4 "" 110 . "" . . .8947368
4 "" 111 . "" . . .8947368
4 "" 112 . "" . . .8947368
4 "" 113 . "" . . .8947368
4 "" 114 . "" . . .8947368
4 "" 115 . "" . . .8947368
4 "" 116 . "" . . .8947368
4 "" 117 . "" . . .8947368
4 "" 118 . "" . . .8947368
4 "" 119 . "" . . .8947368
4 "705" 120 1 "CN" 3 2.3 .8947368
4 "705" 121 1 "CN" 3 1.5 .8947368
4 "" 122 . "" . . .8947368
4 "" 123 . "" . . .8947368
4 "" 124 . "" . . .8947368
5 "" 106 . "" . . .6315789
5 "" 107 . "" . . .6315789
5 "" 108 . "" . . .6315789
5 "" 109 . "" . . .6315789
5 "" 110 . "" . . .6315789
5 "" 111 . "" . . .6315789
5 "957" 112 0 "CN" 3 3.6 .6315789
5 "957" 113 0 "CN" 3 4.7 .6315789
5 "957" 114 0 "CN" 3 4.4 .6315789
5 "957" 115 1 "CN" 3 4.2 .6315789
5 "957" 116 1 "CN" 3 4.3 .6315789
5 "957" 117 1 "CN" 3 2.8 .6315789
5 "" 118 . "" . . .6315789
5 "957" 119 1 "CN" 3 1.3 .6315789
5 "" 120 . "" . . .6315789
5 "" 121 . "" . . .6315789
5 "" 122 . "" . . .6315789
5 "" 123 . "" . . .6315789
5 "" 124 . "" . . .6315789
6 "" 106 . "" . . .5263158
6 "" 107 . "" . . .5263158
6 "" 108 . "" . . .5263158
6 "" 109 . "" . . .5263158
6 "" 110 . "" . . .5263158
end
format %th hy
My outcome variable is prob_leave, the probability that a player will stop playing; franchise is my treatment variable; hy is a half-yearly time variable; and region_n is a numeric categorical encoding of my region variable.
I also created dummy variables for my time variable (hy_dummy1-hy_dummy19), for my region variable (reg_dummy1-reg_dummy4), and for the interaction of region and time (_IhyXReg_106_1-_IhyXReg_124_4), which I omit from the example.
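For readers who want to reproduce the setup, here is a minimal sketch of one way dummies like these can be generated. It assumes hy and region_n exist as in the dataex example. The names that xi produces depend on the variable names and on options such as noomit, so they will not necessarily match _IhyXReg_106_1 and the others exactly.

```stata
* Sketch only: one possible way to build the dummies described above.
* Assumes hy and region_n are defined as in the dataex example.

* One indicator per half-year: hy_dummy1, hy_dummy2, ...
tabulate hy, generate(hy_dummy)

* One indicator per region: reg_dummy1, reg_dummy2, ...
tabulate region_n, generate(reg_dummy)

* Main-effect and interaction indicators with the _I prefix
xi i.hy*i.region_n
```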
Now I am wondering how to correctly include time and region fixed effects in my Stata command.
The syntax of the did_multiplegt command is as follows:
Code:
did_multiplegt Y G T D
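Mapped onto the variables in this post, the bare syntax would read roughly as follows. This is a sketch without any controls, using only the variables and options that appear in the commands further down: Y is prob_leave, G is region_n, T is hy, and D is franchise.

```stata
* Sketch only: the generic syntax with this post's variables substituted.
*   Y = prob_leave, G = region_n, T = hy, D = franchise
* breps() and cluster() are taken from the commands shown later in the post.
did_multiplegt prob_leave region_n hy franchise, breps(50) cluster(id)
```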
The groups, G, in my estimations refer to the regions.
I tried the following variations of did_multiplegt. I am not sure whether I need to use the time or region dummies as controls, or in which option specifically I need to include them.
First, I included the region-by-time interaction dummies in the controls option. (The error message can be ignored for now.)
Code:
did_multiplegt prob_leave region_n hy franchise, breps(50) controls(_IhyXReg_106_1 - _IhyXReg_124_4 ) cluster(id)

In some bootstrap replications, the command had to run regressions with strictly more control variables than the sample size, so the controls could not all be accounted for. If you want to solve this problem, you may reduce the number of control variables. You may also use the recat_treatment option to discretize your treatment. Finally, you could reduce the number of placebos and/or dynamic effects requested.

In the main estimation, the command had to run regressions with strictly more control variables than the sample size, so the controls could not all be accounted for. If you want to solve this problem, you may reduce the number of control variables. You may also use the recat_treatment option to discretize your treatment. Finally, you could reduce the number of placebos and/or dynamic effects requested.

DID estimators of the instantaneous treatment effect, of dynamic treatment effects if the dynamic option is used, and of placebo tests of the parallel trends assumption if the placebo option is used. The estimators are robust to heterogeneous effects, and to dynamic effects if the robust_dynamic option is used.

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
    Effect_0 | -.0547876   .0155836  -.0853314  -.0242438        588        210
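As a reading aid for the table, the reported bounds appear consistent with a normal-approximation 95% interval, that is, the estimate plus or minus 1.96 times the bootstrap standard error. This is inferred from the numbers in the output, not taken from the command's documentation. A quick check:

```stata
* Sketch only: recompute the interval for Effect_0 from the estimate and SE.
* The 1.96 multiplier is an assumption inferred from the reported bounds.
display -.0547876 - 1.96*.0155836
display -.0547876 + 1.96*.0155836
```

The two values should agree with the LB CI and UB CI columns up to rounding of the displayed estimate and SE.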
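Next, I instead included the hy and region dummies separately in the controls option, along with KDA.
Code: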
did_multiplegt prob_leave region_n hy franchise, breps(50) controls(KDA hy_dummy1-hy_dummy19 reg_dummy1-reg_dummy4 ) cluster(id)

DID estimators of the instantaneous treatment effect, of dynamic treatment effects if the dynamic option is used, and of placebo tests of the parallel trends assumption if the placebo option is used. The estimators are robust to heterogeneous effects, and to dynamic effects if the robust_dynamic option is used.

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
    Effect_0 | -.0418293   .0180865  -.0772788  -.0063797        588        210
Lastly, I included the dummies in the trends_lin(varlist) option. I tried this option because its description states:
"When this option is specified, fixed effects for each value of varlist are included as controls when residualizing the first-difference of the outcome. This is equivalent to allowing for varlist-specific linear trends."
Including the region-by-time dummies _IhyXReg_106_1 - _IhyXReg_124_4 first, I get:
Code:
did_multiplegt prob_leave region_n hy franchise, breps(50) controls(KDA) trends_lin( _IhyXReg_106_1- _IhyXReg_124_4 ) cluster(id)

DID estimators of the instantaneous treatment effect, of dynamic treatment effects if the dynamic option is used, and of placebo tests of the parallel trends assumption if the placebo option is used. The estimators are robust to heterogeneous effects, and to dynamic effects if the robust_dynamic option is used.

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
    Effect_0 | -.0410487   .0166562  -.0736949  -.0084026        588        210
And including the region and half-year dummies separately in trends_lin, I get:
Code:
did_multiplegt prob_leave region_n hy franchise, breps(50) controls(KDA) trends_lin( reg_dummy1 reg_dummy2 reg_dummy3 reg_dummy4 hy_dummy1-hy_dummy19) cluster(id)

DID estimators of the instantaneous treatment effect, of dynamic treatment effects if the dynamic option is used, and of placebo tests of the parallel trends assumption if the placebo option is used. The estimators are robust to heterogeneous effects, and to dynamic effects if the robust_dynamic option is used.

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
    Effect_0 | -.0419703   .0186222  -.0784699  -.0054707        588        210
The coefficients are all very similar except for the first estimation. But I am still unsure how to include region and time fixed effects correctly.