Dear Statalist,
I have 2 questions.
(1) I am trying to run a difference-in-differences regression where the dependent variable is wealth and the treatment is the death of both parents during my treatment period. Because the sample is small (735 people x 2 years, of whom 53 people are in the treatment group), I use a wide age range, people from 40 to 64 years of age, and my "before" (2006) and "after" (2014) time periods are 8 years apart.
I have 1 more time period (2002) to check for parallel trends in the control and treatment groups before the treatment, and the group means of wealth do look parallel before the treatment and diverge after it (sorry, cannot insert my very convincing graph from Word here). The group means of variables such as age, marital status, gender, and immigrant status look similar for the control and treatment groups. So the setup for difference-in-differences estimation seems quite promising.
In my difference-in-differences estimation I use factor-variable notation and try different specifications: OLS with just the 3 standard difference-in-differences regressors (i.after##i.treatment), and a fixed-effects panel regression. I also try including additional controls in the OLS or fixed-effects models (age, marital status, number of siblings, race, immigrant dummy, etc.).
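To make this concrete, here is a rough sketch of the kinds of commands I mean. The variable names (wealth, after, treatment, id, year, age, married, nsiblings) are placeholders rather than my actual names, and the control list is abbreviated.

```stata
* Placeholder variable names; substitute the real ones.
* (a) OLS with only the three standard DiD regressors
regress wealth i.after##i.treatment

* (b) OLS with additional controls (abbreviated list)
regress wealth i.after##i.treatment c.age i.married c.nsiblings

* (c) Person fixed effects; the time-invariant main effect of
*     treatment is absorbed, the interaction remains identified
xtset id year
xtreg wealth i.after##i.treatment, fe

* (d) The version in question: standard errors clustered on the
*     treatment dummy, which gives only 2 clusters
regress wealth i.after##i.treatment, vce(cluster treatment)
```

In each case the coefficient of interest is the one reported for 1.after#1.treatment.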
In all cases my coefficient of interest, the interaction term i.after#i.treatment, is far from significant (the p-value is around 0.5, so no hope really) UNLESS I cluster my standard errors by the treatment dummy (which is 1 for the treatment group and 0 for the control group). With clustering I get small standard errors and, hence, a significant coefficient of interest. I thought this was legitimate after reading an opinion that clustering should reflect the way we sample. Since this is exactly how I sampled my observations, based on whether they were treated (both parents died) or not (the last parent is still alive), I thought clustering was valid in my case.
Recently, however, I started reading Mostly Harmless Econometrics, and in this book they say that clustering is only valid if the number of clusters is large enough, and in my case I have only 2 clusters.
On the other hand, I read an opinion here, on Statalist, that interaction terms have less "power" than other regressors in terms of statistical significance.
I would really appreciate your thoughts on this problem.
(2) I also have another, smaller question: given my small sample size (a panel of 2 years and 735 people, with 53 people in the treatment group and 682 in control), can I legitimately do quantile regression difference-in-differences estimation, in other words, run the same regression as in question (1) for the median and the deciles? Or does my small sample size argue against it?
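For question (2), what I have in mind is something along these lines, again with placeholder variable names and only as a sketch of the idea, not a claim that it is appropriate with 53 treated people:

```stata
* Placeholder variable names; substitute the real ones.
* Median regression with the same DiD regressors
qreg wealth i.after##i.treatment, quantile(50)

* The same specification at each decile (10th, 20th, ..., 90th)
forvalues q = 10(10)90 {
    qreg wealth i.after##i.treatment, quantile(`q')
}
```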
Many thanks