Hello,
I know that the basic setup for a difference-in-differences estimation that involves two time periods and a control and treatment group is:
Y = B0 + B1(treatment) + B2(posttreatment) + B3 (treatment*posttreatment),
where treatment = 1 if the observation is in the treatment group, and posttreatment = 1 if the time period is after treatment. B3 is the parameter of interest, i.e. the estimate of the impact of the treatment.
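To make the two-period case concrete, here is a minimal sketch of how B0 through B3 relate to the four group-by-period means. With only the two dummies and their interaction on the right-hand side, the OLS coefficients equal these differences in means. The data values and the helper names (`cell_mean`, `did_coefficients`) are made up for illustration.

```python
from statistics import mean

# Hypothetical toy observations: (treatment, posttreatment, Y)
rows = [
    (0, 0, 10.0), (0, 0, 12.0),  # control group, before treatment
    (0, 1, 11.0), (0, 1, 13.0),  # control group, after treatment
    (1, 0, 20.0), (1, 0, 22.0),  # treatment group, before treatment
    (1, 1, 16.0), (1, 1, 18.0),  # treatment group, after treatment
]


def cell_mean(rows, treatment, post):
    """Average Y over one treatment-by-period cell."""
    return mean(y for t, p, y in rows if t == treatment and p == post)


def did_coefficients(rows):
    """Return (B0, B1, B2, B3) of the saturated two-period model from cell means."""
    b0 = cell_mean(rows, 0, 0)                      # control mean, before
    b1 = cell_mean(rows, 1, 0) - b0                 # pre-period gap between groups
    b2 = cell_mean(rows, 0, 1) - b0                 # change over time in the control group
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    b3 = treated_change - b2                        # difference-in-differences
    return b0, b1, b2, b3


b0, b1, b2, b3 = did_coefficients(rows)
print(b0, b1, b2, b3)
```

On this toy data the treated group falls by 4 while the control group rises by 1, so B3 comes out at -5.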
I have 8 years of monthly consumption data (2007-2015) for 16,000 households. I split these households into 6 treatment groups (binned by lawn size) and want to see the differential impact of a policy that was in effect for a two-year period (2009-2011); call the corresponding dummy "restrictions." I also want to see whether some habit formation occurs after the policy ends (there are 4 years of data after it ends).
Is the correct way to do this to generate a dummy for each lawn size bin (6 treatment groups), then generate 6 interactions between each treatment group dummy and the "restrictions" dummy, and another 6 interactions with a "restrictions_off" dummy that equals 1 for the 4-year period after the restrictions ended? If so, does this all go in one regression, or do I estimate it separately for each lawn size group? I am also wondering whether I need a reference group of some sort. Formally, my options (I believe) are:
A) Y = B0 + B1(treatment_1) + B2(restrictions) + B3(treatment_1*restrictions) + B4(treatment_2) + B5(treatment_2*restrictions) + B6(treatment_3) + B7(treatment_3*restrictions)...
or
B) For i = 1 to 6 separately: Y = B0 + B1(treatment_i) + B2(restrictions) + B3(treatment_i*restrictions)
C) Then run either A) or B) again with a "restrictions_off" dummy instead of the "restrictions" dummy, or
D) Include the "restrictions_off" dummy in one of the above?
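As a rough sketch of what option D combined with A might look like, the snippet below treats one lawn-size bin as the reference group and computes, for every other bin, the interaction with "restrictions" and with "restrictions_off" relative to the pre-policy period. It assumes a fully saturated model (bin dummies, both period dummies, and all their interactions, with no household fixed effects or other controls), in which case each interaction coefficient is a difference of cell means. The toy numbers, the choice of bin 1 as reference, and the function name `interaction_estimates` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (lawn_bin, period, Y),
# where period is "pre", "restrictions", or "restrictions_off".
rows = [
    (1, "pre", 10.0), (1, "restrictions", 9.0), (1, "restrictions_off", 10.0),
    (2, "pre", 20.0), (2, "restrictions", 15.0), (2, "restrictions_off", 18.0),
    (3, "pre", 30.0), (3, "restrictions", 20.0), (3, "restrictions_off", 24.0),
]


def interaction_estimates(rows, reference_bin=1):
    """Bin-by-period interaction terms relative to a reference bin and the pre period."""
    cells = defaultdict(list)
    for lawn_bin, period, y in rows:
        cells[(lawn_bin, period)].append(y)
    means = {key: mean(values) for key, values in cells.items()}

    estimates = {}
    for lawn_bin in sorted({b for b, _ in means}):
        if lawn_bin == reference_bin:
            continue  # the reference bin has no interaction terms of its own
        for period in ("restrictions", "restrictions_off"):
            own_change = means[(lawn_bin, period)] - means[(lawn_bin, "pre")]
            ref_change = means[(reference_bin, period)] - means[(reference_bin, "pre")]
            estimates[(lawn_bin, period)] = own_change - ref_change
    return estimates


estimates = interaction_estimates(rows)
for key, value in sorted(estimates.items()):
    print(key, value)
```

One bin has to be left out here: if all 6 bin interactions were included alongside the "restrictions" main effect, the interactions would sum to the main effect and the model would be perfectly collinear.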
If I can provide any more detail, please let me know. Thank you.