Difference in Differences model with more than two periods of treatment in panel data

Michael Loden

Join Date: Feb 2020

Posts: 2
#16

28 Feb 2020, 05:13

I've read the paper and it is indeed helpful in clearing up my confusion regarding the approach, thank you Clyde.
Kumari Gunjan

Join Date: Jun 2021

Posts: 18
#17

14 Feb 2022, 23:20

Dear sir,
My question concerns the same topic. I want to compare the well-being (consumption) of workers who switch employment with that of workers who remain in the same job. In the base period, everyone was in the same job. Later, in time period 1, some changed jobs and some remained in the same profession. In the next period (3), some others changed, and some who had changed earlier may have returned to their base-period profession. I want to study how changing jobs affects well-being, but I am confused about how to model this. Since treatment status is not constant over time, I can't use the event study regression. I saw your suggestion of the "generalized DID" and tried the specification you suggested: Yit = b + i.time##i.shift_job. Because everyone is in the same job in the base period, Stata drops the base period. Kindly suggest a way around this.
Clyde Schechter

Join Date: Apr 2014

Posts: 30121
#18

15 Feb 2022, 10:36

I cannot give you concrete advice without seeing example data, posted using the -dataex- command, and the actual Stata command(s) you tried (not the model equations).

If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
Kumari Gunjan

Join Date: Jun 2021

Posts: 18
#19

07 Mar 2022, 05:05

[CODE]
clear
input float log_ind_income long WAVE float switched
9.867005 0 0
10.32295 1 0
10.24228 2 0
9.942576 3 0
10.19565 4 0
10.204263 5 0
10.230142 6 0
9.752444 0 0
9.486429 2 1
9.774052 3 0
10.022263 4 0
9.615806 6 0
10.276204 0 0
10.072647 1 1
9.72383 4 1
10.238384 5 1
10.230142 6 1
9.525778 0 0
9.781432 0 0
. 1 1
7.383214 2 1
9.307158 3 1
10.068256 4 0
9.469057 5 0
. 6 1
9.116455 0 0
. 1 1
. 2 1
. 3 1
. 4 1
. 5 1
. 6 1
10.397182 0 0
. 1 0
. 4 0
9.889215 6 0
9.183381 0 0
8.714523 1 0
8.587616 2 1
9.007412 3 0
9.16603 4 0
9.088879 5 0
8.908385 6 0
10.910167 0 0
. 1 0
. 2 1
. 3 1
. 4 1
. 5 1
. 6 1
10.19075 0 0
9.467365 1 0
9.070631 2 1
9.024936 3 1
9.221911 4 1
8.864408 5 1
9.226839 6 1
9.768724 0 0
9.699026 5 0
9.824677 6 0
8.5866 0 0
. 1 0
8.654228 4 0
. 5 1
8.116597 0 0
. 1 0
. 0 0
10.1829 0 0
10.42932 1 0
10.795927 3 0
10.94434 4 0
10.803442 5 0
11.096308 6 0
9.999537 0 0
9.29857 0 0
8.595159 1 0
. 2 1
. 3 1
. 4 1
. 5 1
. 6 1
9.746426 0 0
9.062272 1 0
9.031853 2 1
9.457217 3 0
9.656569 4 0
9.745328 5 0
9.824677 6 0
10.336708 0 0
9.256521 2 1
9.79359 3 0
10.015662 5 0
9.755684 6 1
10.218925 0 0
8.630553 2 1
. 3 1
9.782558 5 0
9.196068 6 1
8.837342 0 0
8.792288 1 0
end
label values WAVE WAVE
label def WAVE 0 "Q3_19", modify
label def WAVE 1 "Q1_20", modify
label def WAVE 2 "Q2_20", modify
label def WAVE 3 "Q3_20", modify
label def WAVE 4 "Q1_21", modify
label def WAVE 5 "Q2_21", modify
label def WAVE 6 "Q3_21", modify
[/CODE]

Sir, this is my dataset. I have tried all the suggestions in the help file on regression with factor variables. However, the regression takes the last quarter, Q3_21, as the base for the interaction term. It would be a great help if you could suggest commands that solve this issue.

xtreg log_ind_income i.WAVE i.switched#ib(0).WAVE, fe
xtreg log_ind_income i.WAVE i.switched#i.WAVE, fe
xtreg log_ind_income i.WAVE ib(first).WAVE#i.ever_switched_informal, fe

. xtreg log_ind_income i.WAVE ib(first).WAVE#i.switched, fe
note: 6.WAVE#1.ever_switched omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =    113,118
Group variable: psid                            Number of groups  =     28,396

R-sq:                                           Obs per group:
     within  = 0.1392                                         min =          1
     between = 0.0083                                         avg =        4.0
     overall = 0.0610                                         max =          7

                                                F(12,84710)       =    1141.44
corr(u_i, Xb)  = 0.0025                         Prob > F          =     0.0000

---------------------------------------------------------------------------------
 log_ind_income |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
           WAVE |
          Q1_20 |    -.11381   .0102233   -11.13   0.000    -.1338476   -.0937724
          Q2_20 |  -.1931344   .0127943   -15.10   0.000    -.2182111   -.1680577
          Q3_20 |  -.1541956   .0101352   -15.21   0.000    -.1740605   -.1343307
          Q1_21 |  -.1137085   .0099043   -11.48   0.000    -.1331209   -.0942961
          Q2_21 |  -.1585569   .0102447   -15.48   0.000    -.1786365   -.1384774
          Q3_21 |  -.1407327   .0097173   -14.48   0.000    -.1597784   -.1216869
                |
 WAVE#switchedl |
        Q3_19#1 |   .4143967   .0136839    30.28   0.000     .3875764     .441217
        Q1_20#1 |   .0486465    .016018     3.04   0.002     .0172514    .0800416
        Q2_20#1 |   -.305196   .0178267   -17.12   0.000    -.3401361   -.2702558
        Q3_20#1 |  -.2506576   .0150629   -16.64   0.000    -.2801807   -.2211345
        Q1_21#1 |  -.2067041   .0151652   -13.63   0.000    -.2364278   -.1769804
        Q2_21#1 |  -.1847236   .0153852   -12.01   0.000    -.2148786   -.1545687
        Q3_21#1 |          0  (omitted)
                |
          _cons |   9.668221   .0081596  1184.88   0.000     9.652228    9.684214
----------------+----------------------------------------------------------------
        sigma_u |  .79623355
        sigma_e |  .64049582
            rho |  .60713852   (fraction of variance due to u_i)
---------------------------------------------------------------------------------
F test that all u_i=0: F(28395, 84710) = 4.74                Prob > F = 0.0000
Clyde Schechter

Join Date: Apr 2014

Posts: 30121
#20

07 Mar 2022, 12:23

Try

Code:

xtreg log_ind_income i.switched##ib0.WAVE, fe
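
A brief sketch of how this command might be used in context. It assumes the panel identifier is psid (as the output header above suggests) and that the treatment indicator is named switched; neither is confirmed by the example data. The ## operator expands to the main effects of both variables plus their interaction, and ib0. makes wave 0 (Q3_19) the reference level, so the interaction terms for the later waves are meant to be read as contrasts against Q3_19 rather than Q3_21. Stata may still omit a term for collinearity, for example if switched never equals 1 in the base wave. The -testparm- line is an optional addition, not part of the suggestion above.

```stata
* Sketch only: psid is assumed to be the panel identifier.
xtset psid WAVE

* ## = main effects + interaction; ib0. sets WAVE == 0 (Q3_19) as the base.
xtreg log_ind_income i.switched##ib0.WAVE, fe

* Optional: joint test that all the interaction coefficients are zero.
testparm i.switched#i.WAVE
```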
Ondrej Dvoulety

Join Date: Jul 2017

Posts: 23
#21

26 Oct 2022, 03:47

Dear Colleagues,
I am struggling to calculate, in my panel dataset, variables that reflect differences in means across time periods.
I have an ID variable (ICO), a year variable (ROK), and continuous outcome variables, and I need to calculate a new variable equal to the difference between the mean value over 2013-2018 and the mean value over 2007-2012.
I tried the following code, but it did not work well: the differences in the means of my outcome variable (ROA) come out missing because the two means are defined over different time windows. Any clue how to overcome this problem?
Thanks a lot for your help and suggestions,

Ondřej

bys ICO: egen ROA_1318 = mean(ROA) if inrange(ROK,2013,2018)
by ICO: egen ROA_1318 = mean(ROA) if ROK > 2012 & ROK < 2019
by ICO: egen ROA_0812 = mean(ROA) if ROK > 2007 & ROK < 2013
gen ROA_DIF = ROA_1318 - ROA_0812
egen ROA_DIF = rowtotal(ROA_1318 ROA_0812_m)
replace ROA_DIF = . if ROA_DIF==0

bys ICO: egen ROA_1318 = mean(ROA) if inrange(ROK,2013,2018)
bys ICO: egen ROA_0812 = mean(ROA) if inrange(ROK,2007,2012)
by ICO: gen ROA_DIF = ROA_1318 - ROA_0812

Example of my panel data structure (continuous variables missing)...

ROK Sektor ICO NAZEV CONTINUOUS VARIABLES .....
2003 2 205 Vojenské lesy a statky ČR, s.p.
2004 2 205 Vojenské lesy a statky ČR, s.p.
2005 2 205 Vojenské lesy a statky ČR, s.p.
2006 2 205 Vojenské lesy a statky ČR, s.p.
2007 2 205 Vojenské lesy a statky ČR, s.p.
2008 2 205 Vojenské lesy a statky ČR, s.p.
2009 2 205 Vojenské lesy a statky ČR, s.p.
2010 2 205 Vojenské lesy a statky ČR, s.p.
Clyde Schechter

Join Date: Apr 2014

Posts: 30121
#22

26 Oct 2022, 10:38

As best I can discern from your post, you have panel data. Consistent with the panel data layout, it probably makes more sense to create a variable distinguishing the two eras, 2007-2012 and 2013-2018, and then have a single variable giving the mean for each ICO in each era. It is then a simple matter to calculate another variable that gives the difference between the means. So the code for that would be:

Code:

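label define era 1 "2007-2012" 2 "2013-2018"
* Stop if any year lies outside the two eras
assert inrange(ROK, 2007, 2018)
gen byte era:era = cond(inrange(ROK, 2007, 2012), 1, 2)
* Mean ROA for each firm within each era
by ICO era (ROK), sort: egen mean_ROA = mean(ROA)
* Within each firm, sorted by era: the last row holds the later era's mean, the first row the earlier era's
* (the difference is 0 for a firm observed in only one era)
by ICO (era ROK): gen diff_ROA = mean_ROA[_N] - mean_ROA[1]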

I appreciate your good intentions in showing the example data you did. But does it really make sense to omit the continuous variables from a data example when you are specifically looking for help on coding a calculation involving precisely those variables? Also, to make example data easy to use for those who want to help you, you should use the -dataex- command. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

My code proposed above relies on certain assumptions about your data, such as that the year (ROK) variable is never missing and always falls between 2007 and 2018. It also assumes that your continuous variables are actually stored as numeric variables, not strings that read like numbers to human eyes but are non-numeric as far as Stata is concerned. It would not have been necessary to guess about these matters had -dataex- been used. Suffice it to say, the above code will break if these assumptions I have guessed are incorrect.
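
If those assumptions are in doubt, one possible variant is sketched below. It is not the code proposed above: the names period, mean_0712, mean_1318 and diff_ROA_alt are hypothetical, and the approach simply leaves years outside 2007-2018 unclassified instead of asserting they do not exist. It uses the -egen, mean()- idiom with -cond()- so that each era's mean is written to every row of the firm, which avoids the missing values produced by putting an -if- qualifier on -egen-.

```stata
* Sketch only: check the storage-type assumption before doing anything else.
confirm numeric variable ROK ROA

* period stays missing for years outside 2007-2018 (hypothetical variable name).
gen byte period = 1 if inrange(ROK, 2007, 2012)
replace period = 2 if inrange(ROK, 2013, 2018)

* Era-specific means, filled in on every row of each firm.
by ICO, sort: egen mean_0712 = mean(cond(period == 1, ROA, .))
by ICO: egen mean_1318 = mean(cond(period == 2, ROA, .))

* Missing when a firm has no usable ROA values in one of the eras.
gen diff_ROA_alt = mean_1318 - mean_0712
```

Under this variant a firm observed in only one era gets a missing difference rather than a zero, which may or may not be what is wanted.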