Creating time to treat period for event study analysis

Jenna Kerry

Join Date: Jan 2023

Posts: 44
#1

07 Sep 2023, 05:33

Hello,

I am having an issue with my data. To explain it in simple language, I have created a simple example data set here:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Family_id Year had_new_kid)
1 2000    .
2 2000    .
3 2000    .
4 2000 2000
5 2000    .
1 2001 2001
2 2001    .
3 2001    .
4 2001    .
5 2001    .
1 2002    .
2 2002    .
3 2002    .
4 2002    .
5 2002 2002
end

This is a panel data set of five families observed from 2000 to 2002. The variable had_new_kid records the year in which a family has its first kid. I want to run an event study analysis, for which I need leads and lags around the year of the first kid. I therefore need to create a time-to-treat variable, like this:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Family_id Year had_new_kid time_to_treat)
1 2000    . 2001
2 2000    .    .
3 2000    .    .
4 2000 2000 2000
5 2000    . 2002
1 2001 2001 2001
2 2001    .    .
3 2001    .    .
4 2001    . 2000
5 2001    . 2002
1 2002    . 2001
2 2002    .    .
3 2002    .    .
4 2002    . 2000
5 2002 2002 2002
end

Can anyone please help me with the code for this? Also, what should I do with the families (like families 2 & 3) for whom the event of having kids never occurs?

Thank you!

Andrew Musau

Join Date: Oct 2014
Posts: 10274

07 Sep 2023, 07:13

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Family_id Year had_new_kid)
1 2000    .
2 2000    .
3 2000    .
4 2000 2000
5 2000    .
1 2001 2001
2 2001    .
3 2001    .
4 2001    .
5 2001    .
1 2002    .
2 2002    .
3 2002    .
4 2002    .
5 2002 2002
end

bys Family_id (had_new_kid): gen wanted= had_new_kid[1]

Res.:

Code:

. sort Year Family_id

. l, sep(0)

     +-------------------------------------+
     | Family~d   Year   had_ne~d   wanted |
     |-------------------------------------|
  1. |        1   2000          .     2001 |
  2. |        2   2000          .        . |
  3. |        3   2000          .        . |
  4. |        4   2000       2000     2000 |
  5. |        5   2000          .     2002 |
  6. |        1   2001       2001     2001 |
  7. |        2   2001          .        . |
  8. |        3   2001          .        . |
  9. |        4   2001          .     2000 |
 10. |        5   2001          .     2002 |
 11. |        1   2002          .     2001 |
 12. |        2   2002          .        . |
 13. |        3   2002          .        . |
 14. |        4   2002          .     2000 |
 15. |        5   2002       2002     2002 |
     +-------------------------------------+

.
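The variable wanted holds the calendar year of the first kid. Event-study leads and lags are usually defined on time relative to that year, so a possible next step is sketched below. It assumes wanted was created as above; the names rel_time and ever_treated are hypothetical and do not come from the original post.

```stata
* Sketch only: assumes -wanted- (year of first kid) exists as created above.
* Relative event time: negative = years before the first kid, 0 = event year,
* positive = years after. rel_time and ever_treated are hypothetical names.
gen rel_time = Year - wanted

* Families with no recorded kid have missing -wanted-, so rel_time is missing too.
gen byte ever_treated = !missing(wanted)

list Family_id Year wanted rel_time ever_treated, sepby(Year)
```

Lead and lag indicators for the regression could then be built from rel_time; how to handle families with missing rel_time depends on the design.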

Also, what should I do with the families (like families 2 & 3) for whom the event of having kids never occurs.

I am not sure. What is your outcome variable? If it is time to having kids, then such observations may be treated as right-censored as some will eventually have kids (you just have not observed this event). You may want to check out what other studies have done in the past.

Last edited by Andrew Musau; 07 Sep 2023, 07:46.


Jenna Kerry

Join Date: Jan 2023

Posts: 44
#3

07 Sep 2023, 17:37

Thank you so much for the code! It worked perfectly! I really appreciate the help.

My outcome variable is the family income, and I want to see how it gets affected before and after having a kid. There are some families in my data set, like families 2 & 3, who never have kids throughout the period. Therefore, I wondered if I should drop these families from my data set.

Thanks!
Jenna Kerry

Join Date: Jan 2023

Posts: 44
#4

07 Sep 2023, 18:20

Dear Andrew Musau,

I might sound silly, but for future reference, could you please explain why you put 1 in the square brackets in your command?

Here,

Code:

bys Family_id (had_new_kid): gen wanted= had_new_kid[1]

Thank you!
Andrew Musau

Join Date: Oct 2014

Posts: 10274
#5

08 Sep 2023, 03:17

My outcome variable is the family income, and I want to see how it gets affected before and after having a kid. There are some families in my data set, like families 2 & 3, who never have kids throughout the period.

If you were looking at a count, e.g., number of kids, you could include families with no kids. But since you state that the variable is "time to first kid", then there is no way to calculate this value for families without kids. You can drop them, but do mention that you have done so and include both the frequency and percentage of such families.
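If the families are dropped, the frequency and percentage could be recorded first along these lines. This is only a sketch using the variable names from the example data; never_treated and one_per_family are hypothetical names introduced here.

```stata
* Sketch: flag families with no recorded kid anywhere in the sample window.
* Missing values sort last, so a missing first value means all values are missing.
bysort Family_id (had_new_kid): gen byte never_treated = missing(had_new_kid[1])

* Tag one row per family so that each family is counted once, not once per year.
egen byte one_per_family = tag(Family_id)

* Frequency and percentage of never-treated families.
tabulate never_treated if one_per_family

* Drop them only after the counts have been noted.
drop if never_treated
```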

could you please explain why you put 1 in the third bracket in your command?

The following reads like this:

bys Family_id (had_new_kid): gen wanted= had_new_kid[1]

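Construct groups of "Family_id" and, within each group, sort by "had_new_kid". As "had_new_kid" is a numeric variable, Stata sorts from smallest to largest, and missing values (.) sort last because Stata treats them as larger than any number. So with the command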

gen wanted= had_new_kid[1]

I am instructing Stata, for each Family_id group, to pick the first sorted value of "had_new_kid", which is referenced as "had_new_kid[1]". The second sorted value is "had_new_kid[2]" and the last sorted value is "had_new_kid[_N]". This guarantees that I get the first time the family had a kid, as this is the earliest year in which a kid in the family was born.
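To see the subscripts side by side, the same by-group can be used with [1], [2] and [_N]. This is an illustrative sketch on the example data; first_val, second_val and last_val are hypothetical names.

```stata
* Sketch: within each family, sorted by had_new_kid (missing values sort last).
bysort Family_id (had_new_kid): gen first_val  = had_new_kid[1]   // smallest value
bysort Family_id (had_new_kid): gen second_val = had_new_kid[2]   // second smallest
bysort Family_id (had_new_kid): gen last_val   = had_new_kid[_N]  // last sorted value

sort Family_id Year
list Family_id Year had_new_kid first_val second_val last_val, sepby(Family_id)
```

Because missing values sort after every number, had_new_kid[_N] is missing for any family with at least one missing row, which is why [1] rather than [_N] is the subscript that picks up the year.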
Jenna Kerry

Join Date: Jan 2023

Posts: 44
#6

08 Sep 2023, 05:54

Thank you so much for your help, Andrew Musau!
Jenna Kerry

Join Date: Jan 2023

Posts: 44
#7

07 Oct 2023, 04:29

Hello!

I am having another problem here! Please have a look at the data set below; notice that Family_id 4 had a kid again in 2003. Can the previous code still generate the year when they had their first kid, which would still be 2000? I tried the code mentioned above on this data and it worked! However, I still need to identify families like family 4 here, which had kids more than once.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(Family_id Year had_new_kid)
1 2000    .
2 2000    .
3 2000    .
4 2000 2000
5 2000    .
1 2001 2001
2 2001    .
3 2001    .
4 2001    .
5 2001    .
1 2002    .
2 2002    .
3 2002    .
4 2002    .
5 2002 2002
1 2003    .
2 2003    .
3 2003    .
4 2003 2003
5 2003    .
end

Could anyone please help?

Thank you so much!
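One possible way to flag such families is sketched below. It assumes had_new_kid holds the year in every row where a kid arrives and is missing otherwise; n_kids, multiple_kids and first_kid_year are hypothetical names.

```stata
* Sketch: count the nonmissing had_new_kid entries within each family.
bysort Family_id: egen n_kids = count(had_new_kid)

* Flag families with more than one recorded kid (family 4 in the example).
gen byte multiple_kids = n_kids > 1

* The year of the first kid is unaffected by later ones, because the
* smallest year still sorts first within each family.
bysort Family_id (had_new_kid): gen first_kid_year = had_new_kid[1]

sort Family_id Year
list Family_id Year had_new_kid n_kids multiple_kids first_kid_year, sepby(Family_id)
```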