
  • Choice of control variables in a cross-sectional difference-in-differences regression

    Dear all,
    I am investigating the effect of the X policy, passed in 2005, on women's employment outcomes (1 if employed in the last 5 years, 0 otherwise). I am using repeated cross-sectional data collected before the program (1995 and 2000) and after it (2005 and 2016). Because I observe different women in each survey year, the years of birth differ across surveys. For example, women born in 1977 appear in the pre-policy surveys (1995 and 2000), but none appear in the post-policy surveys (2005 and 2016), so the count for that birth year is zero there. Is it possible to include year of birth as a control variable? My worry is that only a few birth years are common across surveys. More generally, is it possible to control for variables like this when working with cross-sectional data?

    Thank you so much for your answers in advance!

  • #2
    Tariku, I think the year of birth needs to be correctly defined. For each woman, regardless of which wave of data she comes from, the "year of birth" variable should store her birth year. The same holds for every woman in the post periods, so it should contain actual birth years rather than zero. In your case, I would control for survey year dummies and birth year dummies (equivalent to controlling for survey year dummies and age dummies). If you think including birth year dummies is too demanding, you may instead control for birth year group dummies, say treating every five birth years as one cohort, or for a linear trend in birth year (not recommended, though).
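The five-year cohort grouping suggested above can be sketched as follows. This is only an illustration in Python rather than the software used in the thread; the function name `birth_cohort`, the bin width, and the sample birth years are assumptions made for the example, not something taken from the data.

```python
def birth_cohort(year_of_birth, width=5):
    """Return the first year of the `width`-year cohort containing year_of_birth."""
    return (year_of_birth // width) * width

# Hypothetical birth years, as they might appear across different survey waves
births = [1977, 1979, 1980, 1984, 1992, 1996]
cohorts = [birth_cohort(y) for y in births]

# One indicator per cohort, dropping the earliest cohort as the reference category
levels = sorted(set(cohorts))
dummies = [
    {f"cohort_{c}": int(birth_cohort(y) == c) for c in levels[1:]}
    for y in births
]

for y, c in zip(births, cohorts):
    print(y, "->", c)
```

Grouping this way trades flexibility for overlap: a cohort bin is more likely than a single birth year to contain women from more than one survey wave.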


    • #3
      Dear Wang,
      Thank you so much. I think I was not clear in my previous post. I do have each woman's year of birth in every survey round (both pre and post). My question is whether the same birth years should be observed across surveys. For example, should I see 1992 in both the pre- and post-intervention periods? When I run "tab year_of_birth wave", I can see that some birth years have zero observations in the most recent survey and others have zero observations in earlier rounds. That is because I observe different women in each round, so the birth years need not be the same across surveys. See a toy example below.
        year of |             wave
          birth |    2005     2011     2016 |   Total
      ----------+---------------------------+--------
           1992 |     209        0        0 |     209
           1993 |   2,121        0        0 |   2,121
           1994 |   1,967        0        0 |   1,967
           1995 |   1,870        0        0 |   1,870
           1996 |   1,901        0        0 |   1,901
           1997 |   1,793        0        0 |   1,793
           1998 |       0      946        0 |     946
           1999 |       0    2,396        0 |   2,396
           2000 |       0    2,547        0 |   2,547
           2001 |       0    2,142        0 |   2,142
           2002 |       0    2,262        0 |   2,262
           2003 |       0    1,361      682 |   2,043
      Last edited by Tariku Getaneh; 11 Nov 2021, 12:38.



      • #4
        Sorry if my previous tables were unclear.
        Last edited by Tariku Getaneh; 11 Nov 2021, 12:41.



        • #5
          Tariku, I see your point now. It is not unusual for birth years to have narrow common support, or even to fail to overlap, when the survey years are far apart. The procedure remains unchanged: birth year still needs to be adequately controlled for.
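To see how narrow the common support is, one can list the birth years that are observed in more than one wave. The sketch below does this in Python for the counts in the toy tabulation from post #3; the variable names are illustrative, and the same check could be done in any software.

```python
from collections import defaultdict

# (year_of_birth, wave) -> number of women, copied from the toy tabulation in post #3
counts = {
    (1992, 2005): 209, (1993, 2005): 2121, (1994, 2005): 1967,
    (1995, 2005): 1870, (1996, 2005): 1901, (1997, 2005): 1793,
    (1998, 2011): 946, (1999, 2011): 2396, (2000, 2011): 2547,
    (2001, 2011): 2142, (2002, 2011): 2262, (2003, 2011): 1361,
    (2003, 2016): 682,
}

# Collect the set of waves in which each birth year has at least one observation
waves_by_birth_year = defaultdict(set)
for (birth_year, wave), n in counts.items():
    if n > 0:
        waves_by_birth_year[birth_year].add(wave)

# Birth years observed in at least two waves form the common support
overlap = sorted(y for y, w in waves_by_birth_year.items() if len(w) >= 2)
print(overlap)  # [2003]
```

In the toy example only 2003 appears in two waves, which is the kind of narrow overlap described above.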



          • #6
            Dear Wang, thank you. For now I control for birth year dummies rather than a linear trend in birth year (it looks to me that the SEs rise when I do so).



            • #7
              Originally posted by Tariku Getaneh
              Dear Wang, thank you. For now I control for birth year dummies rather than a linear trend in birth year (it looks to me that the SEs rise when I do so).
              Wise choice. The rise in SE is an inevitable cost of the more flexible specification.
