  • csdid only omitted variables

    Hello!

    I have some issues with the csdid command. I am trying to use it with my dataset, but all I get is the message that all coefficients are zero (omitted), and the number of observations is 0 as well.
    My dataset is an unbalanced panel with a variable for the year, a personal identifier, and income (used as the outcome variable). My treatment is the year of birth of the child. Additional variables are sex, married and so on.
    Not every individual is observed every year.
    My dataset does not contain any missing values (".").

    I tried using: csdid gross_income sex, cluster(pid) time(syear) gvar(birthyearchild) method(reg)
    birthyearchild is defined as:
    gen birthyearchild = 0
    replace birthyearchild = syear if birth==1

    I am quite new to this estimation technique so there might be some mistakes, but I cannot find them.
    Any help or ideas on how I can improve the estimation would be greatly appreciated.

    Thanks in advance!

  • #2
    First things first:
    tabulate year gvar

    • #3
      The result looks something like this, with birthyear ranging from 1999 to 2020 and surveyyear ranging from 1994 to 2020:

           |                 birthyearchild
      year |       0    1999    2000    2001    2002    2003
      -----+------------------------------------------------
      1998 |    9000       0       0       0       0       0
      1999 |    9000     400       0       0       0       0
      2000 |   14000       0     600       0       0       0
      2001 |   14000       0       0     700       0       0
      2002 |   15000       0       0       0     500       0
      2003 |   15000       0       0       0       0     550

      The years before 1998 look the same, and the pattern continues after 2003.
      In total, there are a little over 400,000 observations.

      • #4
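        That's your problem.
        If you look at the example in the help file, you will see how the tabulation of gvar by year should look.
        Specifically, gvar should be constant within each unit: a unit first treated in 2000 should carry gvar = 2000 in every year it is observed. In your table, the units treated in 2000 do not appear in that column before (or after) that year.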
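        A minimal sketch of that fix, assuming birth == 1 marks the survey year in which a child is born (as in post #1). The names first_birth and gvar are hypothetical, and people with no recorded birth keep gvar = 0:

        ```stata
        * Earliest survey year with a birth, copied to every row of the same person
        bysort pid: egen first_birth = min(cond(birth == 1, syear, .))

        * gvar is constant within person; 0 marks people who are never treated
        generate gvar = cond(missing(first_birth), 0, first_birth)

        * Each nonzero gvar column should now show counts before and after the treatment year
        tabulate syear gvar

        * Same call as in post #1, with the corrected gvar
        csdid gross_income sex, cluster(pid) time(syear) gvar(gvar) method(reg)
        ```

        With panel data, ivar(pid) is the option that tells csdid the same people are followed over time. Whether that suits an unbalanced panel like this one is worth checking in the help file.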
        Hope this helps.

        • #5
          Hi.

          I am facing the same issue with csdid using panel data. I am running the command: csdid turnout_p, ivar(voting_area_id) time(year) gvar(gvar_zero) notyet

          where turnout_p is the turnout percentage in each voting area, voting_area_id is the unique id (string) of each voting area I observe in my data, year is the panel time variable (I only have three different years), and gvar_zero is the year in which each voting area was first treated.

          My tabulation looks like this (I think it looks similar to the one in the csdid help file):

          . tab gvar_zero year

                     |               year
           gvar_zero |      2012       2017       2021 |     Total
          -----------+---------------------------------+----------
                   0 |        13         13         13 |        39
                2012 |       226        226        226 |       678
                2017 |        22         22         22 |        66
                2021 |         8          8          8 |        24
          -----------+---------------------------------+----------
               Total |       269        269        269 |       807

          Even if I restrict gvar_zero so that it takes only the values 2017 or 2021, the same problem occurs.

          I am new to this command so basic mistakes are highly possible. Thanks in advance!

          • #6
            Hi Jesse,
            Your problem is different.
            csdid expects the time periods to be equally spaced, because it uses the gap between periods to determine which period serves as G-1 (the base period).
            In your case, you cannot estimate the effect for the 2012 group (there is no pre-treatment period).
            For 2017, you could use the 2012 data (5 years earlier), but for 2021 you would have to use 2017 (the previous period), which is only 4 years earlier.
            This creates a conflict, which explains your problem.
            F

            • #7
              Oh, now I see. In my case the gaps between the periods are almost equal; the dates are recorded only as years for simplicity. I changed how the periods are coded in my data, and this fixed the problem. Thank you very much!
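              A sketch of that recoding, assuming the three elections can reasonably be treated as equally spaced periods. The names period, gvar_period and area_num are hypothetical, and encoding the string id is only a precaution in case ivar() expects a numeric variable:

              ```stata
              * Consecutive period index: 2012 -> 1, 2017 -> 2, 2021 -> 3
              egen period = group(year)

              * Put the first-treatment variable on the same scale; 0 still means never treated
              generate gvar_period = 0
              replace gvar_period = 1 if gvar_zero == 2012
              replace gvar_period = 2 if gvar_zero == 2017
              replace gvar_period = 3 if gvar_zero == 2021

              * Numeric panel id built from the string id (an assumption, see above)
              egen area_num = group(voting_area_id)

              csdid turnout_p, ivar(area_num) time(period) gvar(gvar_period) notyet
              ```

              As noted in #6, the group first treated in 2012 still has no pre-treatment period, so no effect can be estimated for it.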

              • #8
                Hello!
                I ran into a similar problem when I run the command "quietly csdid lnqty $X $Z i.mmodel_dum i.city_dum i.year_month, time(ymonth) gvar(trtmonth) notyet".
                My research topic is how the opening of a subway in a city affects vehicle sales. The data are at the year-month-city-vehicle level. lnqty is the sales of each vehicle model in a city in each time period. $X and $Z are vehicle attributes and city characteristics, respectively. i.mmodel_dum, i.city_dum and i.year_month are vehicle model, city and year-month fixed effects. ymonth is the %tm-format time variable indicating the year and month. trtmonth is the month in which each city is treated. In the sample, there are no cities that are never treated.
                Not every vehicle model is observed in each time period.
                Part of the "tab ymonth trtmonth" output is shown below (the whole table is too large, since the time span runs from 2013m1 to 2015m12).
                        |                           trtmonth
                ymonth  |   2013m5   2013m6   2013m8   2013m9  2013m12   2014m4    Total
                --------+---------------------------------------------------------------
                2013m1  |    6,062    2,144    3,882    3,542    7,262    1,903   38,946
                2013m2  |    5,269    1,862    3,171    2,850    6,127    1,548   31,925
                2013m3  |    5,266    1,948    3,661    3,121    6,556    1,644   34,112
                2013m4  |    5,183    1,964    3,739    3,275    6,581    1,691   34,553
                2013m5  |    5,293    1,993    3,763    3,437    6,897    1,665   35,644
                2013m6  |    5,083    1,941    3,619    2,948    6,551    1,686   33,847
                2013m7  |    5,391    1,949    3,765    3,371    6,828    1,644   35,395
                2013m8  |    5,392    1,973    3,726    3,347    6,766    1,705   35,182
                2013m9  |    5,501    2,088    3,741    3,339    6,899    1,725   36,168
                2013m10 |    5,436    2,035    3,758    3,309    6,737    1,664   35,368
                2013m11 |    5,636    2,065    3,774    3,340    6,949    1,755   36,429
                2013m12 |    5,603    2,116    3,745    3,210    7,215    1,863   36,945
                2014m1  |    6,172    2,278    4,088    3,782    7,836    1,976   40,911
                2014m2  |    5,293    2,021    2,783    2,936    6,372    1,725   32,394
                2014m3  |    5,658    2,132    3,150    3,347    7,021    1,750   35,777
                2014m4  |    5,924    2,273    3,479    3,686    7,452    1,797   37,320
                2014m5  |    6,036    2,269    3,285    3,679    7,648    1,854   38,684
                2014m6  |    6,015    2,305    3,395    3,615    7,526    1,900   37,850
                2014m7  |    6,235    2,271    3,589    3,656    7,536    1,843   38,317
                2014m8  |    5,870    2,193    3,557    3,571    7,355    1,774   37,050
                2014m9  |    6,171    2,472    3,663    3,781    7,827    1,941   40,196
                2014m10 |    6,011    2,258    3,578    3,663    7,440    1,837   38,206
                2014m11 |    6,201    2,231    3,763    3,592    7,629    1,869   39,007
                2014m12 |    6,292    2,292    3,644    3,485    8,068    2,021   39,873

                The command runs without error, except that every ATT is reported as zero (omitted).
                Thanks in advance.

                • #9
                  You cannot add year-month fixed effects to the model. That is what is causing the problem.
                  And if you have repeated cross-sections, I suggest you move to csdid2 (from my site). csdid will do the job, but it may take some time to finish.

                  • #10
                    Thank you very much for the suggestion! I will see if I can revise the specification.
                    Thanks again, too, for providing the csdid and csdid2 commands as a public good.

                    • #11
                      Originally posted by Yinxin Fei (#8 above)
                      Sorry, I have another related question. In the same setting, can I add year fixed effects, month fixed effects or city fixed effects? I tried replacing the year-month fixed effects with separate year and month fixed effects, so that there are now four sets of fixed effects: car model, city, year and month. I still get the omitted result.

                      • #12
                        You can only add covariates that are observed across all treated groups. If a covariate is missing from even one treated group, or is not observed in every calendar period, then you probably shouldn't add that control to the model.
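                        One way to check this before estimating, sketched with the variable names from post #8: cross-tabulate each candidate control against the treatment groups and look for empty cells. Dropping every fixed-effect dummy as a starting point is an assumption here, not a tested specification:

                        ```stata
                        * A zero cell means that level of the control never occurs in that treatment group
                        tabulate city_dum trtmonth
                        tabulate mmodel_dum trtmonth

                        * Starting point without any fixed-effect dummies
                        csdid lnqty $X $Z, time(ymonth) gvar(trtmonth) notyet
                        ```

                        Since each city is treated in exactly one month, every city dummy will show empty cells in all other treatment groups, which is consistent with the omitted results reported above.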

                        • #13
                          Got it! Thank you very much!
