  • DID Callaway and Sant'Anna not working

    Hello Stata users,

    This is my first time doing DiD and my first time applying the Callaway & Sant'Anna (2021) command. I'm trying to estimate the effect of a reform (with a staggered rollout) on school enrollment using pooled cross-sections and a linear probability model. After reading all the instructions on how to use the command properly, I obtain the following output (see the photo). I don't know whether this is because I'm defining the gvar wrongly. I would also like to know if you see any other mistakes in my econometric model. Thank you so much.

    csdid cschoolenroll allcontrols i.region i.age i.year, time(year) gvar(first_treated) method(drimp) wboot rseed(1)

    Note: there are two treatment groups (plus the never-treated): one gets treated in 2005 and the other in 2011.

    [Attached image: image_27646.png]
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 hhid float girlid byte region float(year first_treated)
    "       25269" 5555 12 2011 2011
    "       25642" 5560 12 2011 2011
    "       25508" 5557 12 2011 2011
    "       25521" 5558 12 2011 2011
    "       25573" 5559 12 2011 2011
    "       25 31" 5554 12 2011 2011
    "       25370" 5556 12 2011 2011
    "      454 10" 7505 12 2011 2011
    "      447540" 7488 12 2011 2011
    "      447572" 7489 12 2011 2011
    "      447605" 7490 12 2011 2011
    "      470543" 7578 12 2011 2011
    "      470543" 7579 12 2011 2011
    "      158371" 6155 12 2011 2011
    "      158144" 6153 12 2011 2011
    "      158144" 6152 12 2011 2011
    "      158407" 6156 12 2011 2011
    "      158232" 6154 12 2011 2011
    "      158 33" 6151 12 2011 2011
    "      399  5" 1968 12 2000 2011
    "      399  1" 1967 12 2000 2011
    "      399 35" 1969 12 2000 2011
    "      401  1" 1976 12 2000 2011
    "      401 12" 1977 12 2000 2011
    "      394 10" 1952 12 2000 2011
    "      394 30" 1953 12 2000 2011
    "      394  4" 1949 12 2000 2011
    "      394  6" 1951 12 2000 2011
    "      394  4" 1950 12 2000 2011
    "      161391" 6166 12 2011 2011
    "      161391" 6167 12 2011 2011
    "      161 42" 6164 12 2011 2011
    "      161108" 6165 12 2011 2011
    "      161822" 6168 12 2011 2011
    "      401175" 7266 12 2011 2011
    "      401629" 7267 12 2011 2011
    "      410268" 4821  2 2005    0
    "      410268" 4820  2 2005    0
    "      410679" 4823  2 2005    0
    "      410460" 4822  2 2005    0
    "      547837" 7915 12 2011 2011
    "      547321" 7912 12 2011 2011
    "      547837" 7916 12 2011 2011
    "      547555" 7913 12 2011 2011
    "      547571" 7914 12 2011 2011
    "      397 65" 1962 12 2000 2011
    "      397 68" 1963 12 2000 2011
    "      492674" 7681 12 2011 2011
    "      492567" 7680 12 2011 2011
    "      492674" 7682 12 2011 2011
    "      492519" 7679 12 2011 2011
    "      492853" 7683 12 2011 2011
    "      391 72" 1933 12 2000 2011
    "       59419" 3122  2 2005    0
    "       59419" 3121  2 2005    0
    "       59103" 3120  2 2005    0
    "      389 19" 1927 12 2000 2011
    "      389 39" 1929 12 2000 2011
    "      389 95" 1932 12 2000 2011
    "      389 67" 1930 12 2000 2011
    "      389 39" 1928 12 2000 2011
    "      389 91" 1931 12 2000 2011
    "      398  5" 1964 12 2000 2011
    "      398 69" 1966 12 2000 2011
    "      398 53" 1965 12 2000 2011
    "       63792" 5743 12 2011 2011
    "       63272" 5733 12 2011 2011
    "       63693" 5738 12 2011 2011
    "       63272" 5732 12 2011 2011
    "       63609" 5735 12 2011 2011
    "       63272" 5731 12 2011 2011
    "       63770" 5741 12 2011 2011
    "       63770" 5740 12 2011 2011
    "       63 20" 5730 12 2011 2011
    "       63792" 5742 12 2011 2011
    "       63363" 5734 12 2011 2011
    "       63675" 5737 12 2011 2011
    "       63636" 5736 12 2011 2011
    "       63770" 5739 12 2011 2011
    "      392 10" 1934 12 2000 2011
    "      392 28" 1938 12 2000 2011
    "      392 43" 1940 12 2000 2011
    "      392 19" 1936 12 2000 2011
    "      392 25" 1937 12 2000 2011
    "      392 28" 1939 12 2000 2011
    "      392 13" 1935 12 2000 2011
    "      403 65" 1981 12 2000 2011
    "      403 71" 1982 12 2000 2011
    "      315291" 4314 12 2005 2011
    "      315600" 4315 12 2005 2011
    "      315169" 4313 12 2005 2011
    "      315746" 4316 12 2005 2011
    "      645431" 8367 15 2011 2005
    "      645569" 8368 15 2011 2005
    "      385733" 4706  2 2005    0
    "      385590" 4705  2 2005    0
    "      385539" 4704  2 2005    0
    "      385422" 4703  2 2005    0
    "      385304" 4702  2 2005    0
    "      386 13" 1919 12 2000 2011
    end
    label values region hv024
    label def hv024 2 "affar", modify
    label def hv024 12 "gambela", modify
    label def hv024 15 "dire dawa", modify
    Last edited by Daniel Perez Parra; 13 Jun 2022, 04:46.

  • #2
    Well, I have just realized that the command works once I drop the time-varying controls (such as household wealth). Also, some time-varying controls significantly reduce the number of observations, though I don't understand why. Any ideas?
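One possible reason for the shrinking sample, offered as a sketch rather than a diagnosis: Stata estimation commands generally drop any observation with a missing value in any model variable, so a control that was not recorded in every survey round would remove those rows. The toy values below are hypothetical; only the variable names (cschoolenroll, wealthindex, year) come from the thread.

```stata
* Toy data (hypothetical values): wealthindex is missing in some rows
clear
input byte cschoolenroll float wealthindex int year
1 0.5 2000
0 .   2000
1 1.2 2005
0 .   2005
1 0.3 2011
end

* Flag rows where the control is missing and tabulate by survey year
generate byte miss_wealth = missing(wealthindex)
tabulate year miss_wealth

* Rows that remain usable once wealthindex enters the covariate list
count if !missing(cschoolenroll, wealthindex)
```

Running the same two checks on the real data, one control at a time, should show which covariate is responsible for the lost observations.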



    • #3
      Don't put your output in a picture. We can't see it.


      Do this. See how it's much easier to read now? Post the output you got back in code delimiters, no screenshots, no pictures.



      Edit: do you literally have data for only two years?



      • #4
        Thanks for your response, Jared. I hope you can see it better now.

        Since there are two treatment groups (2005 and 2011), I have included data from 2000, 2005, and 2011 (all the available years in that period). However, I have just realized that if I use the never-treated as the comparison group in CS, I probably also need to include 2016. Sadly, I have data for only two periods (three if I include 2016), because this methodology wouldn't take 2000 into account (am I right?).


        Code:
        Difference-in-difference with Multiple Time Periods
        
                                                        Number of obs     =          0
        Outcome model: weighted least squares
        Treatment model: inverse probability tilting
                                        (Std. Err. adjusted for 11 clusters in region)
        ------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        g2005        |
         t_2000_2005 |          0  (omitted)
         t_2000_2011 |          0  (omitted)
        -------------+----------------------------------------------------------------
        g2011        |
         t_2000_2005 |          0  (omitted)
         t_2005_2011 |          0  (omitted)
        ------------------------------------------------------------------------------
        Control: Never Treated
        
        See Callaway and Sant'Anna (2021) for details
        Last edited by Daniel Perez Parra; 14 Jun 2022, 05:49.



        • #5
          MUCH MUCH MUCH better. Can you post the code you used to produce this table, too? Perhaps the author, FernandoRios, will have insights I don't have, although I do have a few ideas.



          • #6
            Great. Thanks a lot for the advice, I'll take it into account for subsequent posts.

            Code:
            csdid cschoolenroll hhsize wealthindex childdependencyratio b2.relhhheadstatus sexhhhead agehhhead educhhhead b1.religionstatus b2.ethnicitystatus i.region i.year i.age, time(year) gvar(first_treated) method(drimp) cluster(region) rseed(1)
            Here is the code.



            • #7
              Hi Daniel,
              A few questions about your results:
              1) How many observations do you have?
              2) How many do you have per cohort and year? (You can see this using -tab year gvar-.)

              My best guess is that you are overfitting the model. Specifically, by using "i.region i.year i.age", you are probably introducing far more controls than you have observations per year/cohort, which violates the overlap assumption.

              Also, you DO NOT need to add "i.year", because you are already declaring year in "time". And you can use i.region only if each region has both untreated units and treated units across all cohorts.
              HTH
              Fernando
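The cell-count check suggested above can be sketched as follows. The toy rows are hypothetical; year and first_treated are the variable names used earlier in the thread, and cell_n is a helper name introduced only for this illustration.

```stata
* Toy data (hypothetical): one row per girl, with survey year and cohort
clear
input int year int first_treated
2000 0
2000 2005
2000 2005
2000 2011
2005 0
2005 2005
2005 2011
2011 0
2011 2005
2011 2011
2011 2011
end

* Cohort-by-year cell counts (the -tab year gvar- check)
tabulate year first_treated

* Size of the smallest and largest cohort-by-year cell
bysort year first_treated: generate long cell_n = _N
summarize cell_n
display "smallest cell: " r(min)
```

Comparing the smallest cell with the number of parameters implied by the factor variables in the covariate list gives a rough sense of whether the model is overfitted.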



              • #8
                Hi Fernando,

                Thank you so much for your comments. I have 8,393 obs (cross-sectional survey identification = year, treatment code = gvar):

                Code:
                cross-sect |
                     ional |
                    survey |
                identifica |         treatment code
                      tion |         0       2005       2011 |     Total
                -----------+---------------------------------+----------
                      2000 |       305      1,829        660 |     2,794
                      2005 |       300      1,665        680 |     2,645
                      2011 |       400      1,592        962 |     2,954
                -----------+---------------------------------+----------
                     Total |     1,005      5,086      2,302 |     8,393
                Great, I suspected that i.year didn't make sense, since year is already included in the command. I'm actually comparing regions that received the treatment with regions that did not (or not yet) across time (using three surveys, in 2000, 2005, and 2011), so the early-treated, late-treated, and never-treated cohorts are region-specific. I don't know whether that means I can't use region FE.

                Thanks!



                • #9
                  Precisely. You can't use region-specific fixed effects, because that violates the overlap assumption.
                  If you think about the "logit" model run behind csdid, region FE will fully explain why some units are "treated" and some are not.

                  Given your limited data, perhaps it would be better to think of this as a set of separate DiDs. CSDID will not give you as much as you want, because the gaps in time are irregular.
                  F
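As a rough illustration of the "set of separate DiDs" idea, one 2x2 comparison (the 2005 cohort against the never-treated, 2000 versus 2005) could be sketched like this. It uses plain regress with a linear probability outcome model, not csdid; the toy values and the names enroll, treat, and post are hypothetical.

```stata
* Toy repeated cross-section (hypothetical values)
clear
input byte enroll int year int first_treated
0 2000 0
1 2000 0
0 2005 0
1 2005 0
0 2000 2005
1 2000 2005
1 2005 2005
1 2005 2005
end

* One 2x2 comparison: 2005 cohort vs never-treated, 2000 vs 2005
keep if inlist(first_treated, 0, 2005) & inlist(year, 2000, 2005)
generate byte treat = (first_treated == 2005)
generate byte post  = (year >= 2005)

* Linear probability model; the interaction term is the DiD estimate
regress enroll i.treat##i.post, vce(robust)
display "DiD estimate: " _b[1.treat#1.post]
```

With real data the standard errors would more plausibly be clustered at the level of treatment assignment, and with only a handful of regions that inference deserves caution.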



                  • #10
                    Okay. I understand your point perfectly. This is very helpful.

                    By the way, I was planning to use an LPM instead of a logit. Would that change anything?



                    • #11
                      CSDID doesn't have an option for LPM.
                      You could create your own estimator based on an LPM, but the same principle stands: you need to have both treated and untreated units for any set of values in your controls. If you can perfectly predict who is treated based on characteristics, then DiD will not work.
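The perfect-prediction problem can be seen in a small sketch. The toy data are hypothetical and mimic the thread's setup, where treatment is assigned by region; treated_2011 is a helper name introduced here.

```stata
* Toy data (hypothetical): treatment assigned by region, as in the thread
clear
input byte region int first_treated
2  0
2  0
2  0
12 2011
12 2011
12 2011
end

* Compare the 2011 cohort with the never-treated
generate byte treated_2011 = (first_treated == 2011) if inlist(first_treated, 0, 2011)

* If every region sits in one column only, there is no overlap
tabulate region treated_2011

* Expect Stata to complain: region separates treated from untreated
capture noisily logit treated_2011 i.region
display "logit return code: " _rc
```

The cross-tabulation is the more informative part: a region with observations in only one column cannot contribute a within-region comparison, which is the overlap failure described above.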



                      • #12
                        Makes so much sense. Thanks a lot for your kind help Fernando.



                        • #13
                          Daniel, I believe you are referring to the outcome model, whereas FernandoRios is referring to the treatment model. CSDID is doubly robust, which means that it combines a form of outcome regression with a model for the treatment. You can use a binary outcome variable and refer to the outcome model as a linear probability model, but this does not affect the model for the treatment, which still takes a logit form. This reply is just to clear up a potential misunderstanding; it does not take away from the advice in the earlier replies.
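To make the distinction concrete, here is a sketch of the two kinds of call, using only options that appear elsewhere in this thread. It assumes the poster's data are in memory and that csdid and its dependency drdid are installed; the shortened covariate list is for illustration only.

```stata
* Assumes the poster's data are loaded and csdid/drdid are installed

* Outcome regression only: relies on the outcome model alone
csdid cschoolenroll hhsize wealthindex, time(year) gvar(first_treated) method(reg)

* Doubly robust: outcome regression combined with a treatment model
csdid cschoolenroll hhsize wealthindex, time(year) gvar(first_treated) method(drimp)
```

If the first call runs and the second returns omitted or missing estimates, that would point to the treatment model, and hence to overlap, rather than to the outcome model.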



                          • #14
                            Thanks for the clarification!



                            • #15
                              Hello again FernandoRios,

                              I'm now working on my first PhD chapter and I have more or less the same issue: the csdid output consists of missing values. Some context and my data: I'm studying the effect of a policy on female education in Ethiopia. The law was rolled out region by region, so the treatment is staggered. The problem seems to lie in the interaction between my controls and the doubly robust estimators: when I reduce the covariates or switch to method(reg), csdid works perfectly; otherwise the output table is filled with missing values. I'm using the not-yet-treated as the control group, which I suspect is a much better control group. Note that gvar_gh denotes the first-treated cohorts. Thank you in advance.

                              Code:
                              * Example generated by -dataex-. For more info, type help dataex
                              clear
                              input str15 id byte region float(cohort gvar_gh)
                              "       1  17  2"  4 1978 1985
                              "       1  17  3"  4 1999 1985
                              "       1  18  2"  4 1974 1985
                              "       1  25  2"  4 1970 1985
                              "       1  25 10"  4 1998 1985
                              "       1  45  1"  4 1980 1985
                              "       1  65  2"  4 1984 1985
                              "       1  90  3"  4 1992 1985
                              "       1 125  2"  4 1998 1985
                              "       1 133  2"  4 1971 1985
                              "       1 183  2"  4 1987 1985
                              "       2  29  2"  5 1995    0
                              "       2  97  1"  5 1979    0
                              "       2 165  1"  5 1977    0
                              "       2 168  1"  5 1993    0
                              "       2 171  2"  5 1985    0
                              "       2 176  1"  5 1990    0
                              "       2 234  1"  5 1979    0
                              "       2 234  2"  5 2001    0
                              "       2 235  2"  5 1995    0
                              "       2 340  4"  5 2001    0
                              "       2 399  2"  5 1986    0
                              "       2 437  2"  5 1995    0
                              "       2 463  2"  5 1999    0
                              "       3  14  2"  3 1996 1985
                              "       3  14  3"  3 1996 1985
                              "       3  17  2"  3 1990 1985
                              "       3  22  2"  3 1985 1985
                              "       3  31  3"  3 1981 1985
                              "       3  31  5"  3 2000 1985
                              "       3  67  2"  3 1988 1985
                              "       3 100  2"  3 1988 1985
                              "       3 115  2"  3 1987 1985
                              "       3 127  2"  3 1976 1985
                              "       3 131  4"  3 1998 1985
                              "       3 141  2"  3 1982 1985
                              "       3 168  2"  3 1983 1985
                              "       3 188  3"  3 1996 1985
                              "       3 241  5"  3 1991 1985
                              "       3 243  1"  3 1988 1985
                              "       3 250  2"  3 1997 1985
                              "       3 286  4"  3 1999 1985
                              "       3 300  3"  3 1994 1985
                              "       3 300  4"  3 2000 1985
                              "       5   4  1" 11 1988 1982
                              "       5  20  3" 11 1997 1982
                              "       5  26  1" 11 1980 1982
                              "       5  26  6" 11 1995 1982
                              "       5  26  7" 11 1999 1982
                              "       5  26  8" 11 1995 1982
                              "       5  43  1" 11 1981 1982
                              "       5  99  1" 11 1975 1982
                              "       5  99  5" 11 1998 1982
                              "       5 106  1" 11 1967 1982
                              "       5 136  2" 11 1987 1982
                              "       5 148  1" 11 1995 1982
                              
                              end
                              label values region V024
                              label def V024 2 "afar", modify
                              label def V024 3 "amhara", modify
                              label def V024 4 "oromia", modify
                              label def V024 5 "somali", modify
                              label def V024 11 "dire dawa", modify
                              Code:
                              csdid neverattend rural b1.religion b1.ethnicity i.region siblings older_male_siblings older_female_siblings, time(cohort) gvar(gvar_gh) method(drimp) notyet cluster(ethnicity) wboot
                              Code:
                              gregorian |
                                 year of |                              gvar_gh
                                   birth |         0       1982       1985       1986       1989       1990 |     Total
                              -----------+------------------------------------------------------------------+----------
                                    1966 |        10         11         16         11         17         10 |        75
                                    1967 |        31         36         43         19         17         19 |       165
                                    1968 |        23         26         50         20         25         24 |       168
                                    1969 |        25         23         48         15         27         20 |       158
                                    1970 |        54         45         70         28         33         62 |       292
                                    1971 |        20         31         41         24         37         40 |       193
                                    1972 |        18         27         38         17         22         25 |       147
                                    1973 |        36         41         54         28         28         51 |       238
                                    1974 |        25         32         54         36         13         33 |       193
                                    1975 |        82         90        103         54         41         99 |       469
                                    1976 |        64         58         72         48         38         70 |       350
                                    1977 |        44         57        103         50         45         68 |       367
                                    1978 |        50         50         75         37         32         54 |       298
                                    1979 |        32         32         82         26         34         45 |       251
                                    1980 |        85        113        120         65         68        145 |       596
                                    1981 |        60         61         89         55         45         97 |       407
                                    1982 |        31         59         86         28         33         42 |       279
                                    1983 |        50         84         92         49         35         65 |       375
                                    1984 |        39         65        101         39         29         68 |       341
                                    1985 |       124        143        161         81         47        158 |       714
                                    1986 |       113         91        105         65         39        117 |       530
                                    1987 |        84        110        140         81         53        105 |       573
                                    1988 |        89         87        118         67         47         98 |       506
                                    1989 |        72        105        103         60         45         97 |       482
                                    1990 |       124        125        175         97         59        140 |       720
                                    1991 |        97        126        136         68         53        141 |       621
                                    1992 |        68         93         85         60         55        102 |       463
                                    1993 |        76         91        110         62         52        121 |       512
                                    1994 |        69         93         95         58         61        100 |       476
                                    1995 |       160        141        160         65         67        133 |       726
                                    1996 |       105        130        105         60         75        124 |       599
                                    1997 |       125        141        155         85         79        133 |       718
                                    1998 |       101        156        165         76        101        122 |       721
                                    1999 |       104        138        152         67         74        131 |       666
                                    2000 |       148        112        155        106         83        130 |       734
                                    2001 |        51         27         38         16         26         36 |       194
                              -----------+------------------------------------------------------------------+----------
                                   Total |     2,489      2,850      3,495      1,823      1,635      3,025 |    15,317
                              Last edited by Daniel Perez Parra; 30 Nov 2022, 05:10.
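One check that may help here, given that the rollout was by region and i.region is in the covariate list: whether each region maps to exactly one treatment cohort. The toy rows below follow the shape of the data excerpt above; one_cohort and region_tag are helper names introduced for this sketch.

```stata
* Toy data in the shape of the excerpt above (region, gvar_gh)
clear
input byte region int gvar_gh
4  1985
4  1985
5  0
5  0
3  1985
3  1985
11 1982
11 1982
end

* Does each region map to exactly one treatment cohort?
bysort region (gvar_gh): generate byte one_cohort = (gvar_gh[1] == gvar_gh[_N])
tabulate region one_cohort

* Regions per cohort: a cohort made up of a single region cannot be
* told apart from that region's dummy
egen byte region_tag = tag(region)
tabulate gvar_gh if region_tag
```

If one_cohort equals 1 for every region in the real data, the region dummies determine the cohort, which is the overlap problem raised in #9; dropping i.region from the covariate list would then be the first thing to try.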
