
  • How to create proper dummy variable for Difference in Difference in do.file

    Hi, my intervention period spans both 2000 and 2001 in a difference-in-differences model, so I want to create a proper dummy variable for the post-intervention period.
    I assign the value 0 to all years before 2000 and 1 to all years after 2001. How can I assign no value to the intervention years 2000 and 2001 in a Stata do-file?

    Initially, I tried the following, but I am not sure whether `NA` is the proper way to create the dummy for DiD estimation.
    Code:
    gen post_intervention = 0
    replace post_intervention = 1 if year > 2001
    replace post_intervention = NA if year == 2000 & 2001
    Last edited by Chul-Kyoo Jung; 09 Apr 2023, 17:00.

  • #2
    What I usually do is something like
    Code:
    g treat = cond(id==3 & date >= 2000, 1, 0)



    • #3
      Thanks Jared. Do you mean I don't have to separately assign no value to 2000 and 2001? What if I just want the post-intervention period to start in 2002?



      • #4
        Stata is just going to throw an error message at the code in #1, unless exceptionally NA is the name of a numeric variable or scalar. You're thinking there of some other software we will call X.

        The use of & is also wrong as now explained in detail at https://journals.sagepub.com/doi/pdf...6867X231162009 (published last week).
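
        As a minimal sketch of both points, assuming a toy dataset with one observation per year (the variable names wrong and both_years are hypothetical, for illustration only): Stata evaluates year == 2000 & 2001 as (year == 2000) & 2001, and the nonzero constant 2001 counts as true, so only 2000 is flagged. Stata's numeric missing value is written as a period, not NA.

        ```stata
        * Toy data: one observation per year, 1990-2010 (hypothetical)
        clear
        set obs 21
        gen int year = 1989 + _n

        * Parsed as (year == 2000) & 2001; 2001 is nonzero, hence true,
        * so this flags the year 2000 only
        gen byte wrong = year == 2000 & 2001

        * inlist() flags both intervention years
        gen byte both_years = inlist(year, 2000, 2001)

        * 0 before 2000, 1 after 2001, missing (.) in 2000 and 2001:
        * observations excluded by the if qualifier are left missing
        gen byte post_intervention = year > 2001 if !inlist(year, 2000, 2001) & !missing(year)

        tabulate year post_intervention, missing
        ```

        inrange(year, 2000, 2001) would work equally well here; inlist() is the more general choice when the years are not contiguous.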

        See https://journals.sagepub.com/doi/pdf...36867X19830921 if you want a fairly comprehensive discussion of indicator variables, including a preference for that term.

        The question is still puzzling. A missing value for 2000 and 2001 will just exclude those years from almost any later analysis.

        Perhaps you seek separate indicators for being pre- and post- which have in common being 0 for 2000 and 2001.
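
        A sketch of what such a pair of indicators might look like, again on a toy one-row-per-year dataset (the names pre and post are hypothetical):

        ```stata
        * Toy data: one observation per year, 1990-2010 (hypothetical)
        clear
        set obs 21
        gen int year = 1989 + _n

        * 1 before the intervention, 0 otherwise
        gen byte pre = year < 2000

        * 1 after the intervention, 0 otherwise; missing years sort above
        * every number in Stata, so they are excluded explicitly
        gen byte post = year > 2001 & !missing(year)

        * 2000 and 2001 are 0 on both indicators
        list year pre post if inrange(year, 1998, 2003), noobs
        ```

        With this coding no observation is set to missing, so 2000 and 2001 stay in the estimation sample as the omitted category.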



        • #5
          Before we continue, we must clarify a few things. Is more than one unit treated? If yes, are they treated at different times? I don't really see how the intervention can have two start points, unless one period of treatment is a sort of "phase-in" period. Otherwise, I'm not understanding.



          • #6
            Thanks, Nick Cox, for your wonderful references, and thanks to Jared Greathouse for the clarification. The following is the detail I presented in another post. My dataset covers 1990 to 2010, and the policy was implemented in 2000 and continued through 2001. The treated units were a group of firms. Some firms were treated in 2000 and others were treated in 2001. The problem is that I cannot identify which firms were which, so I designated all of them as a single treated group with a treatment period of 2000 and 2001. Could you let me know how to code the treated group and the treatment period in this case?
            Last edited by Chul-Kyoo Jung; 10 Apr 2023, 12:41.



            • #7
              The treated units were a group of firms. Some firms were treated in 2000 and others were treated 2001. The problem is I cannot identify which firms were which.
              Do you mean you don't know in the sense of LITERALLY don't know which firms were treated (like, the list of treated units is unavailable to you/is private), or do you mean you don't know how to identify in code which units were and weren't treated? I can help you, but you need to show us your data using the dataex command.

              Once I'm looking at some of your data, then we can have a better, or at least more productive, discussion, since I'll be able to see your data as you see it.



              • #8
                I literally don't know which firms were treated in 2000 and which in 2001, so I just put both groups together as a single treated group. In sum, the treated period is both 2000 and 2001 and the treated group is lbg30.

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float year str6 firmid byte ksic2c float(k_productivity hhi_sales lbg30 top30sharei lshare)
                1992 "000020" 21   588529984  .05352982 0          0  .4364
                1993 "000020" 21   744310016  .05338813 0          0  .4415
                1994 "000020" 21   608419968   .0517649 0          0  .4982
                1995 "000020" 21   7.021e+08   .0512402 0          0  .4797
                1996 "000020" 21   779320000  .05364646 0          0  .4948
                1997 "000020" 21   9.117e+08 .071214095 0          0  .5006
                1998 "000020" 21   991529984  .07117989 0          0  .4273
                1999 "000020" 21  1964070016  .07088037 0          0  .2614
                2000 "000020" 21   502310016  .06271941 0          0  .8708
                2001 "000020" 21  1313059968    .055039 0          0   .556
                2002 "000020" 21  1786019968  .05649804 0          0  .5746
                2003 "000020" 21  1744419968  .05231318 0   .0277531  .6055
                2004 "000020" 21  1746249984  .05077323 0  .02407099  .6867
                2005 "000020" 21  1473859968  .05311356 0  .02611842  .5684
                2006 "000020" 21  1287030016  .05217143 0  .02535325  .6056
                2007 "000020" 21    1.07e+09  .05387806 0 .022888433  .6112
                2008 "000020" 21 27293939712  .05611051 0  .02041181  .4473
                2009 "000020" 21  2445459968  .05811165 0  .01744266  .4235
                2010 "000020" 21   217370000   .0560264 0  .01730872  .5767
                2011 "000020" 21   210430000  .05059672 0   .0147009  .5075
                1992 "000040" 31   723260032  .48241925 0    .875504  .5819
                1993 "000040" 31   820640000  .48116615 0   .9667003  .5178
                1994 "000040" 31  1063180032  .50073665 0   .9780892  .6232
                1996 "000040" 31   657430016  .49643224 0   .9655888   .414
                1998 "000040" 31   138540000   .4526372 0   .9864518   .621
                2001 "000040" 31   127340000   .2890171 0   .8169259  .8521
                2003 "000040" 31   362630016  .27689195 0   .9959713  .4164
                2005 "000040" 31   120380000  .27040976 0   .9956971  .2975
                2006 "000040" 31    51940000  .27439922 0   .9961551 1.1796
                2007 "000040" 31   257520000  .27405494 0   .9964164  .2167
                2010 "000040" 31   157460000   .2815831 0   .9973839   .163
                2011 "000040" 31    8.60e+07   .3801102 0    .969853  .2409
                1992 "000050" 47   199160000  .34448385 0  .10926908  .6793
                1993 "000050" 47   195150000   .3559043 0  .11468977  .7178
                1994 "000050" 47   301750016   .3546242 0  .10875304  .5764
                1995 "000050" 47   442369984   .3592656 0  .10488386  .5844
                1996 "000050" 47   327750016   .3725042 0   .0869855  .6432
                1997 "000050" 47   422470016   .3836862 0   .0825581  .5627
                1998 "000050" 47   640760000   .3867261 0  .08987462  .4604
                1999 "000050" 47   929169984   .3996449 0  .08584384  .5143
                2000 "000050" 47   200840000  .33890685 0   .8212672  .5777
                2001 "000050" 47   195320000  .35462224 0   .9863203  .6374
                2002 "000050" 47   280409984   .3785505 0   .9879443   .551
                2003 "000050" 47   3.366e+08   .3876458 0   .9831518  .6876
                2005 "000050" 47   4.352e+08   .4127032 0   .9907051  .1305
                2007 "000050" 47  1138919936   .4149574 0   .9924192  .0447
                2009 "000050" 47    94220000   .4066013 0   .9926313  .1152
                2010 "000050" 47   188150000   .4055971 0   .9901092  .0569
                2011 "000050" 47    87650000  .56482244 0    .913202   .086
                1992 "000070" 71   268120000   .0786362 0   .6988313  .4794
                1993 "000070" 71   1.973e+08  .07177163 0    .712829  .4826
                1994 "000070" 71    95210000  .07273845 0   .7131559  .5245
                1995 "000070" 71   125360000  .07342871 0   .7148253  .4993
                1996 "000070" 71   1.202e+08 .072934516 0   .7074937  .5574
                1997 "000070" 71   124480000   .0781921 0   .7431347  .4979
                1998 "000070" 71    90530000  .07776387 0   .7463828  .3591
                1999 "000070" 71   1.061e+08  .07633055 1   .7860447  .3336
                2000 "000070" 71   1.213e+08  .07649403 0   .7330863  .3717
                2001 "000070" 71   223430000 .068164214 0   .6619592  .3247
                2002 "000070" 71   179140000 .067267805 0   .6489187  .3011
                2003 "000070" 71   160610000 .073163316 0   .6581174  .3953
                2004 "000070" 71   163840000   .0745528 0   .5721435  .2468
                2005 "000070" 71   181240000 .071113676 0  .54938686  .2641
                2006 "000070" 71   133430000  .07161281 0   .5551959  .3372
                2007 "000070" 71   108510000  .07922857 0  .50252575  .3959
                2009 "000070" 71   228260000    .104891 0   .4726001   .232
                2010 "000070" 71   274460000  .09211212 0  .50564134  .1768
                1992 "000080" 11   666329984   .5063352 1   .9237077  .5111
                1993 "000080" 11   4.961e+08   .4772263 1   .9184917  .4148
                1994 "000080" 11   371140000    .484674 1   .9256746  .3457
                1995 "000080" 11   379240000   .4526424 1   .9295023  .4394
                1996 "000080" 11   3.686e+08   .4169289 1   .8886818  .4654
                1997 "000080" 11   342590016   .4172299 1    .893319  .3858
                1998 "000080" 11   394180000    .406155 1   .8832415  .2884
                1999 "000080" 11   5.059e+08   .3698842 1   .8357728  .2843
                2001 "000080" 11   348460000   .4423804 0  .59209883   .454
                2002 "000080" 11   211420000   .4422717 0  .58359474  .6478
                2009 "000080" 11   342220000   .4075448 0   .5522526  .2534
                2010 "000080" 11   345830016   .4097762 0   .5573384  .3394
                2011 "000080" 11    3.58e+07    .444746 0   .5610578  .3654
                1992 "000100" 21   838990016  .05352982 0          0  .6757
                1993 "000100" 21   894609984  .05338813 0          0  .6775
                1994 "000100" 21   961889984   .0517649 0          0   .701
                1995 "000100" 21  1.0516e+09   .0512402 0          0  .6769
                1996 "000100" 21   938689984  .05364646 0          0  .7363
                1997 "000100" 21  1217949952 .071214095 0          0  .6192
                1998 "000100" 21  1954390016  .07117989 0          0  .3821
                1999 "000100" 21  2420780032  .07088037 0          0  .3312
                2000 "000100" 21  1609560064  .06271941 0          0  .3875
                2001 "000100" 21  1816489984    .055039 0          0  .4049
                2002 "000100" 21  1453410048  .05649804 0          0  .3436
                2003 "000100" 21  1243570048  .05231318 0   .0277531  .3247
                2004 "000100" 21  1069459968  .05077323 0  .02407099  .2913
                2005 "000100" 21  1691129984  .05311356 0  .02611842  .2975
                2006 "000100" 21   560849984  .05217143 0  .02535325  .3071
                2007 "000100" 21   897340032  .05387806 0 .022888433  .3254
                2008 "000100" 21  1039760000  .05611051 0  .02041181  .3194
                2009 "000100" 21  1197410048  .05811165 0  .01744266  .3519
                2010 "000100" 21  1.7069e+09   .0560264 0  .01730872  .3253
                2011 "000100" 21  1098310016  .05059672 0   .0147009  .5041
                end



                • #9
                  The fact that it is impossible to discern the current treatment status of any observation in 2000 and 2001 is deeply problematic. I think the best solution is to simply drop all observations from those years. While this is, in a sense, throwing away data, the data being thrown away is more like disinformation than information. If you include it, no matter what you do with your pre-post variable, you will end up with biased estimation of the treatment effect. I think those observations have to go.
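
                  A minimal sketch of that step, assuming a numeric year variable as in the data example in #8 (the toy data and the name post are hypothetical):

                  ```stata
                  * Toy data: one observation per year, 1990-2010 (hypothetical)
                  clear
                  set obs 21
                  gen int year = 1989 + _n

                  * The simple comparison below would code a missing year as 1,
                  * so confirm there are none first
                  assert !missing(year)

                  * Remove the years in which treatment status cannot be determined
                  drop if inrange(year, 2000, 2001)

                  * With 2000 and 2001 gone, a plain comparison defines the period
                  gen byte post = year > 2001

                  tabulate year post
                  ```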



                  • #10
                    Clyde Schechter Thanks so much. I am still trying to figure out exactly which units were treated when. I now know for sure that, if I get that information, I can go with the two-way fixed effects model you suggested in another post. Just one last question about a hypothetical situation: suppose the policy was implemented for every entity in the treated group but I still could not identify when each was treated. Can I then just drop all the 2000 and 2001 observations and run a simple DiD with pre-intervention "0" for 1990-1999 and post-intervention "1" for 2002-2010? Thanks.



                    • #11
                      See #9 at https://www.statalist.org/forums/for...nce-in-do-file.
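
                      For concreteness, a sketch of that simple two-group, two-period DiD on simulated data. Everything here is hypothetical: firm, treated, and y stand in for the firm identifier, a time-invariant treated-group indicator, and the outcome; the true effect of 2 is built into the simulation and says nothing about the real data.

                      ```stata
                      * Simulated panel: 40 hypothetical firms, 1990-2010
                      clear
                      set seed 12345
                      set obs 40
                      gen int firm = _n
                      gen byte treated = firm <= 20      // assumed time-invariant group
                      expand 21
                      bysort firm: gen int year = 1989 + _n

                      * Drop the ambiguous years, then define the period indicator
                      drop if inrange(year, 2000, 2001)
                      gen byte post = year > 2001

                      * Outcome with a built-in treatment effect of 2
                      gen double y = 1 + 0.5*treated + 0.3*post + 2*treated*post + rnormal()

                      * The coefficient on 1.treated#1.post is the DiD estimate;
                      * standard errors are clustered by firm
                      regress y i.treated##i.post, vce(cluster firm)
                      ```

                      With real data the string firmid would first need a numeric counterpart, for example via egen with the group() function, before it can be used for clustering.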



                      • #12
                        Clyde Schechter Thanks I got it!

