  • Difference-in-difference estimators

    Hello everyone, I am a beginner learning Stata, so I need your help. My data look like this:

    id  treatment  group  employment
    1   0          1      16
    1   1          1      20
    2   0          1      4
    2   1          1      2
    3   0          0      7
    3   1          0      8
    4   0          0      4
    4   1          0      6

    Here treatment is the time variable (0 before treatment, 1 after treatment), group is the state (0 for New Jersey, 1 for PA), and employment is the dependent variable.

    We are supposed to run a regression to estimate the effect of the policy in New Jersey. The employment variable is the number of employees in each small restaurant in each state. My code is:

    xtset id treatment

    Stata returned the error "repeated time values within panel". Could you please give me some guidance on how to set up my panel data and run the regression to estimate the effect of the policy in New Jersey? Each state has many small restaurants, so I really do not know how to deal with this case. Thanks in advance for the kind help, and I hope you all have good luck today!



  • #2
    Well, Stata is telling you that somewhere in your full data set there are one or more ids that have two or more observations with the same value of treatment. The problem is with your data. To find the offending observations, run:
    Code:
    duplicates tag id treatment, gen(flag)
    browse if flag
    Then you have to decide what to do about it. If those observations are actually correct (i.e. you really have more than one pre- or more than one post-treatment observation on the same id), then you can't use -xtset id treatment-. You could still use -xtset id-, though since I don't know what you intend to do beyond this point, that may or may not serve your purposes. If, however, you see that there are extra observations that shouldn't be there, then you need to get rid of them. Resist the temptation to just delete them from the data: you should instead review the data management that created this data set in the first place, find out how those spurious observations got there, and fix those coding errors. When you hunt down those coding errors, you may find others as well, and you should fix those, too. (If the data set was created by somebody else, ask them to do that.)
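
    As a rough illustration of the checks above, here is a minimal sketch on a made-up data set. The rows are hypothetical, and the extra row for id 2 is included only to trigger the problem. It uses -isid- and -duplicates report- as non-interactive alternatives to browsing, and then shows that -xtset id- alone is still accepted.

```stata
clear
* Hypothetical rows; id 2 has two post-treatment observations on purpose
input id treatment employment
1 0 16
1 1 20
2 0 4
2 1 2
2 1 5
end

* isid exits with an error when id and treatment do not uniquely identify rows
capture isid id treatment
display "isid return code: " _rc

* Summarize how many id-treatment combinations are repeated
duplicates report id treatment

* Flag every row that belongs to a repeated id-treatment combination
duplicates tag id treatment, generate(flag)
list if flag > 0

* Declaring only the panel variable places no restriction on repeated times
xtset id
```

    A nonzero return code from -isid- means the same condition that makes -xtset id treatment- fail is present.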

    • #3
      Thanks for the timely and warm reply! I tried the code and found that those observations were correct, not duplicate data.

      I am estimating the effect of the policy on employment in New Jersey (where group is 0) compared with the other state, Pennsylvania. The dependent variable is the number of employed people, and the independent variables are the group dummy, the treatment dummy, and the interaction between the two. However, each state (New Jersey and Pennsylvania) consists of many different restaurants, so the data we have are the number of employed people in each restaurant (we only care about restaurant employment). That is how the data above are structured.

      In this case, if I want to estimate the effect of the policy in New Jersey, which includes many small restaurants, what can I do to see its impact? Thanks for the kind help!!! Your help means a lot to me!

      • #4
        Note that this is my opinion, and others will likely disagree with me, and that's cool. You should not xtset your data as you've specified above. Instead, you wanna do

        Code:
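        * Number the observations within each id, ordered by the pre/post indicator
        bysort id (treatment): generate time = _n

        * Declare the panel; "generic" leaves the time units unspecified
        xtset id time, generic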
        This gives you a more natural coding of your time periods. Clyde Schechter is right though, you'll need to figure out what to do with the duplicates. I don't know if you created this dataset from other sources or it was forced upon you, but either way, you'll need to decide how to get rid of them. In my experience, if I made the dataset, the fault is usually my own! You can then use xtdidregress, or you can use normal OLS and specify the interaction term, if you'd like.

        Code:
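        * Load the example data
        u "https://github.com/alopatina/Applied-Causal-Analysis/blob/master/DinD_ex.dta?raw=true", clear

        cls

        * OLS: the coefficient on 1.nj#1.after is the DiD estimate
        reg fte i.nj##i.after

        * Number the periods within each restaurant, pre-period first
        bys sheet (after): g time = _n

        * Treatment is switched on only for New Jersey restaurants in the post period
        g treat = cond(nj==1 & after==1,1,0)

        xtset sheet time, g

        xtdidregress (fte) (treat), group(sheet) time(time)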
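
        For the variable names in the original post (rather than the example dataset), the OLS route might look like the sketch below. The eight rows are the ones shown in post #1. The variable nj is a helper introduced here purely for illustration: because New Jersey is coded group == 0 in that post, the sketch flips the dummy so that the interaction coefficient reads as the New Jersey effect relative to Pennsylvania.

```stata
clear
* The eight sample rows from post #1
input id treatment group employment
1 0 1 16
1 1 1 20
2 0 1 4
2 1 1 2
3 0 0 7
3 1 0 8
4 0 0 4
4 1 0 6
end

* Hypothetical helper: 1 for New Jersey (group == 0), 0 for Pennsylvania
generate nj = 1 - group

* The coefficient on 1.nj#1.treatment is the difference-in-differences estimate
regress employment i.nj##i.treatment

display "DiD estimate: " _b[1.nj#1.treatment]
```

        On these eight rows the interaction coefficient equals the difference in mean changes, (7 - 5.5) - (11 - 10) = 0.5. With the full data you would probably also want standard errors clustered on the restaurant, for example by adding vce(cluster id).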
        Last edited by Jared Greathouse; 03 Dec 2022, 05:48.

        • #5
          Thank you all! Sorry for the late reply. I have already figured out the problem.
