
  • clogit or logit for a Difference-in-Difference model with fixed effects

    Hi everyone,

    I was hoping you might be able to help with this question for my thesis. I'm running a Difference-in-Difference model with repeated cross-sectional data (not panel data). My outcome variable is binary, which means I will use an odds ratio instead of marginal effects as suggested here: https://journals.sagepub.com/doi/pdf...867X1001000211

    However, I have group and time fixed effects since it is a Diff-in-Diff model over several groups and time periods. Should I be using logit or clogit for such a model?

    Thank you kindly,

    Lucas

  • #2
    The purpose of using -clogit- (or equivalently -xtlogit, fe-) is to properly account for the non-independence of repeated observations of the same units of analysis. If you have serial cross-sectional data, without repeated observations of the same units of analysis, there is no such problem and -logit- is appropriate.

    The groups you refer to in your second paragraph are presumably the treated and untreated groups. But if the same units of analysis are not followed up over time, there is no problem of non-independence among observations. Using -clogit- with the treatment groups as the -clogit- grouping variable would be inappropriate.



    • #3
      In reply to Clyde Schechter (#2):
      Hi Clyde, thank you for such a prompt response! The groups I refer to are not specifically treated and untreated groups. Rather my model specification is:

      Y = Treated*Post + Treated + Post + Year Fixed Effects + District Fixed Effects + e

      Treatment occurs at a more granular level than the districts, specifically at a survey cluster level. I have four survey years, but the survey clusters are different each time, so I can only include fixed effects for each district.
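For readers who want to see the specification above as a plain -logit- call, here is a minimal sketch. The data are simulated purely for illustration: the number of districts, the survey years, the coefficients, and the individual-level assignment of Treated are all assumptions, not details from this thread.

```stata
* Hypothetical simulated data: variable names follow the post, values are invented
clear
set seed 12345
set obs 4000
generate District = ceil(20 * runiform())               // 20 made-up districts
generate Year     = 2000 + 5 * floor(4 * runiform())    // four made-up survey years
generate Treated  = runiform() < 0.5                    // assigned per person here, per cluster in the real data
generate Post     = Year >= 2010
generate Y        = runiform() < invlogit(-1 + 0.3*Treated + 0.5*Treated*Post)

* Post is collinear with the year dummies, so Stata omits one term and says so in a note
logit Y i.Treated##i.Post i.Year i.District, or
```

The row labelled 1.Treated#1.Post in the output is the odds ratio on the interaction term.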

      Is logit still the appropriate command?



      • #4
        Ok, that wasn't at all clear from #1. If the same (or largely overlapping) sets of districts were surveyed in each year, it might be reasonable to use -xtlogit, fe- with district as the grouping variable. If district does influence the outcome, then this would be quite appropriate. If you are not sure, you can let the data decide. Run it with district as the grouping variable in -xtlogit, fe- (or -clogit-, same thing) and in the output Stata will tell you whether the district fixed effects are ignorable. If they are, you can just rerun it without them in -logit-.
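As a rough illustration of the two equivalent commands mentioned in this post, the sketch below fits the conditional logit both ways with district as the grouping variable. The data are simulated and every number in the setup is an assumption made only so that the example runs.

```stata
* Hypothetical simulated data: variable names follow the thread, values are invented
clear
set seed 12345
set obs 4000
generate District = ceil(20 * runiform())
generate Year     = 2000 + 5 * floor(4 * runiform())
generate Treated  = runiform() < 0.5
generate Post     = Year >= 2010
generate Y        = runiform() < invlogit(-1 + 0.3*Treated + 0.5*Treated*Post)

* Conditional (fixed-effects) logit with district as the grouping variable
clogit Y i.Treated##i.Post i.Year, group(District) or

* The same estimator through the -xt- interface; no time variable is declared
xtset District
xtlogit Y i.Treated##i.Post i.Year, fe or
```

The two commands should report the same coefficient estimates, since -xtlogit, fe- fits the same conditional likelihood as -clogit-.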



        • #5
          In reply to Clyde Schechter (#4):
          Thanks Clyde, I tried running the code
          Code:
          xtset District Time
          but I get an error message "repeated time values within panel". I think this is because I have lots of observations (i.e. different people) for each district and time. The outcomes are recorded at the individual level, but these individuals are not repeated over time. I don't believe I can run xtlogit without setting up my data as panel data using xtset?

          Alternatively, do you think it would be better to just run OLS?



          • #6
            I don't believe I can run xtlogit without setting up my data as panel data using xtset?
            That's true. But you are using -xtset- incorrectly for this data. Run -xtset District- with no time variable and you're good to go. The time variable is always optional. It is only needed if you plan to use time-series operators like lags and leads, or estimate models with autoregressive structure. But such things are only meaningful in panel data. When using cross sections with some nesting structure, as is the case here, just don't specify a time variable, and then all the meaningful -xt- commands are at your disposal.
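A small sketch of the point above, using simulated data with made-up sizes: declaring a time variable fails when many individuals share the same district and time, while declaring only the grouping variable succeeds.

```stata
* Hypothetical simulated data: many individuals per District-Time pair
clear
set seed 12345
set obs 1000
generate District = ceil(10 * runiform())
generate Time     = ceil(4 * runiform())

* This reproduces the "repeated time values within panel" error without stopping the do-file
capture noisily xtset District Time
display "return code: " _rc     // a nonzero return code signals the error

* Declaring only the grouping variable works
xtset District
```

After the second -xtset-, commands such as -xtlogit, fe- can be run with District as the group.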

            By the way, this comes up fairly often here on Statalist. A lot of people seem to be under the mistaken impression you have to specify a time variable in -xtset-. I'm curious where that notion comes from.



            • #7
              In reply to Clyde Schechter (#6):
              Thanks so much for all your help Clyde, I ran into a problem, which has been addressed here.

              I think running an LPM will be most feasible.

              All the best,

              Lucas
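For completeness, a linear probability model version of the same specification might look like the sketch below. The data are simulated and all values are assumptions; the robust variance option is included because the errors of a linear probability model are heteroskedastic by construction.

```stata
* Hypothetical simulated data: variable names follow the thread, values are invented
clear
set seed 12345
set obs 4000
generate District = ceil(20 * runiform())
generate Year     = 2000 + 5 * floor(4 * runiform())
generate Treated  = runiform() < 0.5
generate Post     = Year >= 2010
generate Y        = runiform() < 0.2 + 0.05*Treated + 0.10*Treated*Post

* OLS on the binary outcome with year and district dummies
regress Y i.Treated##i.Post i.Year i.District, vce(robust)
```

Here the coefficient on 1.Treated#1.Post is read as a change in the probability of Y, in contrast to the odds ratio from the logit models.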

