Non-negative count variable with many zeros in staggered Difference-in-Difference design

Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#1

30 Mar 2026, 14:04

Hi all,

First of all, I apologize if my question sounds ignorant. I am neither a programmer nor a statistician, so it might seem trivial to some of you.

I would like to implement a staggered Difference-in-Difference design for a non-negative count variable with many zeros. For the traditional TWFE DiD estimator, there is the ppmlhdfe package, which readily takes care of it. What are my options under staggered adoption?

My preferred estimator is Callaway & Sant'Anna's csdid, but I am also aware of the other ones (eventstudyinteract, did_multiplegt_dyn, jwdid). Of these commands, only Wooldridge's jwdid seems to integrate a Poisson estimation method directly. However, I would like to use "notyettreated" as my comparison group while testing for pre-trends. I have read that regression-type methods do not easily allow that, yet only jwdid excludes the pre-treatment estimates from the output when the comparison group is specified this way (for reasons I frankly do not understand).

Are non-negative count outcomes with many zeros a large issue for these estimators, similar to the TWFE estimator? I found this technical paper (see link) which claims just that. My options for applying staggered DiD designs to count variables seem to be very limited.
Can somebody give me some perspective?

Thanks a lot!!


https://papers.ssrn.com
Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#2

30 Mar 2026, 19:49

Sorry, the link does not seem to work, so here it is as text: https://papers.ssrn.com/sol3/papers....act_id=4859576
FernandoRios

Join Date: Apr 2014

Posts: 2534
#3

31 Mar 2026, 06:29

JWDID will be your best bet indeed.
But you are right, you cannot test for pre-trends with not-yet-treated units, because doing that correctly requires more than a single regression (the comparison group needs to change for each pre-period).
The others will also work for your data, but just as in the OLS vs. Poisson discussion, the point remains that OLS assumes linearity, which may not hold when your data are non-linear; in that case you need a non-linear model estimator.
F
Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#4

31 Mar 2026, 18:45

Hi Fernando, thanks for your input.

What about Callaway & Sant'Anna's estimator, which is not a linear one?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#5

31 Mar 2026, 21:47

Hi Stefan. Callaway and Sant'Anna assume that conditional parallel trends holds in levels -- just like all other estimators except for the Poisson, logit, and fractional logit estimators that I discuss in my work. The doubly robust CS augmented AIPW estimator has some resiliency to misspecified functional form, but not to violation of the levels PT assumption. At a minimum, you can try jwdid with a linear mean, jwdid with an exponential mean (with the assumption being that PT holds in the log of the mean, not the change in the level), and csdid. You can always start by using the never treated units and then add "never" to jwdid to test for pre-trends.
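
To make the two parallel trends assumptions concrete, here is a sketch for the simplest case of two periods (0 and 1) and one treated group. The notation is an assumption for illustration and does not come from the thread: D = 1 marks treated units and Y_t(0) is the untreated potential outcome in period t.

```latex
% PT in levels (linear mean): untreated outcomes change by the same amount
\begin{align*}
E[Y_1(0) - Y_0(0) \mid D = 1] &= E[Y_1(0) - Y_0(0) \mid D = 0]
\end{align*}

% PT in the log of the mean (exponential mean): untreated means grow by the same ratio
\begin{align*}
\log E[Y_1(0) \mid D = 1] - \log E[Y_0(0) \mid D = 1]
  &= \log E[Y_1(0) \mid D = 0] - \log E[Y_0(0) \mid D = 0]
\end{align*}
```

As a made-up numeric illustration: if the control mean goes from 2 to 4 and the treated group starts at 10, PT in levels implies an untreated counterfactual of 10 + (4 - 2) = 12, while PT in the log of the mean implies 10 x (4 / 2) = 20. With count outcomes whose baseline levels differ a lot across groups, the two assumptions can therefore give quite different counterfactuals.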
Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#6

02 Apr 2026, 18:16

Hi Jeff, thanks a lot for your input too.

You can always start by using the never treated units and then add "never" to jwdid to test for pre-trends.

I assume you mean I can start with the notyet treated units and then add the never ones? Thing is I do not have never treated units, but I get your point. Thanks a lot, I will follow your advice!
FernandoRios

Join Date: Apr 2014

Posts: 2534
#7

03 Apr 2026, 10:06

Even if you have no never-treated units, you can restrict your sample accordingly so that the units treated last serve as the never-treated group.
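
A minimal sketch of that restriction, under stated assumptions: the variable names y, i, t, and g follow the code used elsewhere in this thread, g holds each unit's first treated period with no zeros to begin with, and jwdid is installed from SSC. It relies on the convention that gvar equal to 0 marks never-treated units. Treat it as an illustration rather than a tested recipe.

```stata
* Sketch only: y, i, t, g are assumed variable names; adapt to your data.
* Assumes jwdid is installed (ssc install jwdid).
preserve

* Find the first treated period of the last-treated cohort
summarize g, meanonly
local glast = r(max)

* Keep only the periods before the last cohort becomes treated
keep if t < `glast'

* Within this window the last cohort is untreated, so recode it as never treated
replace g = 0 if g == `glast'

* Poisson (exponential mean) model using the recoded group as the only controls;
* the never option adds pre-treatment terms so pre-trends can be inspected
jwdid y, ivar(i) tvar(t) gvar(g) method(poisson) never
estat event

restore
```

The cost of this approach is that no effects are estimated for the last cohort itself, and post-treatment effects for earlier cohorts are only available for periods before the last cohort's treatment date.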
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#8

03 Apr 2026, 12:09

Stefan: Yes, that's what I meant -- sorry.
Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#9

29 Apr 2026, 05:40

Hi, thanks again for your help!

I have a follow-up question: I am struggling to implement my model with the jwdid+poisson estimator.
Whereas csdid uses pre-treatment covariates to match treated with control units, jwdid relies on time-varying covariates. I have fine-grained data, so adding even one covariate makes the command run forever. What is your practical advice? I am aware of the different options (xasis, exovar, xtvar, etc.), but strangely these options do not speed things up much, and it is quite important to control for some unit characteristics.

Could you give me a rundown of the different options (for example, what to consider when choosing between xtvar, xgvar, and xattvar) and of potential ways to make it compute more quickly?

A bit about my setting:
I have monthly data on municipalities (~5,600) over 10 years and want to examine the impact of a treatment on municipality-level raw count data with many zeros. I have already aggregated the data to 3-month periods to reduce the sample size. I could aggregate it further, but short-term changes around treatment adoption might be crucial.

Again, many thanks for your support!
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#10

03 May 2026, 11:47

With so many time periods and such a large N, I’d try using Poisson regression with effects restricted to be constant by exposure time. You can see if the estimates are much different from the unrestricted model estimated by jwdid after estat event. If they’re close without covariates then I’d probably use the restricted model. I have a Stata do file for that somewhere.
Stefan Sliwa

Join Date: Jun 2019

Posts: 26
#11

04 May 2026, 02:43

Alright, that sounds good, thank you Jeff!

So just to be sure, my command flow would be something like:

Code:

...
** Unrestricted model, no covariates
jwdid y, ivar(i) tvar(t) gvar(g) method(poisson)
estat event

** vs restricted model, no covariates
jwdid y, ivar(i) tvar(t) gvar(g) method(poisson) hettype(event)
estat event

** if similar, add time-invariant covariates
jwdid y x1 x2 x3, ivar(i) tvar(t) gvar(g) method(poisson) hettype(event)
estat event

Is this correct?

I am also confused about whether I should use 'method(poisson)' or 'method(ppmlhdfe)'. They give very different results in my estimations. Can both be used with time-invariant (pre-treatment) covariates?

Thanks again.