Parallel trend test DID with binary variable

Marry Lee

Join Date: Nov 2020

Posts: 189
#1

30 Apr 2021, 13:00

Dear all,
I went through different posts but did not find an answer to my question.
I have a binary dependent variable, and I would like to test for parallel trends before the reform.
I did the following:

Code:

probit Y c.TCZ#ib1997.year_birth X, cluster(coun)
margins year_birth, dydx(TCZ) noestimcheck post
marginsplot, yline(0) name(test)

where TCZ is a binary variable (=1 if it is a region where the reform is implemented).
The reform is implemented in 1998.
year_birth is the year of birth, and the coefficient for 1997 should be normalized to 0.

My problem is that with this code, I don't get the 1997 coefficient to be 0. Any suggestions please? Am I doing something wrong? Is it possible to use probit in a parallel trend test?

Thank you in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#2

30 Apr 2021, 22:25

Actually, with that code you don't get any coefficients for year_birth at all. A more conventional way to parameterize this model is:

Code:

probit Y c.TCZ##ib1997.year_birth X, cluster(coun)

If you do it that way, you will get indicators for year_birth, with 1997 as the omitted reference category, and you will get interaction coefficients for TCZ with each year_birth value other than 1997 as well. Your code using # instead of ## is a different model which constrains the level effects for all the years to be zero. Unless you have a strong reason to impose that constraint, you should use the ## model.
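The difference between the two operators can be checked on a public data set. The following is only a sketch: it uses -webuse nlswork- with union, tenure, and year standing in for Y, TCZ, and year_birth, which is an assumption made for illustration and not the original poster's data.

```stata
webuse nlswork, clear

* ## is shorthand for both main effects plus their interaction
probit union c.tenure##ib80.year grade, vce(cluster idcode)
scalar ll_full = e(ll)

* The same model written out term by term
probit union c.tenure ib80.year c.tenure#ib80.year grade, vce(cluster idcode)
display "difference in log pseudolikelihood: " e(ll) - ll_full

* Single # with the tenure main effect only: no year indicators are estimated,
* so the year level effects are constrained to zero
probit union tenure c.tenure#ib80.year grade, vce(cluster idcode)
```

The first two commands should report the same fit, while the third is a more restrictive model because it drops the year indicators.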

If this isn't what you were looking for, please post back showing the output you got from the various commands you have run and explain specifically how they differ from what you are expecting.

Last edited by Clyde Schechter; 30 Apr 2021, 22:30.
Marry Lee

Join Date: Nov 2020

Posts: 189
#3

01 May 2021, 13:39

Thank you so much, Clyde Schechter, for your answer.
In fact, the problem persists even with a double #.
When I use the following command:

Code:

areg High_Q_S TCZ c.TCZ#ib1997.year_birth X, absorb(coun) cluster(coun)

I get the following:

[graph attached, not shown inline]

But when I use the same regression with probit I get the following:

[graph attached, not shown inline]

So with probit it does not get 1997 as the reference category.

Attached Files

Graph.gph (9.2 KB, 1 view)

Graph2.gph (8.1 KB, 1 view)

Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#4

01 May 2021, 17:33

Well, I can't reproduce your problem using similar code in another data set:

Code:

. clear*

.
. webuse nlswork
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

.
. areg ln_wage tenure c.tenure#ib80.year grade, absorb(idcode) cluster(idcode)
note: grade omitted because of collinearity.

Linear regression, absorbing indicators             Number of obs     = 28,099
Absorbed variable: idcode                           No. of categories =  4,697
                                                    F(15, 4696)       =  75.89
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.6649
                                                    Adj R-squared     = 0.5974
                                                    Root MSE          = 0.3032

                              (Std. err. adjusted for 4,697 clusters in idcode)
-------------------------------------------------------------------------------
              |               Robust
      ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
       tenure |   .0311567   .0018719    16.64   0.000     .0274869    .0348265
              |
year#c.tenure |
          68  |  -.0569011   .0112153    -5.07   0.000    -.0788883   -.0349138
          69  |  -.0212018   .0088658    -2.39   0.017    -.0385831   -.0038206
          70  |  -.0223542   .0060066    -3.72   0.000    -.0341299   -.0105784
          71  |  -.0045988    .004523    -1.02   0.309     -.013466    .0042684
          72  |  -.0010647   .0038513    -0.28   0.782     -.008615    .0064857
          73  |   .0000953    .003106     0.03   0.976    -.0059938    .0061845
          75  |  -.0049824   .0024415    -2.04   0.041    -.0097689   -.0001959
          77  |   .0031062   .0017526     1.77   0.076    -.0003297    .0065421
          78  |   .0031835    .001553     2.05   0.040      .000139    .0062281
          82  |   .0008954   .0013549     0.66   0.509    -.0017609    .0035518
          83  |   .0000481   .0014835     0.03   0.974    -.0028604    .0029565
          85  |   .0028007   .0015085     1.86   0.063    -.0001568    .0057581
          87  |   .0026366   .0015903     1.66   0.097    -.0004812    .0057544
          88  |   .0028941    .001716     1.69   0.092    -.0004701    .0062583
              |
        grade |          0  (omitted)
        _cons |    1.58002   .0045884   344.35   0.000     1.571025    1.589016
-------------------------------------------------------------------------------

.
. probit union tenure c.tenure#ib80.year grade, cluster(idcode)

Iteration 0:   log pseudolikelihood = -10367.749  
Iteration 1:   log pseudolikelihood = -10079.604  
Iteration 2:   log pseudolikelihood = -10079.198  
Iteration 3:   log pseudolikelihood = -10079.198  

Probit regression                                       Number of obs = 19,008
                                                        Wald chi2(13) = 213.10
                                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -10079.198                       Pseudo R2     = 0.0278

                              (Std. err. adjusted for 4,132 clusters in idcode)
-------------------------------------------------------------------------------
              |               Robust
        union | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
--------------+----------------------------------------------------------------
       tenure |   .0847883   .0073346    11.56   0.000     .0704128    .0991638
              |
year#c.tenure |
          70  |   .0047829   .0242844     0.20   0.844    -.0428136    .0523793
          71  |   .0178683    .018449     0.97   0.333    -.0182911    .0540277
          72  |   .0067912   .0159589     0.43   0.670    -.0244878    .0380701
          73  |      .0087   .0136836     0.64   0.525    -.0181193    .0355193
          77  |  -.0235311   .0074809    -3.15   0.002    -.0381934   -.0088688
          78  |   -.009677   .0066557    -1.45   0.146    -.0227219    .0033679
          82  |  -.0178203   .0054133    -3.29   0.001    -.0284302   -.0072104
          83  |  -.0363091   .0058035    -6.26   0.000    -.0476839   -.0249344
          85  |  -.0310086   .0060868    -5.09   0.000    -.0429385   -.0190787
          87  |  -.0413155   .0063034    -6.55   0.000      -.05367    -.028961
          88  |  -.0491349   .0064045    -7.67   0.000    -.0616875   -.0365823
              |
        grade |   .0250453   .0084425     2.97   0.003     .0084984    .0415923
        _cons |  -1.294242    .111488   -11.61   0.000    -1.512754   -1.075729
-------------------------------------------------------------------------------

As you can see, both regressions have 80 as the omitted category for year.

I do notice one thing. Your graph for the probit is not a graph of coefficients, it is a graph of marginal effects. Those are not the same thing in non-linear models, and a coefficient of zero does not imply a marginal effect of zero. For example, following the above, we have:

Code:

. margins year, dydx(tenure)

Average marginal effects                                Number of obs = 19,008
Model VCE: Robust

Expression: Pr(union), predict()
dy/dx wrt:  tenure

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
tenure       |
        year |
         70  |   .0277985   .0084152     3.30   0.001      .011305     .044292
         71  |   .0321629   .0062926     5.11   0.000     .0198297    .0444961
         72  |   .0284727   .0056016     5.08   0.000     .0174938    .0394516
         73  |   .0291122    .004872     5.98   0.000     .0195633    .0386612
         77  |   .0182665    .002925     6.24   0.000     .0125335    .0239994
         78  |   .0229218   .0027032     8.48   0.000     .0176236    .0282201
         80  |   .0261885   .0023131    11.32   0.000      .021655    .0307221
         82  |   .0201782    .001863    10.83   0.000     .0165269    .0238295
         83  |   .0140602   .0017112     8.22   0.000     .0107063    .0174142
         85  |   .0157902   .0014468    10.91   0.000     .0129546    .0186258
         87  |   .0124508   .0012593     9.89   0.000     .0099826    .0149191
         88  |   .0099944   .0011382     8.78   0.000     .0077636    .0122253
------------------------------------------------------------------------------

Notice that the marginal effect in year 80 is not zero, even though the coefficient was.

The coefficients of the interaction terms can be normalized with any base year you desire, because the difference will be made up in the coefficient of tenure (your TCZ) by itself. Choose a different base year and all of the coefficients will shift around to accommodate. But the marginal effects will be the same regardless of the base year, because they represent the actual slopes of the outcome:TCZ relationship in the corresponding years. Those are real results, not artifacts of how you choose to parameterize the year variable.
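To see why the year-80 marginal effect is nonzero, recall that in a probit the slope with respect to tenure is normalden(xb) times (the tenure coefficient plus that year's interaction coefficient). In the base year the interaction coefficient is zero, but the tenure coefficient is not. The sketch below reproduces the year-80 row by hand; the variable names xb80 and me80 are made up for this illustration, and the calculation assumes the same model as above, in which year enters only through the interaction.

```stata
webuse nlswork, clear
probit union tenure c.tenure#ib80.year grade, vce(cluster idcode)
margins year, dydx(tenure)

* Linear index with year set to its base level (80) for every observation:
* the interaction term drops out, leaving only tenure, grade, and the constant
generate double xb80 = _b[_cons] + _b[tenure]*tenure + _b[grade]*grade if e(sample)

* Probit slope with respect to tenure at year 80: density times coefficient
generate double me80 = normalden(xb80) * _b[tenure]

* The mean should agree with the year-80 row of -margins- up to rounding
summarize me80
```

Changing the base year changes _b[tenure] and the interaction coefficients together, so the product computed here should be unaffected.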

Last edited by Clyde Schechter; 01 May 2021, 17:39.


Marry Lee

Join Date: Nov 2020

Posts: 189
#5

01 May 2021, 22:57

Clyde Schechter, you are great, thank you so much. That's what I was missing: the fact that with probit I am plotting the margins. I was doing that because I thought I had to use the margins for the parallel trend testing, but maybe not necessarily, since it is enough to have the sign and significancy of the coefficients and not their exact values, right?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#6

02 May 2021, 11:16

it is enough to have the sign and significancy of the coefficients and not their exact values, right?

Well, I don't agree with that. I don't think the significance of the coefficients is relevant to this. If I looked at the p-values at all (which I probably wouldn't, but most people would), I would not consider for an instant whether they are < 0.05 (or any other threshold). Statistical significance is a very noisy measure, and it is sensitive to differences in the effective sample sizes behind the coefficients in different groups or treatment statuses.

An important principle that should always be borne in mind when working with p-values is that the difference between statistically significant and not statistically significant is, itself, not statistically significant. You should never draw any conclusions about the similarity or dissimilarity of any two things based on the concordance (or not) of their statistical significance; there is no context in which that method of inference is sound.

What matters is that the coefficients have the same sign and be of similar magnitude--just how similar is a matter for judgment and depends on the context. For my part, in these contexts, I am more inclined to graph the expected outcomes in each group over time and make a visual judgment as to whether the trends appear reasonably parallel.
Marry Lee

Join Date: Nov 2020

Posts: 189
#7

02 May 2021, 12:21

Clyde Schechter thank you again for your comments. So, in this situation, you think I have to just plot the means over time without really using the regression, is that what you are saying?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#8

02 May 2021, 12:24

That is what I would do.
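For readers who want to try a plot of raw group means over time, here is a minimal sketch on the nlswork data. It assumes collgrad as a stand-in for the treatment indicator and union as the binary outcome; neither is the original poster's variable. The /// line continuations require running it from a do-file.

```stata
webuse nlswork, clear
keep if !missing(union, collgrad)

* Raw (unadjusted) mean outcome in each group-by-year cell
collapse (mean) union, by(collgrad year)

* One connected line per group
twoway (connected union year if collgrad == 0) ///
       (connected union year if collgrad == 1), ///
       legend(order(1 "collgrad = 0" 2 "collgrad = 1")) ///
       ytitle("Mean of union") xtitle("Year")
```

Whether the two lines look reasonably parallel before the policy date is then a visual judgment, as described in #6.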
Marry Lee

Join Date: Nov 2020

Posts: 189
#9

07 May 2021, 04:12

Dear Clyde Schechter,
I am sorry to be back with the same question, but on re-reading what you wrote in #6, I noticed that you said

I am more inclined to graph the expected outcomes in each group over time and make a visual judgment as to whether the trends appear reasonably parallel.

So you said to plot expected outcomes, not means. I think they are not the same, are they?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#10

07 May 2021, 09:24

Sorry for not being clearer in my language. It actually depends. In your model you are adjusting for covariates, so they are not the same: I am referring to the expected outcomes predicted by the model--which would properly be called adjusted means. If, however, you did not adjust for anything in the model, then the expected values would be the same as the means of the observed outcomes. So my approach would be

Code:

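probit Y i.TCZ##i.year_birth X
* predictive margins for each group and birth year, pre-reform cohorts only (reform year is 1998, per #1)
margins TCZ#year_birth if year_birth < 1998
* put birth year on the horizontal axis, one line per TCZ group
marginsplot, xdimension(year_birth)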

I don't know why I didn't notice before, but until now in this thread we have been using c.TCZ in the modeling, even though you state clearly in #1 that TCZ is a dichotomous variable. While for the probit regression itself there is no harm done treating the dichotomy as if it were continuous, for -margins, dydx(TCZ)- that will lead to incorrect answers. So you should definitely be using i.TCZ throughout, since those marginal effects are your key results of the analysis. (For the parallel trends work we are currently discussing, it again makes no difference, but it is best to be consistent.)
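The c. versus i. distinction can be illustrated on the nlswork data. This is a sketch under the assumption that collgrad (a 0/1 variable) plays the role of TCZ; it is not the original poster's model.

```stata
webuse nlswork, clear

* Dichotomy treated as continuous: dydx() reports a derivative
probit union c.collgrad##ib80.year grade, vce(cluster idcode)
margins year, dydx(collgrad)

* Dichotomy treated as a factor: dydx() reports the discrete change
* in Pr(union) when collgrad goes from 0 to 1
probit union i.collgrad##ib80.year grade, vce(cluster idcode)
margins year, dydx(collgrad)
```

The probit coefficient tables should be the same in both fits, but the two -margins- tables can differ, because a derivative at a point is not the same quantity as a difference between two predicted probabilities.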
Marry Lee

Join Date: Nov 2020

Posts: 189
#11

07 May 2021, 14:13

Thank you again Clyde Schechter. Your remark about the c.TCZ is really important.
Your code works perfectly. It gives two lines for each group and it's about the predictive margins.

I figured out that the first code I suggested, which gives one line for the average marginal effects, would give me the graph with the coefficient for 1997 as 0 (the reference year), if I use only one # instead of 2 # as follows:

Code:

probit Y i.TCZ#ib1997.year_birth X `provinceXyearFE' i.coun, cluster(coun)
margins year_birth, dydx(TCZ) noestimcheck post
marginsplot, yline(0)

Please note that I also include provinceXyear dummies and county dummies to create fixed effects.

If this code is right, which of the 2 graphs would be more reliable to use?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#12

07 May 2021, 14:55

If this code is right, which of the 2 graphs would be more reliable to use?

In #1 you say that TCZ is 1 in a "region" where the reform is implemented. If "region" refers to country or groups of countries, then the # and ## models will be equivalent. That's because the year_birth indicators will be omitted by virtue of collinearity with the provinceXyear effects, and TCZ will be collinear with the country indicators. But if a "region" is a sub-country unit, then the models are different, and you ought to use the ## version, for reasons explained in #2.
Marry Lee

Join Date: Nov 2020

Posts: 189
#13

07 May 2021, 16:42

Dear Clyde Schechter,
In fact, the country is China. China has provinces, which are divided into counties.
The reform is implemented at the county level (so by region I meant county). I have different counties, some of which have the reform while others do not. So I think the models with one # or two # are the same, right?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#14

07 May 2021, 16:53

Right.

And I'm sorry I misread county as country--I make that mistake a lot!
Saharnaz Babaei

Join Date: Apr 2019

Posts: 11
#15

30 Mar 2023, 20:04

Hello, thank you for this thread and all your responses. How could this analysis be done for a case where there is variation in the timing of treatment? I need to estimate a DiD using pooled data (4 waves of a survey, with different individuals in each wave). The policy occurs at different times in different states. Observations are at the teacher level. How can I do a pre-trend test?