Probit Difference in Difference Model

Abu Tauheed

Join Date: Feb 2017

Posts: 17
#1

06 Feb 2017, 01:13

Hi All,

I want to run a probit Diff-in-Diff model with explanatory variables. I have gone through the DiD literature, especially Athey and Imbens (2006) and Puhani (2008).

I have a discrete outcome with discrete and continuous explanatory variables. The aim of the model is to estimate the interaction term of treatment and time in order to identify the treatment effect. As I intend to add covariates to the model, will that raise any interpretation or inference issues? I would also be grateful if anyone could provide a Stata command for probit DiD with panel data.

Regards
Tauheed
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

06 Feb 2017, 10:07

You do not say whether you have panel (longitudinal) data or not. It makes a difference. You also do not say whether your time variable is a simple pre-post dichotomy or whether you have multiple time periods before and after. Anyway, let me assume that you do not have panel data and that your time variable either is a simple pre-post dichotomy or has been reduced to one. Then the basic code is

Code:

probit outcome i.treatment##i.pre_post /*INSERT COVARIATES HERE*/

You will probably also want to look at the model-predicted probabilities of the outcomes in both groups in both eras, and the group differences in both eras and time-related changes in both groups.

Code:

margins treatment#pre_post
margins treatment, dydx(pre_post)
margins pre_post, dydx(treatment)

The use of covariates in the model will not alter the interpretation of the coefficient of the interaction term as the DID estimator of the treatment effect. It will simply mean that your estimate is adjusted for your covariates.
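For anyone who wants to try the commands above without their own data, here is a minimal sketch on simulated data. The data-generating process, the seed, the sample size and the covariate x are all assumptions made up for illustration; only the variable names outcome, treatment and pre_post follow the code above.

```stata
* Simulated repeated cross-section; every number here is an arbitrary assumption
clear
set seed 2017
set obs 4000
gen byte treatment = runiform() < 0.5
gen byte pre_post  = runiform() < 0.5
gen double x = rnormal()                      // hypothetical covariate
gen byte outcome = (-0.5 + 0.3*treatment + 0.2*pre_post ///
    + 0.4*treatment*pre_post + 0.5*x + rnormal()) > 0

probit outcome i.treatment##i.pre_post x, nolog

* Predicted probabilities in the four group-by-era cells
margins treatment#pre_post

* r(b) orders the cells as (0,0) (0,1) (1,0) (1,1), treatment first
matrix P = r(b)
scalar did = (P[1,4] - P[1,3]) - (P[1,2] - P[1,1])
display "DID on the probability scale: " did
```

The scalar did is the difference in differences of the covariate-adjusted predicted probabilities, computed by hand from the four cells.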
1 like
Abu Tauheed

Join Date: Feb 2017

Posts: 17
#3

03 May 2017, 01:40

Dear Sir,

Thank you for your reply.

I have panel data with two time periods. As a preliminary exercise, I defined the post and treatment variables and fitted a logit model. I used the following code:

xtlogit (outcome variable) i.treat i.post i.treat#i.post

Stata displays a note saying that my treatment variable is omitted because of collinearity, and the coefficient of interest (the interaction term) is not shown.

However, I will try specifying the interaction term with ##, as you suggested.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

03 May 2017, 08:13

In a fixed-effects model, it is entirely expected that the treatment variable will be omitted due to collinearity. But the failure of the interaction term to appear suggests that there is something wrong. I don't expect that changing to the ## notation will fix that problem: i.treat##i.post is just translated into i.treat i.post i.treat#i.post.

So assuming you still have this problem, I suggest you post a sample of your data and show the exact commands you are using (including, at a minimum, your -xtset- command and the -xtlogit- command) and the output you got from Stata. Be sure to read FAQ #12 for instructions on the helpful way to post example data (the -dataex- command) and commands/output (code delimiters).
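The expected behavior described above can be checked on a simulated two-period panel. This is only a sketch: it assumes the fe option was used (the command shown in #3 does not say), and the data-generating process, seed and variable names id, treat, post, u and y are made up for illustration.

```stata
* Simulated balanced two-period panel; all numbers are arbitrary assumptions
clear
set seed 12345
set obs 500
gen long id = _n
gen byte treat = mod(id, 2)          // time-invariant treatment group
gen double u = rnormal()             // individual effect
expand 2
bysort id: gen byte post = _n - 1    // 0 = before, 1 = after
gen byte y = (-0.2 + 0.3*treat + 0.2*post + 0.6*treat*post ///
    + 0.5*u + rlogistic()) > 0

xtset id post
* 1.treat should be dropped (no within-group variation),
* while 1.treat#1.post should still be estimated
xtlogit y i.treat##i.post, fe nolog
```

If the interaction term is also missing in real data, that points to something else, for example treatment not varying with post in the way the design assumes.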
Abu Tauheed

Join Date: Feb 2017

Posts: 17
#5

19 May 2017, 03:20

Dear All,

I have a panel data set for two time periods; the dependent variable and most of the independent variables are binary. The aim of my work is to find the treatment effect. I have created two data sets: one is a simple balanced panel, and the second is also a balanced panel, but in it I have matched the treated individuals with comparison groups using the propensity score method.
I am trying to run two models:
1. First, an "xtlogit" model with interaction terms, to calculate the treatment effect in a non-linear model.
2. Second, "diff", which is a user-written command from Juan M. Villa.

I have the following questions:

1. When I use the xtlogit command with an interaction term for the treatment effect along with covariates, the results take longer than usual to appear. Moreover, after reading Puhani 2013 and Norton (2004), which discuss the interpretation of the interaction term in non-linear models, the interpretation becomes complicated.

2. The results from "diff", i.e. "diff (outcome Variable) t(Treatment Variable) p(Post) cov(variable lists)", are easy to interpret, but the command seems to be linear, which makes my model more like a "Linear Probability Model".

Would it be correct to use the "diff" command, advocated by Juan M. Villa, for binary outcome variables in my analysis?

A quick reply will be appreciated.
Fatma Ali

Join Date: Nov 2017

Posts: 2
#6

17 Nov 2017, 10:52

Greetings Dr. Schechter,

I have a small question regarding your code below. How can I obtain the standard errors and the confidence interval of the treatment effect (the DID)? I know that I can calculate the DID from the margins commands below, but I am not sure how to obtain its s.e.

Code:
probit outcome i.treatment##i.pre_post /*INSERT COVARIATES HERE*/
margins treatment#pre_post
margins treatment, dydx(pre_post)
margins pre_post, dydx(treatment)

Many thanks,
Fatma

Last edited by Fatma Ali; 17 Nov 2017, 11:03.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#7

17 Nov 2017, 11:04

Code:

margins pre_post, dydx(treatment) pwcompare(effects)
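A sketch of that command on simulated data, alongside an alternative that uses contrast operators, may help. The data-generating process and seed are assumptions for illustration only; as far as I can tell, both commands should report the same DID point estimate on the probability scale together with a standard error and confidence interval.

```stata
* Simulated data; all numbers are arbitrary assumptions
clear
set seed 2017
set obs 4000
gen byte treatment = runiform() < 0.5
gen byte pre_post  = runiform() < 0.5
gen byte outcome = (-0.5 + 0.3*treatment + 0.2*pre_post ///
    + 0.4*treatment*pre_post + rnormal()) > 0

probit outcome i.treatment##i.pre_post, nolog

* Effect of treatment in each era, then the pairwise comparison between eras
margins pre_post, dydx(treatment) pwcompare(effects)

* Alternative: contrast operators applied to the predictive margins
margins r.treatment#r.pre_post, contrast(nowald effects)
```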
Dimitriy V. Masterov

Join Date: Mar 2014

Posts: 609
#8

17 Nov 2017, 11:12

Clyde Schechter,

How do you square this advice with the argument in Puhani's paper that the cross-difference calculated by margins is not the DID parameter of interest in non-linear models? It's quite possible that I misunderstood something in that paper or the argument there is incorrect. Would you be able to comment on that or point to a counter-argument?

Here's the link: The treatment effect, the cross difference, and the interaction term in nonlinear “difference-in-differences” models PA Puhani - Economics Letters, 2012
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#9

17 Nov 2017, 11:34

I have not read that particular paper (it's behind a paywall). I have seen others like it, however, and I simply don't work in the framework discussed there. I view these analyses as models of the probability of the outcome of interest. From a decision making and policy perspective, which is the one I work from, it is the difference in differences of outcome probabilities that is relevant. The regression coefficients are just an indirect way to get to those outcomes.

In any case, it is hard for me to understand how anybody can say that the coefficient of a probit regression in particular is of any use for this purpose. It is essentially uninterpretable in any way that a human brain can wrap itself around. It may be of value for null hypothesis testing, but, as I say, I'm not in the null hypothesis testing business when I do this kind of thing.

The key question is what is meant by "of interest." Puhani and I are interested in different things.
2 likes
Dimitriy V. Masterov

Join Date: Mar 2014

Posts: 609
#10

17 Nov 2017, 11:59

Clyde Schechter,

Thanks for a quick response. Here's a link to a WP version that should be available to everyone. I don't think Puhani is saying that the index function coefficient is the quantity to consider. I believe he's saying that you should calculate the treatment effect like the second bit of code, and not the first.

Code:

use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear
gen fte_gt_15 = cond(fte>15,1,0)

/* Probit Cross-Difference DID (equation 14 in EL paper) */
probit fte_gt_15 i.treated##i.t bk kfc roys , nolog
margins treated, dydx(t) pwcompare

// same as above, but less clear
gen tg = treated*t
probit fte_gt_15 i.(treated t tg) bk kfc roys, nolog

/* Puhani's DID Estimator (equation 10 in EL paper) */
margins, at(treated==1 t==1 tg ==1) at(treated==1 t==1 tg ==0) contrast(atcontrast(a._at) wald)

I think he also works in the decision making/policy perspective, so I don't think this is a purely scholastic argument akin to the cephalophore controversy. In the Card and Krueger data, the difference is slight, but I have seen the choice occasionally matter in my own work. I would love to know which way is correct (especially since the latter is annoying to calculate with margins).
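To make the two quantities concrete, here is a sketch of what the two margins commands compute, written in notation I am assuming for this post rather than copying from the paper: G = 1 for the treated group, T = 1 for the post period, and Xθ collecting the constant and the covariates.

```latex
% Probit model with an interaction term (assumed notation)
P(Y = 1 \mid G, T, X) = \Phi(\alpha G + \beta T + \gamma\, G T + X\theta)

% Cross difference: change over time for the treated minus change over time for the controls
\Delta_{\mathrm{cross}} =
  \bigl[\Phi(\alpha + \beta + \gamma + X\theta) - \Phi(\alpha + X\theta)\bigr]
  - \bigl[\Phi(\beta + X\theta) - \Phi(X\theta)\bigr]

% Second command: treated group in the post period, interaction switched on versus off
\tau = \Phi(\alpha + \beta + \gamma + X\theta) - \Phi(\alpha + \beta + X\theta)
```

In a linear model both expressions reduce to γ. With a probit link they generally differ, and because Φ is increasing, τ always has the same sign as γ, whereas the cross difference need not. In both cases margins averages the expression over the covariate values in the sample.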
2 likes
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#11

17 Nov 2017, 12:57

Thanks for that link and the example code.

What remains the case is that Puhani and I are interested in different things.

I am interested in what he defines in equation 1 of his paper. (Sorry that doesn't copy/paste very well here. I tried, but what comes out is worthless.) The thrust of his paper is that in a non-linear model, the expression on the right side of equation 1 is not the right definition of treatment effect. In his words, "Equation 1 cannot hold" in a model where the response variable is bounded if we assume that the treatment effect is constant across the population.

In my work, constancy of the treatment effect across the population is not a necessary assumption anyway. When you are considering the impact of an intervention/policy, you want to know if the expected utility of adopting the policy will be greater than the expected utility of not adopting it. The utility depends on precisely the expression on the right hand side of equation 1, whether that is considered "the treatment effect" or not. It has never been clear to me why a decision maker would care about any other definition of "treatment effect." If you have a model, linear or otherwise, that is reasonably well fit to the data and if you believe it can be generalized to the setting, population and time at which you are considering introducing your intervention or policy, rational decision making would require you base your decision to adopt or not precisely on the right hand side of equation 1, regardless of how it relates to abstractly defined treatment effect parameters.

As you note, in the example you give, the difference between the two estimators is quite small. I am interested in your observation that you have seen examples in your work where the difference matters, as I have never seen that in my own work when I have bothered to look at it.

In any case, my bottom line is that when you approach it from a decision theoretic, expected utility based viewpoint, the right hand side of equation (1) is the statistic of interest, and the right hand side of equation (10), which he considers to be the treatment effect, is not of interest.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#12

19 Nov 2017, 12:59

It dawns on me, that about 6 weeks ago, this question came up in another thread, and Joseph Coveney and I engaged in a dialog about this. It starts at #23 in https://www.statalist.org/forums/for...it-model/page2.
Fatma Ali

Join Date: Nov 2017

Posts: 2
#13

27 Nov 2017, 11:54

Clyde Schechter,

Thank you, Dr. Schechter, for your response. That was really helpful. My apologies for my late reply. I am new to this list and I am still learning its techniques.
Many thanks,
Fatma
River Huang

Join Date: Mar 2016

Posts: 1908
#14

05 Oct 2018, 03:04

Dear Dimitriy, do you happen to know how to estimate the treatment effect in a standard DID setup when the dependent variable is censored?

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Dimitriy V. Masterov

Join Date: Mar 2014

Posts: 609
#15

08 Oct 2018, 10:32

River Huang, D-in-D is really just the difference between 4 means, so any model that estimates the expected value can be turned into a D-in-D estimator. There's a post with a toy simulation on this here.
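As one possible illustration of that idea (not the linked simulation), here is a minimal sketch that assumes "censored" means left-censored at zero and that a tobit model is appropriate. The data-generating process, the seed and the variable names treat, post, latent and y are all made up.

```stata
* Simulated outcome, left-censored at zero; all numbers are arbitrary assumptions
clear
set seed 2018
set obs 4000
gen byte treat = runiform() < 0.5
gen byte post  = runiform() < 0.5
gen double latent = 1 + 0.5*treat + 0.3*post + 0.8*treat*post + 2*rnormal()
gen double y = max(latent, 0)

tobit y i.treat##i.post, ll(0)

* DID of the four expected values of the censored outcome, E[max(y*, 0)]
margins r.treat#r.post, predict(ystar(0,.)) contrast(nowald effects)
```

Leaving out the predict() option would instead give the DID in the linear prediction for the latent variable, so the choice of expected value depends on which outcome is of interest.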