Difference in difference (logit model)

Neeraj Kumar

Join Date: Jul 2017

Posts: 98
#1

08 Sep 2017, 20:43

Hello,
I am using a difference-in-differences method with logit regression on two-year panel data. I am running the following commands, with and without the odds ratio option. I have gone through some papers that apply difference in differences to a nonlinear (binary) dependent variable, but I did not find much information about odds ratios. Some studies even mention that odds ratios are very complicated to explain, so they just interpret the coefficients of the logit model.

1.

Code:

xtlogit loan i.mgnregadmy##i.time other covariates, or

2.

Code:

xtlogit loan i.mgnregadmy##i.time other covariates

I am also using the following margins commands:
3.

Code:

margins mgnregadmy#time

4.

Code:

margins mgnregadmy, dydx(time)

Of commands 1 and 2, which one is suitable? And how do we interpret the interaction term?
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#2

08 Sep 2017, 21:28

If it were me, I would interpret the logistic regression fit in terms of the linear predictor. Would it be any easier (more familiar-looking) if you modeled the phenomenon in terms of risk differences?

Code:

xtset <whatever>
xtgee loan i.mgnregadmy##i.time <other covariates>, family(binomial) link(identity)
Neeraj Kumar

Join Date: Jul 2017

Posts: 98
#3

09 Sep 2017, 04:05

Thanks for your prompt reply, but when I ran the command from #2,

Code:

xtgee Loan20 i.mgnregadmy##i.time RO5 ca2 ca3 education1 NPERSONS COPC, family(binomial) link(identity)

it showed the following error after displaying the results:
convergence not achieved
r(430);
What is the difference between the xtlogit command and the xtgee command?
River Huang

Join Date: Mar 2016

Posts: 1908
#4

09 Sep 2017, 04:49

Can we really (directly) apply -xtlogit- or -xtgee- to the analysis of difference-in-differences with binary/discrete outcomes?

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#5

09 Sep 2017, 04:51

If it doesn't converge, then it looks like you're stuck with xtlogit, followed by margins, post and then something like lincom for the risk difference.
Neeraj Kumar

Join Date: Jul 2017

Posts: 98
#6

09 Sep 2017, 07:17

Thanks for your reply. I want to know how to interpret the interaction term in the xtlogit model, and whether the margins results are interpreted the same way as in a simple regression with a continuous dependent variable. Should I go with xtlogit with the odds ratio option (or) or with plain xtlogit, that is, command 1 or command 2 in #1?
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#7

09 Sep 2017, 17:16

Originally posted by River Huang

Can we really (directly) apply -xtlogit- or -xtgee- to the analysis of difference-in-differences with binary/discrete outcomes?

Why do you ask? Are you referring to the noncollapsibility of the odds ratio or something?
River Huang

Join Date: Mar 2016

Posts: 1908
#8

09 Sep 2017, 18:39

My question is: do we need additional assumptions and/or alternative estimation approaches for such a "nonlinear" DID model? Please see http://onlinelibrary.wiley.com/doi/1...668.x/abstract for a discussion of "discrete" outcomes.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#9

09 Sep 2017, 19:40

It's behind a paywall; are you referring to assumptions that have to be entertained when attempting causal inference from an observational study? The OP's stated question was about whether one uses the exponentiated coefficient in assessing an interaction term in a logistic regression model.
River Huang

Join Date: Mar 2016

Posts: 1908
#10

09 Sep 2017, 21:40

Dear Joseph: I am aware that the OP's stated question was about whether one uses the exponentiated coefficient in assessing an interaction term in a logistic regression model. My question is a little different (but related). In the (common) case of continuous outcomes, the coefficient on the interaction term in the DID model measures the treatment effect of interest. In contrast, this seems not to be the case if we have binary/discrete outcomes. So I wonder if it makes any sense to use the -margins- command for measuring the effect of the interaction term (whether it is an odds ratio or not). Please see Puhani, P. A. (2012), "The treatment effect, the cross difference, and the interaction term in nonlinear difference-in-differences models." Economics Letters, 115, 85-87.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
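
The distinction raised in this post can be written out for a logit model. The notation is an assumption made here for illustration and is not taken from the thread or from the cited paper: $D$ is the treatment-group indicator, $T$ the post-period indicator, $x$ the other covariates, $\Lambda(z) = 1/(1+e^{-z})$ the logistic function, and $\eta(D,T) = \beta_0 + \beta_1 D + \beta_2 T + \beta_3 D T + x'\gamma$ the linear predictor.

```latex
\begin{align*}
\Pr(y = 1 \mid D, T, x) &= \Lambda\bigl(\eta(D,T)\bigr) \\[4pt]
\text{log-odds scale:}\quad
  &\bigl[\eta(1,1) - \eta(1,0)\bigr] - \bigl[\eta(0,1) - \eta(0,0)\bigr]
   = (\beta_2 + \beta_3) - \beta_2 = \beta_3 \\[4pt]
\text{probability scale:}\quad
  &\bigl[\Lambda(\beta_0 + \beta_1 + \beta_2 + \beta_3 + x'\gamma)
       - \Lambda(\beta_0 + \beta_1 + x'\gamma)\bigr] \\
  &\quad - \bigl[\Lambda(\beta_0 + \beta_2 + x'\gamma)
       - \Lambda(\beta_0 + x'\gamma)\bigr]
\end{align*}
```

On the log-odds scale the double difference is exactly the interaction coefficient. On the probability scale it depends on every coefficient and on $x$, so it generally differs from $\beta_3$ and varies across observations.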
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#11

09 Sep 2017, 22:01

You can do this:

Code:

logit outcome i.treatment##i.pre_post other_covariates
margins treatment, dydx(pre_post) pwcompare

The contrast between the marginal effect of pre_post in the treatment and control groups is the average treatment effect in the probability metric: it is the difference in differences of the outcome probabilities. If a decision maker wants to evaluate a policy, and if the number or proportion of people (firms, entities, whatever) in the entire population that experience a positive outcome is a suitable utility metric, then it is this difference-in-differences that he or she would be interested in.

Of course, like any other effect in a non-linear model (it is the non-linearity of the model that is relevant here, not the discreteness of the outcome) the average marginal effect may not be a particularly useful statistic, depending on what your goals are. You may have greater need of marginal effects at particular levels of the covariates. But that can be accommodated as well by just adding an appropriate -at()- option to the -margins- command.
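
To make the averaging concrete, here is a minimal Python sketch of the quantity described above: the average discrete change in the predicted probability when pre_post goes from 0 to 1, computed once with everyone set to treated and once with everyone set to control, and then contrasted. It is an illustration only, not what -margins- runs internally; the coefficients, the single covariate `x`, and the five-observation sample are all hypothetical.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_prob(b, treat, post, x):
    """Pr(outcome = 1) for given treatment and period values and one covariate x."""
    eta = (b["cons"] + b["treat"] * treat + b["post"] * post
           + b["treat_post"] * treat * post + b["x"] * x)
    return logistic(eta)

def avg_effect_of_post(b, treat, xs):
    """Average discrete change in Pr(outcome) when post goes 0 -> 1,
    with treatment fixed at `treat` for every observation."""
    changes = [predicted_prob(b, treat, 1, x) - predicted_prob(b, treat, 0, x)
               for x in xs]
    return sum(changes) / len(changes)

# Hypothetical log-odds coefficients (not taken from any fitted model).
b = {"cons": -2.0, "treat": -0.4, "post": 0.3, "treat_post": 0.1, "x": 0.7}
# Hypothetical covariate values for a tiny sample.
xs = [0.0, 1.0, 1.0, 0.0, 1.0]

effect_treated = avg_effect_of_post(b, 1, xs)
effect_control = avg_effect_of_post(b, 0, xs)
# Difference in differences on the probability scale.
did_probability = effect_treated - effect_control

print(f"average effect of post, treated: {effect_treated:.4f}")
print(f"average effect of post, control: {effect_control:.4f}")
print(f"difference in differences:       {did_probability:.4f}")
```

Replacing the list `xs` with a single chosen value mimics evaluating the effect at a particular covariate level rather than averaging over the sample.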
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#12

09 Sep 2017, 23:57

Originally posted by River Huang

I wonder if it makes any sense to use the -margins- command for measuring the effect of the interaction term (whether it is an odds ratio or not).

It's when you try to back away from interpreting the logistic model in terms of the metric in which it's fit, seeking the comfort of the familiar (proportions), that you run smack into the problem of nonlinearity and its complexity of interpretation. That's why I said that if it were up to me, I would interpret the logistic regression fit in terms of the linear predictor.
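
A small Python sketch of what interpreting the fit "in terms of the linear predictor" amounts to: on the log-odds scale the difference in differences across the four treatment-by-time cells equals the interaction coefficient, and exponentiating it gives a ratio of odds ratios, which is the figure the `or` option reports for that term. The coefficient values below are hypothetical and the covariates are held fixed, so they cancel out of the differences.

```python
import math

# Hypothetical log-odds coefficients (illustrative only, not from a fitted model).
b_cons, b_treat, b_post, b_treat_post = -2.0, -0.4, 0.3, 0.1

def log_odds(treat, post):
    """Linear predictor for one treatment-by-time cell (covariates held fixed)."""
    return b_cons + b_treat * treat + b_post * post + b_treat_post * treat * post

def odds(treat, post):
    """Odds of the outcome in one cell."""
    return math.exp(log_odds(treat, post))

# Difference in differences on the log-odds scale.
did_log_odds = ((log_odds(1, 1) - log_odds(1, 0))
                - (log_odds(0, 1) - log_odds(0, 0)))

# Odds ratio for time within each group, then their ratio.
or_treated = odds(1, 1) / odds(1, 0)
or_control = odds(0, 1) / odds(0, 0)
ratio_of_odds_ratios = or_treated / or_control

print(f"DiD in log odds:      {did_log_odds:.4f}")
print(f"ratio of odds ratios: {ratio_of_odds_ratios:.4f}")
```

Because the exponentiated coefficient is a one-to-one transformation of the coefficient, fitting with or without `or` gives the same model; only the displayed scale changes.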

Neeraj Kumar

Join Date: Jul 2017
Posts: 98

#13

10 Sep 2017, 13:28

Thank you so much for the suggestions. I ran the commands mentioned in #11 and got the results below. I just want to be sure whether my interpretation is correct. The interaction term shows that the probability of getting a formal loan (Loan20: 1 = formal loan, 0 = informal loan) after treatment (time = 1) for the treatment group (mgnregadmy = 1) is 11.95 percentage points. The time coefficient is 0.3195, which shows the increase in the probability of getting a formal loan for those who are not participating in the policy program.
The dy/dx estimate is 0.019, which shows the difference in differences of the outcome probabilities (by outcome probabilities I mean the probability for time and the probability for mgnregadmy). Please correct me if I am wrong.

HTML Code:

. logit Loan20 i.mgnregadmy##i.time RO5 ca2 ca3 education1 NPERSONS COPC livestock POOR, vce(robust)

Iteration 0:   log pseudolikelihood = -11031.994
Iteration 1:   log pseudolikelihood = -10193.767
Iteration 2:   log pseudolikelihood = -10186.178
Iteration 3:   log pseudolikelihood = -10186.177
Iteration 4:   log pseudolikelihood = -10186.177

Logistic regression                     Number of obs = 16662
                                        Wald chi2(11) = 1421.00
                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -10186.177       Pseudo R2     = 0.0767

                              Robust
Loan20                Coef.  Std. Err.       z    P>z   [95% Conf. Interval]
1.mgnregadmy      -.3482037   .0542846   -6.41  0.000   -.4545995  -.2418078
1.time             .3482228   .0495039    7.03  0.000     .251197   .4452486
mgnregadmy#time
  1 1              .1016059   .0692459    1.47  0.142   -.0341136   .2373253
RO5                .0213121   .0014136   15.08  0.000    .0185416   .0240827
ca2               -.3404666   .0414663   -8.21  0.000   -.4217392  -.2591941
ca3               -.4354486   .0484396   -8.99  0.000   -.5303884  -.3405088
education1         .7019426   .0365355   19.21  0.000    .6303344   .7735508
NPERSONS           .0439818    .006913    6.36  0.000    .0304326   .0575311
COPC               .0000909   .0000148    6.13  0.000    .0000618     .00012
livestock          .0136975   .0043885    3.12  0.002    .0050962   .0222989
POOR              -.3530032   .0491351   -7.18  0.000   -.4493063  -.2567001
_cons             -2.132127   .0961263  -22.18  0.000   -2.320531  -1.943723

. logit Loan20 i.mgnregadmy##i.time RO5 ca2 ca3 education1 NPERSONS COPC livestock, vce(robust)

Iteration 0:   log pseudolikelihood = -11031.994
Iteration 1:   log pseudolikelihood = -10220.243
Iteration 2:   log pseudolikelihood = -10213.063
Iteration 3:   log pseudolikelihood = -10213.062

Logistic regression                     Number of obs = 16662
                                        Wald chi2(10) = 1358.42
                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -10213.062       Pseudo R2     = 0.0742

                              Robust
Loan20                Coef.  Std. Err.       z    P>z   [95% Conf. Interval]
1.mgnregadmy      -.3807986   .0540135   -7.05  0.000   -.4866631  -.2749341
1.time             .3195498   .0496296    6.44  0.000    .2222777   .4168219
mgnregadmy#time
  1 1              .1195546   .0690527    1.73  0.083   -.0157863   .2548954
RO5                .0217411   .0014097   15.42  0.000    .0189781    .024504
ca2               -.3495559   .0415363   -8.42  0.000   -.4309655  -.2681462
ca3               -.4694733    .048351   -9.71  0.000   -.5642394  -.3747072
education1         .7198696   .0364544   19.75  0.000    .6484203   .7913189
NPERSONS           .0364292     .00683    5.33  0.000    .0230427   .0498158
COPC               .0001138   .0000158    7.19  0.000    .0000828   .0001449
livestock          .0141965   .0044842    3.17  0.002    .0054077   .0229853
_cons             -2.178086   .0960011  -22.69  0.000   -2.366245  -1.989927

. margins mgnregadmy, dydx(time) pwcompare

Pairwise comparisons of average marginal effects
Model VCE    : Robust
Expression   : Pr(Loan20), predict()
dy/dx w.r.t. : 1.time

                   Contrast  Delta-method         Unadjusted
                      dy/dx     Std. Err.    [95% Conf. Interval]
1.time
  mgnregadmy
    1 vs 0         .0194256      .0145997    -.0091893   .0480404

Note: dy/dx for factor levels is the discrete change from the base level.


Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#14

10 Sep 2017, 13:34

Neeraj, you posted your code in an HTML block instead of a code block. It's all strung out in a long line and is, for practical purposes, unreadable. Please repost it properly in a code block so everybody can see what you're talking about.
Neeraj Kumar

Join Date: Jul 2017

Posts: 98
#15

10 Sep 2017, 13:35

Sorry, here are the PNG files of the results that are interpreted in #13.
Attached Files