Control function approach with logit and squared term

Lukas Lang

Join Date: Dec 2016

Posts: 42
#1

09 Jun 2022, 09:29

Hi,

Consider the following model:

yvar = a + b1*x1var + b2*x1var^2 + b3*x2var + b4'*controls + error (eq 1)

As yvar is binary, I estimate (eq 1) by logit.

x1var and x2var are continuous and endogenous independent variables.

I also have two instrumental variables: z1var is an instrument for x1var, and z2var is an instrument for x2var.

Therefore, I would like to implement a control function approach to account for the endogeneity of both x1var (and x1var squared) and x2var.

I have read Wooldridge's textbook and multiple posts here, but I am still having some difficulties.

Basically, I have tried two different control functions to estimate (eq 1) and, unexpectedly, I get very different results.

First, I estimate a plain and simple control function (CF1) as follows:

Code:

* first stage
reg x1var z1var z2var ${controls}
predict resid_1, res
reg x2var z1var z2var ${controls}
predict resid_2, res

* second stage
logit yvar c.x1var##c.x1var x2var ${controls} resid_1 resid_2

* both stages are bootstrapped to get correct standard errors
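A minimal sketch of how the joint bootstrap of both stages could be set up for CF1, run on simulated toy data so that it is self-contained. The program name cf1_boot, the toy data-generating process, the single control c1, and reps(200) are illustrative assumptions and not part of the original post.

```stata
* Toy data only, so the sketch runs on its own (replace with the real data)
clear
set seed 12345
set obs 2000
generate double z1var = rnormal()
generate double z2var = rnormal()
generate double c1    = rnormal()
generate double u     = rnormal()        // unobservable causing endogeneity
generate double x1var = 1 + z1var + 0.5*c1 + u + rnormal()
generate double x2var = z2var + 0.5*c1 + 0.5*u + rnormal()
generate byte yvar = runiform() < ///
    invlogit(0.5*x1var - 0.1*x1var^2 + 0.3*x2var + 0.2*c1 + u)
global controls c1

* Wrapper (hypothetical name) that reruns both stages on each resample
capture program drop cf1_boot
program define cf1_boot, eclass
    * residuals must be rebuilt in every replication
    capture drop resid_1 resid_2
    regress x1var z1var z2var ${controls}
    predict double resid_1, residuals
    regress x2var z1var z2var ${controls}
    predict double resid_2, residuals
    * the last estimation command leaves the second-stage results in e(b)
    logit yvar c.x1var##c.x1var x2var ${controls} resid_1 resid_2
end

* Resample observations and re-estimate both stages each time
bootstrap _b, reps(200) seed(12345): cf1_boot
```

Because the first-stage regressions sit inside the program, the reported standard errors reflect the sampling variation of the residuals as well as of the logit coefficients. The same wrapper could be adapted to CF2 by changing the first-stage lines.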

Then, I try a more flexible control function (CF2) as follows:

Code:

* first stage
gen x1var_2 = x1var^2
reg x1var c.z1var##c.z1var z2var ${controls}
predict resid_1, res
reg x1var_2 c.z1var##c.z1var z2var ${controls}
predict resid_2, res
reg x2var c.z1var##c.z1var z2var ${controls}
predict resid_3, res

* second stage
logit yvar c.x1var##c.x1var x2var ${controls} resid_1 resid_2 resid_3

* both stages are bootstrapped to get correct standard errors

However, the results that I obtain when using CF1 or CF2 are completely different in terms of sign, magnitude and statistical significance.

In principle, I would prefer CF2 as the control function is more flexible.

However, I am uncertain whether there is something wrong with CF2.

Do you see any obvious reason why the two control functions CF1 and CF2 produce completely different results? Which control function would you prefer?

Thanks,

Lukas

Last edited by Lukas Lang; 09 Jun 2022, 09:35. Reason: fixing typos

------
I use Stata 17
FernandoRios

Join Date: Apr 2014

Posts: 2470
#2

09 Jun 2022, 09:48

Hi Lukas,
What exactly are you using to compare the marginal effects of both models? Can you, for example, post the marginal effects of x1var in both models?
Certainly the coefficients will change, but what matters here are the marginal effects.
F
Lukas Lang

Join Date: Dec 2016

Posts: 42
#3

09 Jun 2022, 09:54

Thank you FernandoRios. When I say that results are different I refer exactly to the average marginal effects of x1var. This is how I compute these average marginal effects.

Code:

margins, dydx(x1var) at(x1var=(`min' `mean' `max'))

Min, mean and max are my values of interest for the average marginal effects. Hope it makes sense.
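For illustration, a small sketch of how the three evaluation points could be computed and passed to margins. It uses a simulated toy model with a single regressor; the data-generating process and the local macro names min, mean and max are assumptions made for the example. In the actual application the logit line would be the second stage with the residuals included.

```stata
* Toy data only, so the sketch runs on its own
clear
set seed 2022
set obs 1000
generate double x1var = rnormal(25, 8)
generate byte yvar = runiform() < invlogit(-2 + 0.2*x1var - 0.004*x1var^2)

* The squared term is entered with factor-variable notation
logit yvar c.x1var##c.x1var

* Store the evaluation points in local macros
quietly summarize x1var
local min  = r(min)
local mean = r(mean)
local max  = r(max)

* Each value in the parenthesised list defines a separate at() scenario
margins, dydx(x1var) at(x1var=(`min' `mean' `max'))
```

Since the square is specified as c.x1var##c.x1var rather than as a separately generated variable, margins differentiates through both the linear and the squared term when computing dy/dx.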

FernandoRios

Join Date: Apr 2014

Posts: 2470
#4

09 Jun 2022, 10:54

And can you show the exact numbers you get?

Lukas Lang

Join Date: Dec 2016
Posts: 42
#5

13 Jun 2022, 07:15

Thanks FernandoRios

When using CF1 I obtain these results:

Code:

margins, dydx(la_exp) at(la_exp=0 la_exp=`min' la_exp=`mean' la_exp=`max') post

Average marginal effects                                Number of obs = 39,115
Model VCE: Robust

Expression: Pr(houtcome), predict()
dy/dx wrt:  la_exp
1._at: la_exp = 16.51067
2._at: la_exp = 26.61623
3._at: la_exp = 47.27143

-----------------------------------------------------------------------------------
                  |            Delta-method
                  |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
la_exp            |
              _at |
               1  |  -.0112055   .0032717    -3.42   0.001     -.017618   -.0047931
               2  |  -.0194655   .0053244    -3.66   0.000    -.0299011   -.0090299
               3  |   .0001215   .0075195     0.02   0.987    -.0146164    .0148594
-----------------------------------------------------------------------------------

When I use CF2 I get:

Code:

margins, dydx(la_exp) at(la_exp=0 la_exp=`min' la_exp=`mean' la_exp=`max') post

Average marginal effects                                Number of obs = 39,115
Model VCE: Robust

Expression: Pr(houtcome), predict()
dy/dx wrt:  la_exp
1._at: la_exp = 16.51067
2._at: la_exp = 26.61623
3._at: la_exp = 47.27143

-----------------------------------------------------------------------------------
                  |            Delta-method
                  |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
la_exp            |
              _at |
               1  |  -.0008915   .0011282    -0.79   0.429    -.0031027    .0013197
               2  |  -.0372457   .0043453    -8.57   0.000    -.0457623   -.0287292
               3  |  -.0162769   .0040293    -4.04   0.000    -.0241742   -.0083795
-----------------------------------------------------------------------------------

So, as you can see, results change quite a lot especially when thinking about the interpretation.

While I do not have any prior on the magnitude of the effect, as this is the effect of health care insurance on a measure of health outcome, the negative sign is what I would expect.

What confuses me is that at the minimum and maximum values the results are different but still plausible under both CF1 and CF2.

So, my conclusions about which could be the best model are highly uncertain.

Any thoughts about how else I can assess the validity of these two models?

Last edited by Lukas Lang; 13 Jun 2022, 07:34.


FernandoRios

Join Date: Apr 2014

Posts: 2470
#6

13 Jun 2022, 07:38

Well, a few thoughts on your results:
1) When using quadratic terms (as you did), you do not need to estimate a first stage for the quadratic term as well. I believe that is common practice for the two-step substitution approach, but not for the residual inclusion approach.
2) You could add more flexibility, for example, by adding functional forms of the IMR.
3) It seems to me that your results are rather consistent at the mean. The magnitude is almost double, yes, but that could be just due to specification.
4) Very few models perform well around the limits of the variables of interest, so I wouldn't put much weight on evaluating the effect at x=max and x=min.
HTH
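A rough sketch of how points 1) and 2) might be combined, on simulated toy data: one first stage per endogenous variable (none for the squared term), with squares of the first-stage residuals added to the second stage. Reading "functional forms" as polynomials of the residuals is an assumption made for this example, as are the toy data-generating process and the single control c1.

```stata
* Toy data only, so the sketch runs on its own
clear
set seed 4321
set obs 2000
generate double z1var = rnormal()
generate double z2var = rnormal()
generate double c1    = rnormal()
generate double u     = rnormal()        // unobservable causing endogeneity
generate double x1var = 1 + z1var + 0.5*c1 + u + rnormal()
generate double x2var = z2var + 0.5*c1 + 0.5*u + rnormal()
generate byte yvar = runiform() < ///
    invlogit(0.5*x1var - 0.1*x1var^2 + 0.3*x2var + 0.2*c1 + u)
global controls c1

* Point 1: a first stage for x1var and x2var only, none for x1var^2
regress x1var z1var z2var ${controls}
predict double resid_1, residuals
regress x2var z1var z2var ${controls}
predict double resid_2, residuals

* Point 2 (one possible reading): let the residuals enter flexibly
logit yvar c.x1var##c.x1var x2var ${controls} ///
    c.resid_1##c.resid_1 c.resid_2##c.resid_2

* Joint Wald test of the added squared residual terms
test c.resid_1#c.resid_1 c.resid_2#c.resid_2
```

The Wald test here is only indicative, because the second-stage standard errors ignore that the residuals are estimated. For inference, both stages would still need to be bootstrapped together, as in the original post.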
Lukas Lang

Join Date: Dec 2016

Posts: 42
#7

13 Jun 2022, 08:13

Thank you, your suggestions are helpful!

