Bivariate panel fixed effects

Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#1


08 Feb 2023, 06:32

Dear Statalist,

I am using a panel dataset with 993 observations and 199 ids. The regression I want to run contains a binary dependent variable and a binary endogenous variable. Looking for an adequate model, I searched Statalist and found several useful pieces of advice:

1. As proposed by Wooldridge: "... with a binary y1 and binary y2, you should use two methods. (1) A standard linear model estimated by 2SLS. This is what Angrist and Pischke propose in "Mostly Harmless Econometrics." (2) Use the so-called "biprobit" model, where y1 and y2 are modeled as probits. This is a joint maximum likelihood procedure. You should compute the average marginal effect from the biprobit and compare it with the 2SLS estimate." (https://www.statalist.org/forums/for...ndent-variable)

2. Searching for a way to use a bivariate model in a fixed effects panel, I found advice from Alfonso Sánchez-Peñalver. "For a fixed-effects estimation I'm not sure that with binary dependent variables demeaning the variables by group ... would work, I actually think it won't, so the only suggestion that pops into mind is to create the dummy (binary) variables for ... [id] and run the model with the dummy variables. It may not converge easily, depending on the number of insurees you have in your data, i.e. the number of dummies you would have to include. You can use the regular bivariate probit estimation you did including the dummy variables for this, or cmp as well. They would both work." (https://www.statalist.org/forums/for...variate-probit)
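
For reference, the two approaches quoted above could be set up roughly as follows. This is only a sketch: it assumes the variable names used in this thread (Y1, Y2, Z, P, order, $x, hhid), that the user-written xtivreg2 is installed from SSC, and that the household dummies are entered with factor-variable notation (i.hhid) rather than hand-made hhid_* dummies.

Code:

* Sketch under the assumptions stated above, not a tested specification
xtset hhid

* (1) Linear probability model with household fixed effects; Y1 is instrumented by Z
xtivreg2 Y2 P order $x (Y1 = Z), fe cluster(hhid)

* (2) Bivariate probit with household dummies; i.Y1 marks Y1 as binary for margins
biprobit (Y1 = Z P order $x i.hhid) (Y2 = i.Y1 P order $x i.hhid), vce(cluster hhid)

* Average marginal effect of Y1 on Pr(Y2 = 1), to set against the 2SLS coefficient
margins, dydx(Y1) predict(pmarg2)

The margins result is only meaningful if the biprobit fit has actually converged.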

I followed these steps and my bivariate model

Code:

biprobit (Y1 Z P order $x hhid_*) (Y2 Y1 P order $x hhid_*), vce(cluster hhid)

did not converge.

This is different for the cmp command:

Code:

cmp (Y2= Y1 P order $x hhid_*) (Y1 = Z P order $x hhid_*), ind($cmp_probit $cmp_probit ) cluster(hhid) nonrtolerance

It converges but does not compute all SEs and shows "Warning: regressor matrix for I equation appears ill-conditioned. (Condition number = 1674.5519.)"

Yet, when writing

Code:

cmp (Y1 = Z P order $x hhid_*) (Y2= Y1 P order $x hhid_*), ind($cmp_probit $cmp_probit ) cluster(hhid)

it converges, but I guess this is the wrong command.

Do you have any ideas how to successfully estimate the bivariate model with a fixed effects panel?
Thanks a lot!
George Ford

Join Date: Aug 2014

Posts: 3156
#2

08 Feb 2023, 08:51

This might be helpful:

HTML Code:

https://www.stata.com/meeting/mexico13/abstracts/materials/mex13_baum.pdf

Kerstin Schmidt

Join Date: Apr 2017
Posts: 120
#3

13 Feb 2023, 07:37

Thank you! This was extremely helpful.

I tried to employ the special regressor estimators as proposed by Lewbel (2000) and Dong & Lewbel (2012) as follows:

Outcome variable = Y2
Special regressor = Age (exogenous and continuously distributed)

Code:

xtspecialreg Y2 AgeXround, exog(P prevWeather  $x) endog(Y1) iv(Z) first xtivreg2 hetero

Attempting to control for heterogeneity resulted in:

Code:

. xtspecialreg Y2 AgeXround, exog(P prevWeather  $x) endog(Y1) iv(Z) first xtivreg2 hetero
hetero

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    __000002 |        199   -.0033198    1.008572  -2.580357   3.158293
hetero

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    __000002 |        398   -.0017151    1.004374  -2.580357   3.529195
hetero

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    __000002 |        597    .0003903    1.011655  -2.580357   3.529195
hetero

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    __000002 |        795    .0013987    1.009967  -2.662135   3.529195
hetero

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    __000002 |        993    .0004616    1.008598  -2.888774   3.529195

Kurtosis of special regressor AgeXround =    2.6599
Warning: kurtosis below that of N(0,1) = 3.0 may weaken validity of results

24 observations trimmed: max abs value of transformed variable =  5.45 sigma

Warning - singleton groups detected.  1 observation(s) not used.

Therefore, I estimated it without the hetero-option:

Code:

.
. xtspecialreg Y2 AgeXround, exog(P prevWeather  $x) endog(Y1) iv(Z) xtivreg2

Kurtosis of special regressor AgeXround =    2.6682
Warning: kurtosis below that of N(0,1) = 3.0 may weaken validity of results

24 observations trimmed: max abs value of transformed variable =  5.37 sigma


Panel instrumental variables regression            Number of obs =         968
                                                   Number of groups =      198
                                                   Wald chi2(7)  =           .
                                                   Prob > chi2   =           .
------------------------------------------------------------------------------
          Y2 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          Y1 |   73.13822   31.30736     2.34   0.019     11.77692    134.4995
           P |  -2.376798   12.59437    -0.19   0.850     -27.0613     22.3077
 prevWeather |    10.4123   7.879352     1.32   0.186     -5.03095    25.85554
          x1 |  -.0133576   .0231342    -0.58   0.564    -.0586998    .0319847
          x2 |  -.0528344   .0245871    -2.15   0.032    -.1010241   -.0046446
          x3 |   31.57397   41.27049     0.77   0.444    -49.31471    112.4626
          x4 |  -.0033615    .019324    -0.17   0.862    -.0412359    .0345128
------------------------------------------------------------------------------
Instrumented : Y1
Instruments:   Z P prevWeather x1 x2 x3 x4

Average marginal effects from average index function

                     Y2
  AgeXround   .00070382
         Y1   .05147585
          P  -.00167283
prevWeather   .00732834
         x1  -9.401e-06
         x2  -.00003719
         x3   .02222227
         x4  -2.366e-06
      _cons   -.1714286

Assuming I cannot find a special regressor with higher kurtosis, would the special regressor estimator still be superior to the LPM? How would I report p-values for the average marginal effects from the average index function? Should I take the ones from the coefficients?


George Ford

Join Date: Aug 2014

Posts: 3156
#4

13 Feb 2023, 09:09

As noted by Baum, high kurtosis of the special regressor is not strictly necessary, but desirable.

Wooldridge says ivreg2 and biprobit. I'd go that route and compare. I'd also do specialreg (if you can get it to work) and compare all the results.

Might also play around with the -imperfectiv- command.
Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#5

13 Feb 2023, 09:25

Thanks George!

Unfortunately, the biprobit command does not converge. Therefore, I thought of comparing LPM (ivreg2) and the special regressor estimator (specialreg).

Does it make sense?

Two additional questions:
1. How do I implement a special regressor estimator in a fixed effects panel? Should I use the sspecialreg command and include individual dummies?
2. When reporting the average marginal effects (average index function) of sspecialreg, how do I get the p-values? Should I take the ones from the coefficients?
George Ford

Join Date: Aug 2014

Posts: 3156
#6

13 Feb 2023, 09:30

Code:

help xtspecialreg
George Ford

Join Date: Aug 2014

Posts: 3156
#7

13 Feb 2023, 09:36

The marginal effects are there to compare effect sizes. Use the t-stat from the coefficient for the hypothesis test.

Some advice on the non-convergence.

HTML Code:

https://www.nber.org/stata/efficient/non-linear.html
Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#8

13 Feb 2023, 10:08

As I am trying to estimate a fixed effects bivariate model, I include individual dummies in the biprobit command:

Code:

biprobit (Y1 Z P prevWeather order $x hhid_*) (Y2 Y1 P prevWeather order $x hhid_*), vce(cluster hhid)

This model finally converges, but only with the nonrtolerance option.

Regarding xtspecialreg: In the help file I could not find any information on the distinction between random and fixed effects. It pretty much just says, "xtspecialreg performs the same function in the context of panel data which have been tsset or xtset." Therefore, I would assume that it computes the random effects estimator. What do you think?
Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#9

13 Feb 2023, 10:10

The bivariate model also converges without additional options if I drop the individual dummies. As the overidentification test favors FE over RE, I should stick with FE, I guess...

George Ford

Join Date: Aug 2014
Posts: 3156

#10

13 Feb 2023, 10:30

xtspecialreg is FE.

HTML Code:

http://fmwww.bc.edu/EC-P/wp604.pdf

Also might look at

Code:

search cmp

HTML Code:

https://www.statalist.org/forums/forum/general-stata-discussion/general/1380907-bivariate-probit-using-panel-data-and-cmp


Kerstin Schmidt

Join Date: Apr 2017
Posts: 120

#11

13 Feb 2023, 11:12

When applying the cmp-command, I have the same issue:

Code:

cmp (Y2= Y1 P prevWeather order $x i.hhid) (Y1 = Z P prevWeather order $x i.hhid), ///
ind($cmp_probit $cmp_probit) cluster(hhid) nonrtolerance

results in

Code:

Warning: regressor matrix for Y1 equation appears ill-conditioned. (Condition number = 1634.0661.)
This might prevent convergence. If it does, and if you have not done so already, you may need to
remove nearly collinear regressors to achieve convergence. Or you may need to add a
nrtolerance(#) or nonrtolerance option to the command line.
See cmp tips.

Fitting full model.

Iteration 0:   log pseudolikelihood = -457.93937  
Iteration 1:   log pseudolikelihood = -455.07411  
Iteration 2:   log pseudolikelihood = -451.17491  
Iteration 3:   log pseudolikelihood = -449.95849  
Iteration 4:   log pseudolikelihood =  -448.1238  
Iteration 5:   log pseudolikelihood = -447.32157  (not concave)
Iteration 6:   log pseudolikelihood = -447.17221  (not concave)
Iteration 7:   log pseudolikelihood = -447.13033  (not concave)
Iteration 8:   log pseudolikelihood =  -447.0934  (not concave)
Iteration 9:   log pseudolikelihood = -447.06427  (not concave)
Iteration 10:  log pseudolikelihood = -447.03983  (not concave)
Iteration 11:  log pseudolikelihood = -447.02016  (not concave)
Iteration 12:  log pseudolikelihood = -447.00311  (not concave)
Iteration 13:  log pseudolikelihood = -446.98742  (not concave)
Iteration 14:  log pseudolikelihood = -446.97622  (not concave)
Iteration 15:  log pseudolikelihood = -446.96805  (not concave)
Iteration 16:  log pseudolikelihood = -446.96028  (not concave)
Iteration 17:  log pseudolikelihood = -446.95202  (not concave)
Iteration 18:  log pseudolikelihood = -446.94589  (not concave)
Iteration 19:  log pseudolikelihood = -446.94034  (not concave)
Iteration 20:  log pseudolikelihood = -446.93445  (not concave)
Iteration 21:  log pseudolikelihood = -446.93027  (not concave)
Iteration 22:  log pseudolikelihood = -446.92616  (not concave)
Iteration 23:  log pseudolikelihood = -446.92049  
Iteration 24:  log pseudolikelihood =  -446.9064  
Iteration 25:  log pseudolikelihood = -446.89009  
Iteration 26:  log pseudolikelihood = -446.88987  (backed up)
Iteration 27:  log pseudolikelihood = -446.88263  
Iteration 28:  log pseudolikelihood = -446.88157  (not concave)
Iteration 29:  log pseudolikelihood = -446.88048  (not concave)
Iteration 30:  log pseudolikelihood =  -446.8803  (not concave)
Iteration 31:  log pseudolikelihood = -446.88005  (not concave)
Iteration 32:  log pseudolikelihood = -446.87979  (not concave)
Iteration 33:  log pseudolikelihood =  -446.8796  (not concave)
Iteration 34:  log pseudolikelihood = -446.87957  (not concave)

Mixed-process regression                             Number of obs  =      653
                                                     Wald chi2(125) = 3.68e+14
Log pseudolikelihood = -446.87957                    Prob > chi2    =   0.0000

                                 (Std. err. adjusted for 131 clusters in hhid)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
Y2           |
          Y1 |   1.604905   .1462291    10.98   0.000     1.318301    1.891508
           P |   .3873902   .2995286     1.29   0.196    -.1996751    .9744555
 prevWeather |  -.2393983   .1846057    -1.30   0.195    -.6012189    .1224222
       order |  -.2467893   .0478893    -5.15   0.000    -.3406507    -.152928
          x1 |  -.0005024   .0004365    -1.15   0.250    -.0013579    .0003531
          x2 |   .0009599    .000549     1.75   0.080    -.0001161     .002036
          x3 |  -.5879909   1.095742    -0.54   0.592    -2.735605    1.559623
          x4 |  -.0012296   .0004345    -2.83   0.005    -.0020813   -.0003779
             |
       _cons |   4.161078   7.354275     0.57   0.572    -10.25304    18.57519
-------------+----------------------------------------------------------------
Y1           |
           Z |   4.274766          .        .       .            .           .
           P |   .9220619          .        .       .            .           .
 prevWeather |  -.5345329          .        .       .            .           .
       order |    .121923          .        .       .            .           .
          x1 |  -.0015696          .        .       .            .           .
          x2 |     .00456          .        .       .            .           .
          x3 |   1.890515          .        .       .            .           .
          x4 |  -.0011536          .        .       .            .           .
             |
       _cons |  -15.54101          .        .       .            .           .
-------------+----------------------------------------------------------------
/atanhrho_12 |  -11.11293   2.852844    -3.90   0.000     -16.7044   -5.521462
-------------+----------------------------------------------------------------
      rho_12 |         -1   2.54e-09                            -1    -.999968
------------------------------------------------------------------------------

Note: I am only interested in the Y2 estimates.

The cmp and bivariate probit models with the nonrtolerance option yield similar results. Is it acceptable to use the nonrtolerance option for both specifications and compare them with the estimates from xtspecialreg?


George Ford

Join Date: Aug 2014

Posts: 3156
#12

13 Feb 2023, 13:37

It's encouraging that they are similar, and yes, I'd try xtspecialreg too. Look at your Xs and see if there's really high correlation among certain variables. If so, drop one and try again to see if the warning still appears. But it did converge, so I wouldn't stress about it.
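
A quick way to look for the near-collinearity behind the ill-conditioning warning might be the following. It is only a sketch that reuses the variable names from the cmp call in #11; which variable to drop remains a judgment call.

Code:

* Pairwise correlations among the instrument and the regressors
correlate Z P prevWeather order $x

* Flag regressors that are exactly collinear (they are reported as omitted)
_rmcoll Z P prevWeather order $x

* Between- and within-household variation; little within variation strains the dummies
xtset hhid
xtsum Y1 Y2 Z P prevWeather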
Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#13

13 Feb 2023, 23:31

Thank you!! My hopefully last question:
I don’t fully understand what using ML with the nonrtolerance option means. It is a noniterative ML. Yet, what does this option sacrifice?
George Ford

Join Date: Aug 2014

Posts: 3156
#14

14 Feb 2023, 07:22

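nonrtolerance turns off the scaled-gradient convergence check (the nrtolerance criterion), so Stata declares convergence from the coefficient and log-likelihood tolerances alone. It doesn't mean the gradient actually converged.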

You can change the tolerance using nrtolerance(). The default is 0.00001. You could set nrtolerance(0.001) and nrtolerance(0.0001) to see if it converges and then compare the coefficient estimates.
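
Such a comparison could look roughly like this, reusing the biprobit specification from #8. It is a sketch; the stored-estimate names tol3 and tol4 are made up for illustration.

Code:

* Looser scaled-gradient tolerance
biprobit (Y1 Z P prevWeather order $x hhid_*) (Y2 Y1 P prevWeather order $x hhid_*), ///
    vce(cluster hhid) nrtolerance(0.001)
display "converged: " e(converged)
estimates store tol3

* Tighter scaled-gradient tolerance
biprobit (Y1 Z P prevWeather order $x hhid_*) (Y2 Y1 P prevWeather order $x hhid_*), ///
    vce(cluster hhid) nrtolerance(0.0001)
display "converged: " e(converged)
estimates store tol4

* Coefficients and standard errors side by side
estimates table tol3 tol4, b(%9.4f) se(%9.4f)

With all household dummies in the model the table will be long; the rows of interest are those of the Y2 equation.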

A lot of the time, non-convergence comes from weird places. You might try to scale the Xs so that the coefficients from the nonrtolerance fit are near 1. I'd also look at dropping variables based on high correlation to see if it will converge. I think you'll have to experiment to see why it isn't converging and if it's just a data thing.
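
As an illustration of the rescaling idea: in the cmp output in #11 the coefficients on x1, x2 and x4 are of order 0.001, so dividing those regressors by 1,000 would bring their coefficients closer to 1. The factor and the _s suffix below are illustrative assumptions, and the sketch presumes that $x expands to x1-x4 as the output suggests.

Code:

* Dividing a regressor by 1000 multiplies its coefficient by 1000
foreach v of varlist x1 x2 x4 {
    generate double `v'_s = `v' / 1000
}
* Then refit with x1_s x2_s x3 x4_s in place of $x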

Also, you could run the probit model on the main equation (or use the coefs from the nonrtolerance model) and set initial coefficient values using init(vector) in cmp or from() in biprobit.
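
One way to supply starting values to biprobit is the standard maximize option from(). A sketch, assuming the specification from #8; b0 is a made-up matrix name, and difficult is an optional extra that sometimes helps in non-concave regions.

Code:

* Fit once with the scaled-gradient check switched off and keep the coefficient vector
biprobit (Y1 Z P prevWeather order $x hhid_*) (Y2 Y1 P prevWeather order $x hhid_*), ///
    vce(cluster hhid) nonrtolerance
matrix b0 = e(b)

* Refit from those values with the default convergence criteria
biprobit (Y1 Z P prevWeather order $x hhid_*) (Y2 Y1 P prevWeather order $x hhid_*), ///
    vce(cluster hhid) from(b0) difficult
display "converged: " e(converged)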
Kerstin Schmidt

Join Date: Apr 2017

Posts: 120
#15

14 Feb 2023, 07:47

Thank you!

In my case, setting nrtolerance(0.001) or nrtolerance(0.0001) yields exactly identical estimates. Does this indicate a major problem that I need to worry about?
