  • Multinomial Probit model with continuous endogenous regressors

    Hi there

    May I ask whether there are Stata commands for fitting a multinomial probit model with continuous endogenous regressors?
    I would like to address an endogeneity problem in a multinomial probit model. Thank you.


  • #2
    Check whether the user-written -cmp- by David Roodman does what you want.



    • #3
      Thank you.

      Should it look like this?

      cmp (y=x1 x2) (x2=x1 z), ind($cmp_mprobit $cmp_cont).



      • #4
        Looks about right, but read the help file, and most importantly try it to see what Stata would say.
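A minimal sketch of what "try it" could look like, assuming -cmp- and its dependency -ghk2- are installed from SSC. The variable names y, x1, x2, and z are the placeholders used in this thread, not real data. -cmp setup- defines the $cmp_* indicator macros, and the indicators are listed one per equation, in equation order.

```stata
* One-time installation from SSC (cmp relies on ghk2)
ssc install cmp, replace
ssc install ghk2, replace

* Define the $cmp_* indicator macros for the current session
cmp setup

* First equation: multinomial probit for y; second: linear equation for x2
cmp (y = x1 x2) (x2 = x1 z), indicators($cmp_mprobit $cmp_cont)
```

If the macros are not defined, the indicators() list expands to nothing, so running -cmp setup- first is worth checking before anything else.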



        • #5
          Hi Joro

          May I ask further?

          My original mprobit model sets an initial specification, let's say "from(x1=0)".

          I don't know how to incorporate this into cmp. I tried the following, but it doesn't work. Stata said "= invalid name / invalid syntax".

          cmp (y=x1 x2 x3) (x2=x1 x3 z), ind($cmp_mprobit $cmp_cont) from(x1=0)

          Looking forward to hearing from you. Thank you very much.



          • #6
            Why would you like to do this?

            Reading the help of -mprobit-, I see that

            "maximize options: difficult, technique(algorithm_spec), iterate(#), [no]log, trace,
            gradient, showstep, hessian, showtolerance, tolerance(#), ltolerance(#),
            nrtolerance(#), nonrtolerance, and from(init_specs); see [R] maximize. These options are
            seldom used."

            If -cmp- converges, do not fiddle with the optimisation.

            A command does not generally accept an option just because another command does. When you pass this optimisation option from -mprobit- to -cmp-, -cmp- does not know what you are talking about.



            • #7
              I need to set up from(init specs) in the original mprobit model, otherwise it does not converge.

              I tried to remove it in -cmp-, but the model does not converge.

              Are there alternative functions / options in -cmp-?

              Thank you.




              • #8
                This is bad... Try the -difficult- option.

                From the help file, this is what -cmp- accepts:

                " ml_opts: cmp accepts the following standard ml options, which affect the full-model and initial, single-equation fits: trace gradient hessian
                showstep technique(algorithm_specs) vce(oim|opg|robust|cluster) iterate(#) tolerance(#) ltolerance(#) gtolerance(#) nrtolerance(#) nonrtolerance
                shownrtolerance difficult constraints(numlist|matname)
                "



                • #9
                  Very bad.

                  I added the "difficult" option, but the model still does not converge.

                  So I also included "init(vector)", but the processing time is incredibly long. Stata has been running for hours, and no results or errors have come out.

                  Is it possible to speed up the process? Many thanks.



                  • #10
                    If -cmp- accepted init(vector) then you are doing what you want to do, you are passing on the initial values.

                    I would suggest that you make sure you understand how init(vector) works on a simple example which converges, just to make sure you are not passing on the initial values in the wrong way.

                    Another suggestion is to delete one or two observations at a time and see whether the model then converges.

                    And I do not think that you can speed up the process if it is calculating. There is always the possibility that it is calculating some nonsense because you have passed init(vector) incorrectly, or just because the optimiser got stuck in some flat/nonconcave region.

                    But the only thing you can do for calculations that take too long (and you have run out of options) is to let this run in the evening and go to bed, and see what has come out in the morning.
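One way to check how init(vector) behaves on a simple example that converges, as suggested above. This is only a sketch: it uses the auto dataset shipped with Stata and a two-equation linear system chosen purely for illustration, and it assumes init() takes a row vector with one element per estimated parameter, in the order -cmp- reports them. The matrix name b0 is hypothetical.

```stata
cmp setup
sysuse auto, clear

* Fit a small model that converges without help
cmp (price = mpg weight) (mpg = weight length), indicators($cmp_cont $cmp_cont)

* Save the estimates and inspect their order and names
matrix b0 = e(b)
matrix list b0

* Refit, passing the saved estimates as starting values
cmp (price = mpg weight) (mpg = weight length), indicators($cmp_cont $cmp_cont) init(b0)
```

If the starting values are passed correctly, the second fit should start at or very near the optimum and finish in very few iterations. A long run here would suggest the vector is in the wrong order or has the wrong length.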



                    • #11
                      Originally posted by Lok Man Tong View Post
                      Thank you.

                      Should it look like this?

                      cmp (y=x1 x2) (x2=x1 z), ind($cmp_mprobit $cmp_cont).
                      You may want to replace (y = x1 x2) with (y = x1 x2, iia). As to why, please refer to the Keane [1992] reference in the -cmp- help file.



                      • #12
                        This looks like a job for the control function approach. It requires a simple first step, and then, I guess, the cmmprobit command. But I've never used cmmprobit. So here is a guess:

                        Code:
                        reg x2 x1 z
                        predict v2h, resid
                        cmmprobit y x1 x2 v2h
                        This entails a generated regressor, v2h, and so you should adjust the standard errors. However, the t test on v2h is a valid test of the null that x2 is exogenous. If cmmprobit does not take too long to run, I would just bootstrap the two-step procedure. The analytical standard errors are more difficult to work out.

                        To obtain marginal effects of x1 and x2, average out v2h across the sample. Again, a bootstrap can provide proper standard errors.

                        JW
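A hedged sketch of the exogeneity test described above, using -mprobit- as a stand-in for the second step because the cmmprobit syntax in the post is itself a guess. With more than two outcomes v2h enters every non-base outcome equation, so the single t test becomes a joint Wald test. The assumption here is that -test v2h- after a multiple-equation model tests the coefficient in all equations, as it does after -mlogit-. It is worth confirming that in the test output, which lists each restriction.

```stata
* Step 1: first-stage OLS and its residual (the control function)
regress x2 x1 z
predict double v2h, residuals

* Step 2: multinomial probit with the residual as an extra regressor
mprobit y x1 x2 v2h

* Joint Wald test that v2h has a zero coefficient in every outcome equation;
* under the null that x2 is exogenous, no generated-regressor correction is needed
test v2h
```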



                        • #13
                          Originally posted by Joro Kolev View Post
                          If -cmp- accepted init(vector) then you are doing what you want to do, you are passing on the initial values.

                          I would suggest that you make sure you understand how init(vector) works on a simple example which converges, just to make sure you are not passing on the initial values in the wrong way.

                          Another suggestion is to alternatingly delete say one/two observations, and see whether the model converges.

                          And I do not think that you can speed up the process if it is calculating. There is always the possibility that it is calculating some nonsense because you have passed init(vector) incorrectly, or just because the optimiser got stuck in some flat/nonconcave region.

                          But the only thing you can do for calculations that take too long (and you have run out of options) is to let this run in the evening and go to bed, and see what has come out in the morning.


                          Thank you, Joro.
                          I tried my best to deal with the initial values. Unfortunately, another error came out, even though my computer has 32 GB of RAM, which does not sound too small.
                          Fitting constant-only model for LR test of overall model fit.
                          #: 3900 out of memory
                          halton2(): - function returned error
                          ghk2setup(): - function returned error
                          cmp_model::cmp_init(): - function returned error
                          <istmt>: - function returned error
                          Mata run-time error
                          r(3900);



                          • #14
                            Originally posted by Hong Il Yoo View Post

                            You may want to replace (y = x1 x2) with (y = x1 x2, iia). As to why, please refer to the Keane [1992] reference in the -cmp- help file.
                            Thank you, Hong Il.
                            I tried it, but it didn't work.



                            • #15
                              Originally posted by Jeff Wooldridge View Post
                              This looks like a job for the control function approach. It requires a simple first step, and then, I guess, the cmmprobit command. But I've never used cmmprobit. So here is a guess:

                              Code:
                              reg x2 x1 z
                              predict v2h, resid
                              cmmprobit y x1 x2 v2h
                              This entails a generated regressor, v2h, and so you should adjust the standard errors. However, the t test on v2h is a valid test of the null that x2 is exogenous. If cmmprobit does not take too long to run, I would just bootstrap the two-step procedure. The analytical standard errors are more difficult to work out.

                              To obtain marginal effects of x1 and x2, average out v2h across the sample. Again, a bootstrap can provide proper standard errors.

                              JW
                              Thank you, Jeff.
                              Yes, I have to work on the control function approach. May I check with you whether the following code is on the right track? cmmprobit is a new feature of Stata 16, and I haven't upgraded yet, so I first replaced cmmprobit with mprobit to see how it goes. That worked. I would still like to combine the two bootstrap programs to save estimation time, but I failed to do so. I hope to get some suggestions. Many thanks.

                              *Bootstrap standard errors of mprobit
                              capture program drop mprobendo
                              program mprobendo, eclass

                              //step 1 OLS//
                              reg x2 x1 z
                              capture drop v2h
                              predict v2h, residuals

                              //step 2 CMMPROBIT//
                              cmmprobit y x1 x2 v2h
                              tempvar bsbse
                              tempname bsb
                              matrix `bsb'=e(b)
                              quietly gen byte `bsbse' = e(sample)
                              ereturn post `bsb', esample(`bsbse')
                              end

                              set seed 1
                              bootstrap _b, reps(50): mprobendo

                              *Bootstrap marginal effects (rclass)
                              capture program drop mprobme
                              program mprobme, rclass

                              //step 1 OLS//
                              reg x2 x1 z
                              capture drop v2h
                              predict v2h, residuals

                              //step 2 CMMPROBIT//
                              cmmprobit y x1 x2 v2h
                              margins, dydx(*) at((mean)v2h) predict(outcome(1))
                              matrix list r(b)
                              tempname M1
                              matrix `M1'=r(b)
                              local M1_cols=colsof(`M1')
                              forvalues j=1/`M1_cols'{
                              return scalar margin1_`j'=`M1'[1,`j']
                              }
                              margins, dydx(*) at((mean)v2h) predict(outcome(2))
                              matrix list r(b)
                              tempname M2
                              matrix `M2'=r(b)
                              local M2_cols=colsof(`M2')
                              forvalues j=1/`M2_cols'{
                              return scalar margin2_`j'=`M2'[1,`j']
                              }
                              margins, dydx(*) at((mean)v2h) predict(outcome(3))
                              matrix list r(b)
                              tempname M3
                              matrix `M3'=r(b)
                              local M3_cols=colsof(`M3')
                              forvalues j=1/`M3_cols'{
                              return scalar margin3_`j'=`M3'[1,`j']
                              }
                              end

                              set seed 1
                              bootstrap m11=r(margin1_1) m12=r(margin1_2) m21=r(margin2_1) m22=r(margin2_2) m31=r(margin3_1) m32=r(margin3_2), reps(50): mprobme
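One possible way to combine the two programs, offered as a sketch rather than a tested solution. It uses -mprobit- as in the description above, assumes y is coded 1, 2, 3, and posts the coefficients and the marginal effects together as one e(b) vector, so that a single bootstrap run covers both. The program name mprobcf and the equation labels ape1 to ape3 are hypothetical. It also differs from the code above in one respect: it uses the default averaging of -margins- over the sample (averaging out v2h) instead of at((mean) v2h), and it adds nose so that -margins- skips its own standard errors inside each replication.

```stata
capture program drop mprobcf
program define mprobcf, eclass
    // Step 1: first-stage OLS; v2h is the control function
    regress x2 x1 z
    capture drop v2h
    predict double v2h, residuals

    // Step 2: multinomial probit with the first-stage residual added
    mprobit y x1 x2 v2h
    tempname ball M
    tempvar touse
    matrix `ball' = e(b)
    generate byte `touse' = e(sample)

    // Average partial effects of x1 and x2 for outcomes 1 to 3
    forvalues k = 1/3 {
        margins, dydx(x1 x2) predict(outcome(`k')) nose
        matrix `M' = r(b)
        matrix coleq `M' = ape`k'
        matrix `ball' = `ball', `M'
    }

    // Post coefficients and marginal effects as a single vector
    ereturn post `ball', esample(`touse')
end

set seed 1
bootstrap _b, reps(50): mprobcf
```

Because both steps are rerun in every replication, the bootstrap standard errors account for the generated regressor, and the model is estimated once per replication instead of twice.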
