  • ivprobit and cmp ivprobit

    Hi. I am using an individual-based survey to estimate the impact of migration and remittances on child education in Egypt. I have two main equations. The first studies the effect of migration (dummy independent variable "migrant" = 1 if the individual reported a migrant in his/her household, = 0 otherwise) on child education (dummy dependent variable "school" = 1 if the individual attends school, = 0 otherwise). The second studies the effect of remittances (dummy independent variable "remit" = 1 if the individual reported receiving remittances in his/her household, = 0 otherwise) on child education (the same "school" dummy). I am planning to use a cmp ivprobit regression, but since this is my first time using this type of regression, I have a few questions and would really appreciate your help:

    1. Examining the literature, I have noticed that some economists used only ivprobit, while others used cmp ivprobit. From what I understand, cmp ivprobit would be more suitable in my case since it allows the errors in different equations to be correlated. Is this correct? Should I use cmp ivprobit instead of ivprobit?
    2. I have already tried to run ivprobit and cmp ivprobit, but I got errors in Stata. I have checked the help files but I am still not sure what exactly the problem is. Can someone tell me what is wrong with the following commands?

    This is the ivprobit command for the school-migrant equation:
    Code:
    #delimit ;
    ivprobit school [migrant age age2 eldest i.fteducst i.mteducst fth_absent urban1] (migrant = oilpricewhenmigrantis31) 
    [if age >= 6 & age <= 17 & marital != 4 & marital != 5 & yrbirth1 ==.] [pweight=expan_indiv], vce (cluster hhid)first;
    #delimit cr 
    margins, dydx(*) predict(pr)
    This is the error I get:
    Code:
     migrant unknown weight type
    This is the cmp ivprobit command:
    Code:
    #delimit ; 
    cmp (migrant = oilpricewhenmigrantis31 age age2 eldest i.fteducst i.mteducst fth_absent urban1) (school = migrant age age2 eldest i.fteducst i.mteducst fth_absent urban1) 
    [if age >= 6 & age <= 17 & marital != 4 & marital != 5 & yrbirth1 ==.] [pweight=expan_indiv], vce (cluster hhid) indicators ($cmp cont $cmp probit);
    #delimit cr
    margins, dydx(*) predict(pr) force
    This is the error I get:
    Code:
    weights not allowed
    invalid syntax
    3. Kindly note that my IV is "oilpricewhenmigrantis31", the oil price when the migrant is 31 years old. Should I treat it as a left-censored variable instead of a continuous variable in the cmp command?
    4. There are some control variables that I would like to add to the IV equation. How can this be done? Should I just add them after my instrumental variable in the IV equation?
    5. If I end up choosing cmp ivprobit and would like to run another probit regression that assumes my independent variables are exogenous, does it also have to be a cmp probit?
    6. In my final analysis, should I focus only on the estimated marginal effects?
    I apologize for my many questions. I decided to gather all my problems in one post and, as a beginner, I would really appreciate your help!

  • #2
    Hello Salma,

    I suspect part of the problem is that you are wrapping the conditional -if- statement in square brackets. In Stata syntax, square brackets are reserved for weights. If you want to specify multiple endogenous equations with different covariates, I would recommend -eprobit-. Here is an example where my binary outcome is the variable -foreign- and I conjecture that -price- and -trunk- are endogenous. There is one equation for -price- and one for -trunk-, each with a different set of instruments. I have also included an -if- statement and some made-up weights:

    Code:
    . sysuse auto, clear 
    (1978 Automobile Data)
    
    . generate peso = 1
    
    . eprobit foreign mpg if (rep78!=. & displacement>0) [pw=peso],           ///
    >     endogenous(price = mpg weight) endogenous(trunk = mpg length)
    
    Iteration 0:   log pseudolikelihood = -822.04374  
    Iteration 1:   log pseudolikelihood = -822.03581  
    Iteration 2:   log pseudolikelihood =  -822.0358  
    
    Extended probit regression                      Number of obs     =         69
                                                    Wald chi2(3)      =      55.51
    Log pseudolikelihood =  -822.0358               Prob > chi2       =     0.0000
    
    -----------------------------------------------------------------------------------------
                            |               Robust
                            |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------+----------------------------------------------------------------
    foreign                 |
                        mpg |  -.0182967   .0340447    -0.54   0.591    -.0850232    .0484297
                      price |  -.0003316    .000059    -5.62   0.000    -.0004474   -.0002159
                      trunk |   .0613962   .0405431     1.51   0.130    -.0180668    .1408592
                      _cons |   1.349108   1.403826     0.96   0.337     -1.40234    4.100557
    ------------------------+----------------------------------------------------------------
    price                   |
                        mpg |  -19.81308   88.52969    -0.22   0.823    -193.3281    153.7019
                     weight |   1.897249   .7492304     2.53   0.011     .4287844    3.365714
                      _cons |    815.347   3978.292     0.20   0.838    -6981.962    8612.656
    ------------------------+----------------------------------------------------------------
    trunk                   |
                        mpg |   .0214285   .1196396     0.18   0.858    -.2130607    .2559178
                     length |   .1446218   .0304099     4.76   0.000     .0850195    .2042242
                      _cons |   -13.7595   8.091488    -1.70   0.089    -29.61853    2.099526
    ------------------------+----------------------------------------------------------------
                var(e.price)|    5845390    1159545                       3962428     8623144
                var(e.trunk)|    8.60871   1.624719                      5.946906    12.46192
    ------------------------+----------------------------------------------------------------
     corr(e.price,e.foreign)|   .9716002   .0246145    39.47   0.000     .8506534    .9948709
     corr(e.trunk,e.foreign)|   -.168159   .1642041    -1.02   0.306    -.4628805    .1600404
       corr(e.trunk,e.price)|   .0253499   .0971993     0.26   0.794    -.1637856    .2126879
    -----------------------------------------------------------------------------------------
    You also ask about using -probit- in case of exogeneity. You could certainly do this. Notice, however, that the error correlations at the bottom of the -eprobit- results serve as a test for endogeneity. In this case there is evidence that -price- is endogenous (corr(e.price,e.foreign) = .97, p < 0.001) and that -trunk- is not (corr(e.trunk,e.foreign) = -.17, p = 0.306). All this to say, -eprobit- gives you guidance about specifications with and without endogeneity.
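
    For reference, a minimal sketch of the exogenous counterpart: a plain -probit- on the same sample, with the same made-up weights, treats -price- and -trunk- as ordinary regressors. It is only an illustration on the auto data and not a recommendation for this particular model, since the -eprobit- output suggests that -price- is endogenous.

    ```stata
    sysuse auto, clear
    generate peso = 1          // made-up weights, as in the eprobit example

    * Ordinary probit: price and trunk are treated as exogenous regressors
    probit foreign mpg price trunk if (rep78!=. & displacement>0) [pw=peso]

    * Average marginal effects on Pr(foreign)
    margins, dydx(*)
    ```

    Comparing these estimates with the -eprobit- ones gives a rough sense of how much the endogeneity correction matters.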

    Finally, you ask about effects, in particular average marginal effects (average partial effects). If you want them to have a structural interpretation (the most useful way of thinking about these effects), as we have discussed in this forum, I would type:


    Code:
    . generate pt = price 
    
    . generate tt = trunk 
    
    . generate wt = weight 
    
    . generate lt = length
    
    . margins, dydx(*) predict(base(price=pt trunk=tt weight=wt length=lt))
    
    Average marginal effects                        Number of obs     =         69
    Model VCE    : Robust
    
    Expression   : Pr(foreign==Foreign), predict(base(price=pt trunk=tt weight=wt length=lt))
    dy/dx w.r.t. : mpg price trunk weight length
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             mpg |  -.0075061   .0072834    -1.03   0.303    -.0217813    .0067691
           price |  -.0002802   .0001887    -1.48   0.138      -.00065    .0000897
           trunk |   .0518648    .035152     1.48   0.140    -.0170319    .1207616
          weight |          0  (omitted)
          length |          0  (omitted)
    ------------------------------------------------------------------------------
    Here everything is conventional except for the -predict(base(...))- option. Notice that I created copies of the variables in the model and passed them to base(). This tells -margins- not to take derivatives with respect to the residuals from the reduced-form equations. As I mentioned in previous posts, we are going to make this process less cumbersome, but for now this is one way to get a structural effect.
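
    Applied to the commands in post #1, the bracket fix might look like the sketch below. The variable names are taken from that post and nothing here has been run on the actual data. Several points are assumptions: -migrant- is listed only inside the parentheses rather than also among the exogenous regressors, `$cmp cont $cmp probit` is read as the macros `$cmp_cont` and `$cmp_probit` that `cmp setup` defines, and the first -cmp- equation is marked as probit because -migrant- is binary. Note also that -ivprobit- is designed for continuous endogenous regressors, so with a binary -migrant- the first command shows the syntax only.

    ```stata
    * ivprobit: -if- without brackets, weights in brackets, options after the comma
    ivprobit school age age2 eldest i.fteducst i.mteducst fth_absent urban1   ///
        (migrant = oilpricewhenmigrantis31)                                    ///
        if age >= 6 & age <= 17 & marital != 4 & marital != 5 & yrbirth1 == .  ///
        [pweight = expan_indiv], vce(cluster hhid) first

    * cmp is community-contributed (ssc install cmp); cmp setup defines $cmp_* macros
    cmp setup
    cmp (migrant = oilpricewhenmigrantis31 age age2 eldest                     ///
            i.fteducst i.mteducst fth_absent urban1)                           ///
        (school = migrant age age2 eldest                                      ///
            i.fteducst i.mteducst fth_absent urban1)                           ///
        if age >= 6 & age <= 17 & marital != 4 & marital != 5 & yrbirth1 == .  ///
        [pweight = expan_indiv],                                               ///
        indicators($cmp_probit $cmp_probit) vce(cluster hhid)
    ```

    In both commands the qualifier order is the same: equations first, then -if-, then the weight in square brackets, then the comma and options.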


    • #3
      Enrique Pinzon (StataCorp), thanks a lot for this very useful reply! I just have a few questions regarding the example you suggested:
      1. Is it possible to add control variables to the IV equation? In your example, do I just add the control variables after "mpg weight" and "mpg length" in the price and trunk equations, respectively?
      2. Are the independent variables of the main equation added to the IV equation? In the example above, is mpg used in the IV equations of price and trunk because it is an independent variable in the main equation?
      3. Can I still add vce(cluster) in the eprobit regression?
      4. I was first thinking about creating two separate equations, one for the impact of migration on schooling and another for the impact of remittances on schooling. If I instead use eprobit and estimate the impact of migration and remittances in the same equation, as in your example, will I get the same coefficients as when estimating the impact of each variable in a separate equation? If not, should I then rely on ivprobit or cmp ivprobit?
      Thanks again for your support! I really appreciate it.
      Last edited by Salma Nooh; 31 Jul 2020, 15:30.


      • #4
        Hello Salma,

        1. You can add control variables to the main equation or to the control equations. The extended regression framework is flexible in that way.
        2. -ivprobit- and -ivregress- automatically add the independent variables of the main equation to the -iv- equation, whereas -eprobit- requires that you list every instrument you want in the -iv- equations yourself.
        3. Yes, vce(cluster) is available.
        4. With -eprobit- you get results equivalent to what you would get with -ivprobit-. One of the big differences is that with -eprobit- you may have different instruments for each endogenous variable. Below is an example of the equivalence between the two commands:

        Code:
        . sysuse auto, clear 
        (1978 Automobile Data)
        
        . quietly ivprobit foreign mpg (price= trunk)
        
        . estimates store ivprobit
        
        . quietly eprobit foreign mpg, endogenous(price = trunk mpg)
        
        . estimates store eprobit 
        
        . estimates table ivprobit eprobit
        
        ----------------------------------------
            Variable |  ivprobit     eprobit    
        -------------+--------------------------
        foreign      |
               price | -.00035278   -.00035277  
                 mpg |  -.0660362   -.06603027  
               _cons |   3.460684     3.460479  
        -------------+--------------------------
        price        |
                 mpg |  -220.1691   -220.16536  
               trunk |  43.548465    43.557384  
               _cons |  10255.177    10254.975  
        -------------+--------------------------
          /athrho2_1 |  2.4388651               
           /lnsigma2 |  7.8569283               
         var(e.price)|               6674844.6  
        corr(e.price,|
           e.foreign)|                  .98488  
        ----------------------------------------
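
        Putting points 1 to 3 together for the schooling model from post #1, a specification could be sketched as follows. The variable names are taken from that post and the command has not been run on the actual data. The `probit` suboption inside endogenous() is used on the assumption that -migrant- is a binary variable, and the main-equation controls are repeated in endogenous() because, per point 2, -eprobit- does not add them automatically.

        ```stata
        * Main equation: school on exogenous controls; migrant enters via endogenous()
        eprobit school age age2 eldest i.fteducst i.mteducst fth_absent urban1    ///
            if age >= 6 & age <= 17 & marital != 4 & marital != 5 & yrbirth1 == . ///
            [pweight = expan_indiv],                                              ///
            endogenous(migrant = oilpricewhenmigrantis31 age age2 eldest          ///
                       i.fteducst i.mteducst fth_absent urban1, probit)           ///
            vce(cluster hhid)
        ```

        Any further controls for the -migrant- equation would be listed inside endogenous(), after the instrument.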


        • #5
          Thanks a lot for answering my questions! I really appreciate it.
