Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • st_store() and stata() when using moptimize(): IV qreg

    Hi, I have a Mata function that creates a column vector (called yd below), passes it to Stata as a tempvar, and then calls a Stata package using said tempvar. (If you're curious, this is for instrumental variable quantile regression, as suggested by Chernozhukov and Hansen (2008) "Instrumental variable quantile regression: A robust inference approach" J. of Econometrics.)

    Code:
    mata: ok=st_local("touse")
    
    void objFun(M,todo,b,crit,s,H)
    {
        real scalar tau
        real matrix y,x,d,z            /* structure elements */
        string scalar dep,instr,reg,ok,quant
        struct ivqregInfo scalar G
        
        real matrix yd,beta,var
            
        y=moptimize_util_depvar(M,1)
        G=moptimize_util_userinfo(M,1)
        
        // new Y
        yd=y :- G.d*b'
        
        // pass to Stata
        (void) st_addvar("double", newY=st_tempname())
        st_store(.,newY,ok, yd)
    
        
        stata("qui qreg "+newY+" "+G.instr+" "+G.reg+" if "+G.ok+", q("+G.quant+") ")
        
        beta=st_matrix("e(b)")[1,1]
        var=st_matrix("e(V)")[1,1]
        
        crit=beta^2/var
    }
    end
    I get the following error: "st_store(): 3200 conformability error", which I understand means that Mata is passing a vector of the incorrect size.

    However, I can get this code snippet to run just fine outside of the Mata function:

    Code:
    mata: 
        
        yd=y :- d*100
        (void) st_addvar("double", newY=st_tempname())
        st_store(.,newY,ok, yd)
        stata("mean "+newY+" ")
        
    end
    It seems the error has to do with the selectvar within st_store() because I can get the entire routine to work if I first "keep if selectvar==1". For example, this works:

    Code:
        keep if sex==1
        mata: moptimize(M)    
        mata: moptimize_result_coefs(M)
    Any advice on what I may be doing incorrectly? I didn't want to overload with information. I can post the entire code if needed.

  • #2
    Hello Travis,

    If I understood your code correctly, the error might be produced because you are passing back to Stata (using st_store(.,newY,ok, yd)) a vector that does not match the dimension of your data set (number of rows v.s number of obs in Stata)

    If you post the entire code for replication or a self-contained example it could be easier to help you


    Best!


    Álvaro
    Stata 16.1 MP
    Win10/Linux Mint 19.1
    https://alvarogutyerrez.github.io/
    Last edited by Alvaro Gutierrez; 11 Aug 2020, 07:47.

    Comment


    • #3
      Thank you, Alvaro. Here's a longer explanation.

      The code is for instrumental variable quantile regression, as suggested by Chernozhukov and Hansen (2008) "Instrumental variable quantile regression: A robust inference approach". Their original suggestion is to use a grid search, which works fine (see first chunk of code below). But I would like to use Nelder-Mead to optimize the objective function.

      Here's the idea. You have an outcome Y, endogenous variable D, instrument Z and some covariates X. The goal is to find the optimal coefficient (beta) on D. The grid search algorithm (for a single endog variable and instrument) is:

      1. Guess a beta, and create a new outcome newY = (Y - D*beta)
      2. Run ordinary quantile of newY on Z and X.
      3. Let alpha be the coef on Z. Create the Wald statistic: alpha^2/variance(alpha)
      4. The optimal beta minimizes the Wald stat (i.e., drives alpha to zero).

      Here is code those does the grid search and can recreate Fig 5 of Chernozhukov and Hansen at the 85th quantile:

      Code:
      clear all
      
      * program to calc Wald stat
      capture program drop ivqrw
      program define ivqrw, eclass
      
          syntax varlist [if] [in] ,     ///
              [GUESS(real 0)                    ///
                  INSTRuments(varlist)  ///
                  Quantile(real 0.5)]
              
              
          marksample touse
          gettoken lhs rhs : varlist
          markout `touse' `lhs' `rhs'
          
          // Get D, X and Z
          tempname Z X D
          local Z : list instruments - rhs
          local X : list instruments - excluded_inst
          local D : list rhs - instruments
          
          
          tempvar newY
          qui gen `newY' = `lhs' - `guess'*`D' if `touse'
          
          qui qreg `newY' `Z' `X' if `touse', q(`quantile') 
          
          tempname W
          sca `W'=(_b[`Z']^2)/(_se[`Z']^2)
          
          ereturn scalar W=`W'
          
      end
      
      
      
      use "http://fmwww.bc.edu/repec/bocode/j/jtpa.dta", clear
      
      global Y earnings
      global X hsorged black hispanic married wkless13 class_tr ojt_jsa age2225 age2629 age3035 age3644 age4554 f2sms
      global Z assignmt
      global D training
      
      
      * CH use the projection of D on Z and X as an IV
      reg $D $Z $X
      predict dhat if sex==1, xb
      global Z dhat
      
      
      foreach gr of numlist -2500(100)7500 {
          di `gr'
          mat grid_index=nullmat(grid_index) \ (`gr')
          qui ivqrw $Y $D $X if sex==1, instr($Z $X) q(.85) guess(`gr')
          mat W=nullmat(W) \ e(W)
      }
      
      svmat grid_index
      svmat W
      
      sca chi05 =invchi2tail(1,0.05)
      
      line W grid_index, yline(`=chi05') ///
          xlabel(-2000(2000)8000) ylabel(0(5)20)
      
      sum W
      list grid_index if W==r(min)
      Grid search says the answer is alpha=3200 (same as the paper). The horizontal line is the 95% critical value from a chi-squared distribution (df=1).

      Here is my (not so good) attempt to do this with moptimize().

      Code:
      mata:
      struct ivqregInfo {
          real matrix y,x,d,z
          real scalar tau
          string scalar dep,instr,reg,ok,quant 
      } 
      struct ivqregInfo ivqregInfoInit(y,x,d,z,tau,dep,instr,reg,ok,quant)
      {
          struct ivqregInfo scalar G
          G.y=y
          G.x=x
          G.d=d
          G.z=z
          G.tau=tau
          
          G.dep=dep            /* string variable for Y */
          G.instr=instr        /* string variable for Z */
          G.reg=reg            /* string variable for X */
          G.ok=ok                /* string variable for touse */    
          G.quant=quant        /* string variable for quantile */    
          return(G)
      }
      
      void objFun(M,todo,b,crit,s,H)
      {
          real scalar tau
          real matrix y,x,d,z            /* structure elements */
          string scalar dep,instr,reg,ok,quant
          struct ivqregInfo scalar G
          
          real matrix yd,beta,var
              
          y=moptimize_util_depvar(M,1)
          G=moptimize_util_userinfo(M,1)
          
          // new Y
          yd=y :- G.d*b'
          
          // pass to Stata
          (void) st_addvar("double", newY=st_tempname())
          st_store(.,newY,ok, yd)
      
          
          stata("qui qreg "+newY+" "+G.instr+" "+G.reg+" if "+G.ok+", q("+G.quant+") ")
          
          beta=st_matrix("e(b)")[1,1]
          var=st_matrix("e(V)")[1,1]
          
          crit=beta^2/var
      }
      end
      
      local tau=.85
      global sex sex
      
      qreg $Y $Z $X if sex==1, q(`tau') 
      
      
      * start with ordinary qreg beta
      mata: beta_start=st_matrix("e(b)")[1,1]
      * use the s.e. for the step in Nelder-Mead
      mata: step=sqrt(st_matrix("e(V)")[1,1])
      
      
          /* Read in info */
          mata: tau     =strtoreal(st_local("tau"))
          mata: st_view(d=.,.,"$D","sex")
          mata: st_view(z=.,.,"$Z","sex")
          mata: st_view(x=.,.,"$X","sex")
          mata: st_view(y=.,.,"$Y","sex")
          
          /* read in string variables for stata() funciton in mata */
          mata: dep=st_global("Y")
          mata: instr=st_global("Z")
          mata: reg=st_global("X")
          mata: ok=st_global("sex")
          mata: quant=st_local("tau")
          
          // setup moptimize
          mata: M=moptimize_init()
          mata: moptimize_init_evaluator(M,&objFun())
          mata: moptimize_init_touse(M,"sex")    
          mata: moptimize_init_evaluatortype(M,"d0")
          mata: moptimize_init_technique(M,"nm")
          mata: moptimize_init_depvar(M,1,y)
          
          // D is the only var with a parameter
          mata: moptimize_init_eq_indepvars(M,1,d)
          /// .... with no constant
          mata: moptimize_init_eq_cons(M,1, "off")
          mata: moptimize_init_eq_coefs(M,1, beta_start)
           mata: moptimize_init_which(M, "min")
          mata: moptimize_init_nmsimplexdeltas(M,J(1,1,step))
          mata: G=ivqregInfoInit(y,x,d,z,tau,dep,instr,reg,ok,quant)
          mata: moptimize_init_userinfo(M,1,G)
          
          mata: moptimize(M)    
          mata: moptimize_result_coefs(M)
      This is where I get the error "st_store(): 3200 conformability error". I think something is wrong with the way I code the selectvar within the Mata function, because this little code snippet works outside a Mata function (picking beta=100 for illustration purposes):

      Code:
      mata: 
          
          yd=y :- d*100
          (void) st_addvar("double", newY=st_tempname())
          st_store(.,newY,ok, yd)
          stata("mean "+newY+" ")
          
      end
      Moreover, (perhaps another clue) if I only keep the "touse" sample (here, males where sex=1), the routine works just fine:

      Code:
          keep if sex==1
          mata: moptimize(M)    
          mata: moptimize_result_coefs(M)
      In fact, Nelder-Mead finds an optimum of 3105, which is pretty close to 3200.

      Comment


      • #4
        Not sure why exactly your code fails.

        An alternative approach would be to generate the new variable in Stata first and then use a Mata view to fill it with the new data:
        Code:
        tempvar newY
        gen double `newY' = .
        mata:
            st_view(newY=.,.,"`newY'","sex")
            newY[.,.] = y :- d*100
        end
        https://twitter.com/Kripfganz

        Comment


        • #5
          Thanks, Sebastian. Yes, that works outside of the Mata function objFun().

          Within the function objFun(), however, I want to pass the Mata vector newY to Stata in order to run ordinary quantile regression within Stata.

          Basically, it seems like Mata is not recognizing my selectvar once inside the function objFun(), perhaps in the context of moptimze(). I'll try to write a separate Mata function to point to outside of objFun() and see if that works.

          Comment


          • #6
            I did run your code and I think I will stick to my first theory attributing the error to the difference between the length of the vector `newY' and the dataset.
            When I run a mata: mata describe after the error you mentioned. I see that the length of y is equal to 5102, and then with the st_store() you are trying to fit it into your dataset which is matrix with 11,204 rows.
            Code:
                   41,080   real matrix                 x[5102,13]
                   40,984   real colvector              y[5102]
                   40,984   real colvector              z[5102]
            The solution proposed by Sebastian Kripfganz seems reasonable for me, IF, the problem is related to the dimensions of the vectors, but I might be wrong.

            Best!


            Álvaro
            Stata 16.1 MP
            Win10/Linux Mint 19.1
            https://alvarogutyerrez.github.io/

            Comment


            • #7
              Oh geez...I found the error. I didn't put the "G" in front of the "ok"...dang structures. Code works now with the below edit:

              Code:
               
               st_store(.,newY,G.ok, yd)

              Comment


              • #8
                Added in edit: Crossed with post #7. There's still a lesson here: if Stata complains about something, be sure you understand what it is seeing.

                When I precede your st_store() with
                Code:
                    "newY is >>> " + newY + " <<<"
                    "ok is >>> " + ok + " <<<"
                and run your code I get the following as the final output:
                Code:
                .     mata: moptimize(M)    
                  newY is >>> __000001 <<<
                  ok is >>>  <<<
                              st_store():  3200  conformability error
                                objFun():     -  function returned error
                      mopt__calluser_d():     -  function returned error
                       opt__eval_nm_d0():     -  function returned error
                             opt__eval():     -  function returned error
                    _optimize_evaluate():     -  function returned error
                       _mopt__evaluate():     -  function returned error
                   _moptimize_evaluate():     -  function returned error
                     _moptimize_search():     -  function returned error
                      moptimize_search():     -  function returned error
                             moptimize():     -  function returned error
                                 <istmt>:     -  function returned error
                r(3200);
                
                end of do-file
                
                r(3200);
                
                .
                which strongly suggests that at the time st_store is run, the third argument, which is supposed to be a string scalar with a Stata variable name, is instead null.

                When I replace your st_store() with
                Code:
                    st_store(.,newY,"sex", yd)
                the error does not occur.

                Basically, it seems like Mata is not recognizing my selectvar once inside the function objFun(), perhaps in the context of moptimze().
                It seems to me as if your string scalar ok is not being given a value properly, or something is overwriting it.
                Last edited by William Lisowski; 11 Aug 2020, 10:41.

                Comment

                Working...
                X