Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • robust standard errors using Mata-based d0 ml evaluator using _robust developers' function

    Dear Statalist,

    I have a model fitted using ml using a d0 Mata-based evaluator. Now, I am trying to calculate the robust errors using _robust developers' function. Unfortunately, given that I am working using d0 family (d0 it is necessary because I am working with a special case of a discrete choice model), when I try to calculate the score function in order to create the white/sandwich matrix robust for heteroscedasticity I got the following message: scores cannot be produced with method d0. Of course, it does totally make sense because I am not providing the ml with the analytical derivatives to calculate the score functions.

    That being said, my question is if it's possible to recover the numerical approximations of the score functions that
    ml calculates in order to use them as an input for _robust command to calculate "numerical" robust standard errors? This is also tremendously relevant in my case because I am planning to include analytical derivatives onto my model shortly (turning it into a d1 ml model), but I also want to have a way to check if my analytical results (both algebraical computations and code implementation) matches the ml numerical approximations of the score functions.

    Here I listed a self-contained example using a conditional logit (old clogit still kicking!), using _robust command to match robust standard errors, which in this specific case, happens to be, the same thing as vce(cluster individual_var) (this is also shown below).

    Just for the sake of concreteness, I want to state that my original problem is not a conditional logit (otherwise this post will be meaningless), but I prefer to exemplify my problem using it because both shared several critical points.




    (1) Data Generation: I used Mata just because it's faster creating the data. The real betas are 0.5 and 2 for attributes x1 and x2, respectively as you can see from matrix "betas". Furthermore, 2000 individuals are facing 5 choice situations each.

    Code:
    clear all
    set seed 157
    set obs 2000
    gen id = _n
    local n_choices =5
    expand `n_choices'
    bys id : gen alternative = _n
    
    local n_atrib = 2
    forvalues i = 1/`n_atrib' {
        gen x`i' =  runiform(-2,2)
    }
    
    gen choice = . // needs to be created
    matrix betas = (0.5 ,2)
    global MY_panel = "id"
    
    
    mata: 
    betas =st_matrix("betas")
    st_view(X = ., ., "x*")
    st_view(panvar = ., ., st_global("MY_panel"))
    npanels = max(panvar)
    X_id =X,panvar
    
    for(n=1; n <= npanels; ++n) { 
        x_n = select(X, (X_id[.,cols(X_id) ]:==n)) // selects X's for each individual n
        util_n =betas :* x_n
        util_sum =rowsum(util_n) 
        U_exp = exp(util_sum)
        p_i =     U_exp :/ colsum(U_exp)
        cum_p_i =runningsum(p_i)    
        rand_draws = J(rows(x_n),1,uniform(1,1)) 
        pbb_balance = rand_draws:<cum_p_i
        cum_pbb_balance = runningsum(pbb_balance)
        choice_n = (cum_pbb_balance:== J(rows(x_n),1,1))
        if (n==1) Y =choice_n    
        else       Y = Y \ choice_n    
    }
            
    new_data=.
    st_view(new_data, ., "x* choice ")
    new_data[.,cols(new_data)] = Y[.,1]
    
    end


    (2) clogit + _robust combo: Here, I show that, first of all, the DGP was correctly captured by clogit (given the obtained betas). Also, how robust standard error (third column) is equivalent to use cluster errors across individuals (fourth column), and even most importantly, how _robust developers function can be used to construct robust errors in clogit, (second column) which is my main goal with my own problem.

    Code:
    /* Robust/Cluster per ind */
    
    // clogit: No robust
    eststo  no_r :clogit choice x*  , group(id) nolog
    // _robust
    matrix D = e(V)
    matrix b = e(b)
    predict double _score , score
    _robust _score , v(D) cluster(id) 
    ereturn post b D, depn(choice) 
    eststo  r_robust :ereturn display
    // clogit: robust
    eststo  robust: clogit choice x*, group(id) vce(robust) nolog
    // clogit: cluster
    eststo  cluster: clogit choice x*  , group(id) vce(cluster id) nolog
    // All models
    esttab  , label    b(a4) se   ///
         nonumbers mtitles("no_r" "r_robust"  "vce(robust)" "vce(cluster)" )  ///
          replace
    ------------------------------------------------------------------------------------
                                 no_r        r_robust     vce(robust)    vce(cluster)   
    ------------------------------------------------------------------------------------
    choice                                                                              
    x1                         0.5137***       0.5137***       0.5137***       0.5137***
                            (0.03190)       (0.03158)       (0.03158)       (0.03158)   
    
    x2                         1.9802***       1.9802***       1.9802***       1.9802***
                            (0.05672)       (0.05538)       (0.05538)       (0.05538)   
    ------------------------------------------------------------------------------------
    Observations                10000                           10000           10000   
    ------------------------------------------------------------------------------------



    (3) simple ml replication of clogit: Finally, I replicate the clogit results using a Mata based d0 evaluator, showing how standard errors match the first column of my last table (clogit with no standard error corrections)

    Code:
    mata:
        void RRM_d0(transmorphic scalar M, real scalar todo,
        real rowvector b, real scalar lnf,
        real rowvector g, real matrix H)
     {
        /*-----------------------------*/
        /*----variables declaration----*/
        /*-----------------------------*/
        
        real matrix panvar  
        real matrix paninfo 
        real scalar npanels
    
        real scalar n 
        real matrix Y 
        real matrix X 
        real matrix x_n 
        real matrix y_n 
    
        // variables creation
        Y = moptimize_util_depvar(M, 1)     
        X= moptimize_init_eq_indepvars(M,1)    
    
        st_view(panvar = ., ., st_global("MY_panel"))
        paninfo = panelsetup(panvar, 1)     
        npanels = panelstats(paninfo)[1] 
        lnfj = J(npanels, 1, 0) 
    
        for(n=1; n <= npanels; ++n) {
            x_n = panelsubmatrix(X, n, paninfo) 
            y_n = panelsubmatrix(Y, n, paninfo) 
            // Linear utility        
            U = rowsum(b:* x_n) 
            // exp() Utility
            U_exp = exp(U)
            // Probability 
            p_i = colsum((U_exp :* y_n))  / colsum(U_exp)
            // log probability
            lnfj[n] = ln(p_i)
             }
        // Sum over individuals (n)     
        lnf = moptimize_util_sum(M, lnfj)
    }
    end
    
    global MY_panel id  // i is the identification variable per individuals
    sort $MY_panel
    ml model d0 RRM_d0() (clogit:  choice = x*, nocons) 
    ml maximize ,nolog
    
    . ml maximize ,nolog
    
                                                    Number of obs     =     10,000
                                                    Wald chi2(2)      =    1250.08
    Log likelihood = -1582.7261                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
          choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .5137069   .0319027    16.10   0.000     .4511788     .576235
              x2 |   1.980155   .0567223    34.91   0.000     1.868981    2.091328
    ------------------------------------------------------------------------------



    (4) Finally, the error...: Here is what was my original plan (where I get the error stated at the beginning).

    Code:
    matrix D = e(V)
    matrix b = e(b)
    predict double _score , score // here is where it fails.
    _robust _score , v(D) cluster(id)
    ereturn post b D, depn(choice) 
    ereturn display


    I do really hope this (longer than I expected) example could help to clarify what I am trying to do and where it failed specifically.
    Thanks for reading
    up to this point.
    Sincerely,

    Álvaro
    Stata 16.1 MP
    Win10

  • #2
    I encountered a similar sort of error when I tried to compute clustered standard errors (in fact, usual sandwich standard errors too) for my -gf0- evaluator. Any attempt to execute -ml (...), vce(robust) maximize- and -ml (...), vce(cluster clusterid) maximize- resulted in an error message saying something to the effect that Stata couldn't compute scores. The fix I've found is that I have to add the following command line to my evaluator:

    moptimize_init_by(M,clusterid)

    where -clusterid- is a column vector that refers to my cluster ID variable. Of course, prior to this command line, I have another command line that creates -clusterid- in Mata.
    Last edited by Hong Il Yoo; 23 Jun 2020, 06:29.

    Comment


    • #3
      Thanks for your answer Hong II Yo.
      I think that might be a solution, but the only problem is that, as far as I know, it is not working particularly in the context of d0 evaluators.
      After trying your solution I got what I expected which is the following error: option vce(robust) is not allowed with evaltype d0
      This is the reason why I initially started to use _robust developers' function to sort out the fact that d0 evaluators are not meant to work with robust nor cluster standard errors. This is because the d-family evaluators do not use observation-level scores as stated at Gould, W., Pitblado, J., & Sribney, W. (2006). Maximum likelihood estimation with Stata.

      Sincerely,
      Álvaro
      Stata 16.1 MP
      Win10

      Comment


      • #4
        I see, in that case, I'm wondering if you may find it more convenient to work with -gf0-, -gf1- and -gf2- instead of -d0-, -d1- and -d2-. At least when it comes to porting your code from -d0- to -gf0-, the change that you will have to make is minimal.

        Alternatively, I remember having come across -deriv()- function in the Mata manual, though I haven't used it myself.
        Last edited by Hong Il Yoo; 23 Jun 2020, 16:42.

        Comment

        Working...
        X