
  • scores prediction after using latest cmclogit vs. (old) clogit

    Dear Statalist,

    I have been working with the Discrete Choice Models commands released in Stata 16, and I am having problems with the scores option of predict after cmclogit. Concretely, what I am trying to replicate are the robust standard errors produced by cmclogit, using the developers' command _robust. The problem is that cmclogit reports one score per coefficient, whereas the old clogit reports the score of the log likelihood with respect to the linear index, $\partial \ln L / \partial (xb)$. Below is an example of the replication of robust standard errors using clogit and _robust.

    The main question is: is there any way, using cmclogit, to produce the score function that clogit produced or, equivalently, to use the scores from cmclogit to replicate robust standard errors with _robust?


    Below you may find an illustration of my question.

    1.- Data generation: This step just generates the data used in the example. It is not strictly relevant to the question but, to provide a bit of context, it simulates 100 decision-makers facing 3 alternatives that are entirely described by two attributes (x1, x2), with true parameters 0.5 and 2, respectively.


    Code:
    clear all
    set seed 157
    set obs 100
    gen id = _n
    local n_choices =3
    expand `n_choices'
    bys id : gen alternative = _n
    
    gen x1 =  runiform(-2,2)
    gen x2 =  runiform(-2,2)
    
    matrix betas = (0.5 ,2)
    global individuals = "id"
    
    mata: 
    // Calls from Stata the matrix "betas"
    betas =st_matrix("betas")
    // Generates a view of all attributes (M) x*
    st_view(X = ., ., "x*")
    // Generates a view of the individuals' id
    st_view(panvar = ., ., st_global("individuals"))
    // Extremely useful utilities for panel data
    paninfo = panelsetup(panvar, 1)     
    npanels = panelstats(paninfo)[1] 
       
    // Looping over individuals (n)
    for(n=1; n <= npanels; ++n) { 
        // Extract only the submatrix of individual n
        x_n = panelsubmatrix(X, n, paninfo) 
        
        // Linear utility
        util_n =betas :* x_n
        util_sum =rowsum(util_n) 
        U_exp = exp(util_sum)
        // Probability of each alternative
        p_i =     U_exp :/ colsum(U_exp)
        
    // Simulate the choice: flag the first alternative whose cumulative probability exceeds a uniform draw
        cum_p_i =runningsum(p_i)    
        rand_draws = J(rows(x_n),1,uniform(1,1)) 
        pbb_balance = rand_draws:<cum_p_i
        cum_pbb_balance = runningsum(pbb_balance)
        choice_n = (cum_pbb_balance:== J(rows(x_n),1,1))
        
        // Storing each individual choice.
        if (n==1) Y =choice_n    
        else       Y = Y \ choice_n    
    }
    // Creates a new Stata variable called "choice"    
    idx = st_addvar("float", "choice")
    // save the content of Y on "choice" Stata variable
    st_store(., idx, Y)
    end
    
    list  id choice  x*  in  1/6 , sepby(id)
         +-------------------------------------+
         | id   choice          x1          x2 |
         |-------------------------------------|
      1. |  1        1    -1.74743    1.652354 |
      2. |  1        0    1.185718   -.3220893 |
      3. |  1        0   -.5604994   -1.191313 |
         |-------------------------------------|
      4. |  2        1    .1203582    1.924556 |
      5. |  2        0    .4377979    .5250021 |
      6. |  2        0   -1.135918   -.0941735 |
         +-------------------------------------+
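
    For readers who prefer to avoid Mata, a minimal pure-Stata sketch of the same simulation is below; it plugs in the true parameters 0.5 and 2 directly and will not reproduce the exact draws above because the random-number sequence differs.

    Code:
    clear all
    set seed 157
    set obs 100
    gen id = _n
    expand 3
    bys id : gen alternative = _n
    gen x1 = runiform(-2,2)
    gen x2 = runiform(-2,2)
    
    // choice probabilities under the true parameters (0.5, 2)
    gen double expu = exp(0.5*x1 + 2*x2)
    bys id : egen double denom = total(expu)
    gen double p = expu/denom
    
    // one uniform draw per case; the chosen alternative is the first one
    // whose cumulative probability exceeds the draw
    bys id (alternative): gen double cum_p = sum(p)
    bys id (alternative): gen double draw  = runiform() if _n == 1
    bys id (alternative): replace draw = draw[1]
    bys id (alternative): gen double hits = sum(draw < cum_p)
    gen byte choice = hits == 1
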
    2.- clogit + _robust:

    Below I fit a clogit and show how to recover robust standard errors with the developers' command _robust. I also show the score prediction obtained from predict with the scores option (s_clog).

    Code:
    . clogit choice  x* , gr(id) nolog
    
    . predict double s_clog , scores
    
    . list id choice  x* s_clog  in  1/8 , sepby(id)
    
         +--------------------------------------------------+
         | id   choice          x1          x2       s_clog |
         |--------------------------------------------------|
      1. |  1        1    -1.74743    1.652354     .0640498 |
      2. |  1        0    1.185718   -.3220893     -.059243 |
      3. |  1        0   -.5604994   -1.191313    -.0048068 |
         |--------------------------------------------------|
      4. |  2        1    .1203582    1.924556    .07091144 |
      5. |  2        0    .4377979    .5250021   -.06192256 |
      6. |  2        0   -1.135918   -.0941735   -.00898888 |
         +--------------------------------------------------+



    Now, invoking _robust, I replicate the robust standard errors reported by clogit.

    Code:
    . // _robust 
    . matrix D = e(V)
    
    . matrix b = e(b)
    
    . _robust  s_clog  ,  v(D)  cluster(id)
    
    . ereturn post b D, depn(choice) 
    
    . ereturn display
    ------------------------------------------------------------------------------
          choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    choice       |
              x1 |    .426864   .1835416     2.33   0.020      .067129    .7865989
              x2 |    2.03195   .3242621     6.27   0.000     1.396407    2.667492
    ------------------------------------------------------------------------------
    
    . clogit choice  x* , gr(id) robust nolog
    
    Conditional (fixed-effects) logistic regression
    
                                                    Number of obs     =        300
                                                    Wald chi2(2)      =      40.85
                                                    Prob > chi2       =     0.0000
    Log pseudolikelihood = -50.698466               Pseudo R2         =     0.5385
    
                                         (Std. Err. adjusted for clustering on id)
    ------------------------------------------------------------------------------
                 |               Robust
          choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |    .426864   .1835416     2.33   0.020      .067129    .7865989
              x2 |    2.03195   .3242621     6.27   0.000     1.396407    2.667492
    ------------------------------------------------------------------------------
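
    A quick way to check that the two approaches agree numerically is sketched below; it repeats the steps above but keeps the adjusted matrix so it can be compared with the built-in cluster-robust VCE (the names s_chk and D2 are arbitrary).

    Code:
    clogit choice x* , group(id) nolog
    predict double s_chk , scores
    matrix D2 = e(V)
    _robust s_chk , v(D2) cluster(id)
    clogit choice x* , group(id) vce(cluster id) nolog
    display mreldif(D2, e(V))    // should be (very close to) zero
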


    3.- cmclogit + _robust:

    Below I use cmclogit to fit the same model, but the score prediction now returns one score per coefficient, i.e. the partial derivatives with respect to each parameter (s_cmc1 and s_cmc2).

    Code:
    . cmset id alternative
    
                caseid variable:  id
          alternatives variable:  alternative
    
    . cmclogit choice  x* , nocons nolog    // not robust
    
    Conditional logit choice model                 Number of obs      =        300
    Case ID variable: id                           Number of cases    =        100
    
    Alternatives variable: alternative             Alts per case: min =          3
                                                                  avg =        3.0
                                                                  max =          3
    
                                                      Wald chi2(2)    =      41.32
    Log likelihood = -50.698466                       Prob > chi2     =     0.0000
    
    ------------------------------------------------------------------------------
          choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    alternative  |
              x1 |    .426864   .1799369     2.37   0.018     .0741941    .7795338
              x2 |    2.03195   .3188147     6.37   0.000     1.407084    2.656815
    ------------------------------------------------------------------------------
    
    . predict double s_cmc* , scores
    
    . list  id choice  x* s_cmc*  in  1/6 , sepby(id)
    
         +---------------------------------------------------------------+
         | id   choice          x1          x2       s_cmc1       s_cmc2 |
         |---------------------------------------------------------------|
      1. |  1        1    -1.74743    1.652354   -.11192254    .10583297 |
      2. |  1        0    1.185718   -.3220893    -.0702455    .01908154 |
      3. |  1        0   -.5604994   -1.191313    .00269421     .0057264 |
         |---------------------------------------------------------------|
      4. |  2        1    .1203582    1.924556    .00853477    .13647304 |
      5. |  2        0    .4377979    .5250021   -.02710956   -.03250947 |
      6. |  2        0   -1.135918   -.0941735    .01021063    .00084651 |
         +---------------------------------------------------------------+


    Thank you a lot for your time.
    Álvaro

    Stata 16.1 MP
    Win10/Linux Mint 19.1
    https://alvarogutyerrez.github.io/

  • #2
    Álvaro, you might be confused about the following: when you have a single-equation estimator, you have only one score. When you have a system-of-equations estimator, you get more than one score, in fact as many scores as you have equations.

    I do not have Stata 16, and I do not know anything about -cmclogit-, but -clogit- is a single-equation estimator. From looking at your code, it seems to me that -cmclogit- is a multiple-equation estimator.



    • #3
      Dear Joro Kolev,

      Thanks for your answer. Indeed, I am quite confused, because both commands fit exactly the same model, as you can see from the coefficients, standard errors, and log likelihood. Hence, it was quite surprising to find such differences when using -predict, scores- for the exact same model specification.

      What you mention about score functions is true: as specified in the -predict- documentation, the scores option computes $\partial \ln L / \partial (x_j'\beta)$, where $\ln L$ is the log likelihood. Unfortunately, what I am trying to compute is $\partial \ln L / \partial \beta$, i.e. the derivative with respect to each parameter in my model. Also, my example used a linear-in-parameters Random Utility Maximization (RUM) model, but the problem I am actually working on, and which motivated this question, is not linear in parameters, therefore $\partial \ln L / \partial (x_j'\beta)$ is meaningless in such a case.
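
      For the linear-in-parameters conditional logit in the example above, the two objects can be written explicitly (a standard result, added here only for concreteness; $P_{ij}$ is the choice probability of alternative $j$ for case $i$, $y_{ij}$ the choice indicator, and $C_i$ the choice set):

      $\dfrac{\partial \ln L_i}{\partial (x_{ij}'\beta)} = y_{ij} - P_{ij}, \qquad \dfrac{\partial \ln L_i}{\partial \beta} = \sum_{j \in C_i} (y_{ij} - P_{ij})\, x_{ij}.$

      The first expression is what -clogit- returns observation by observation with -predict, scores- (one variable, s_clog above), while -cmclogit- returns the observation-level terms $(y_{ij} - P_{ij})\, x_{ij}$, one variable per coefficient (s_cmc1 and s_cmc2 above), which sum within a case to $\partial \ln L_i / \partial \beta$.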

      Thanks again for your time,
      Álvaro

      Stata 16.1 MP
      Win10/Linux Mint 19.1
      https://alvarogutyerrez.github.io/



      • #4
        If I understood your original post correctly, your index function is linear in parameters: b1*x1 + b2*x2 + (...). This means that the -cmclogit- score w.r.t. b1 is the -clogit- score times x1; the -cmclogit- score w.r.t. b2 is the -clogit- score times x2 and so on.
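
        In symbols, this is just the chain rule applied observation by observation: $\partial \ln L_{ij} / \partial b_k = [\partial \ln L_{ij} / \partial (x_{ij}'b)]\, x_{ij,k}$, which is why dividing the -cmclogit- score for b1 by x1 recovers the -clogit- score.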

        Let's assume that you have at least one regressor, say x1, which never takes a value of zero. To move from the -cmclogit- score to the -clogit- score, all you need to do is:

        predict double score_b*, score
        gen double score_xb = score_b1 / x1

        where score_b1 refers to the variable that stores the -cmclogit- score w.r.t. b1. The variable -score_xb- is the same as the score that you get after -clogit-, barring minor differences due to the loss of numerical precision.
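
        Using the simulated data from the original post (where x1 is essentially never exactly zero), a quick sketch of this check could look as follows; the variable names are arbitrary:

        Code:
        cmset id alternative
        cmclogit choice x* , nocons nolog
        predict double score_b* , scores
        gen double score_xb = score_b1 / x1        // valid because x1 != 0 here
        
        clogit choice x* , group(id) nolog
        predict double s_ref , scores
        gen double diff = reldif(score_xb, s_ref)
        summarize diff                             // max should be near zero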

        If all of your regressors take a value of zero in some observations, you can add a non-zero constant (say, 1) to any of them (say x1) before generating scores as follows:

        replace x1 = x1 + 1
        predict double score_b*, score
        gen double score_xb = score_b1 / x1

        Now in your follow-up post, you refer to a non-linear index model, and I think we need more information about your model specification before some of us can say anything useful.

        Since both -cmclogit- and -clogit- are intended for linear index models, I'm not sure how your original question is relevant to what you're looking for.

        I also find it hard to understand why $\partial LL / \partial X_j'*B$ is meaningless in the context of a non-linear index model. Ultimately, each "parameter" (i.e. equation) is linear in "coefficients", and $\partial LL / \partial X_j'*B$ is a score w.r.t. a "parameter" regardless of whether your index function is linear in that "parameter" or not. Here I use "parameter" and "coefficient" in the sense those terms are used in Stata. I think some key information is missing.
        Last edited by Hong Il Yoo; 25 Jul 2020, 06:25.



        • #5
          Dear Hong Il Yoo,

          Thanks for your answer; you have no idea how useful your comments were. To clarify, what I wanted to replicate was the robust standard errors of a conditional logit, using the partial derivatives of the log likelihood w.r.t. each parameter (beta), in the fashion that -predict, scores- after -cmclogit- does. At first I tried to do it using _robust, but it wasn't suitable for my purposes, so I decided to do it using just simple Stata commands.

          Just for the sake of illustration, this is the robust variance-covariance matrix that I was trying to replicate, which can be found on page 14 of the _robust PDF documentation.
          [Image: score.JPG, the cluster-robust variance formula from the _robust documentation]

          The subsequent code does it using just simple Stata commands (it uses the simulated data from above).

          Code:
          // Storing robust matrix from cmclogit 
          cmset id alternative
          cmclogit choice  x* , nocons nolog  robust 
          matrix V_r_cmclogit= e(V)
          
          // Replicating it using first principles 
          cmclogit choice  x* , nocons nolog 
          // Inverse of the hessian
          matrix D = e(V)
          // Scores w.r.t each attribute
          predict  double s_* ,scores
          // cluster adjustment (non-independent observations)
          bys id: egen double s_c_1 = total(s_1)
          bys id: egen double s_c_2 = total(s_2)
          // dropping unnecessary observations after the collapse (inner sum_{j \in C_i})
          bysort id: keep if _n==1
          // outer product of the cluster-level scores: -corr, covar- divides by N-1
          // and centers at the mean (which is ~0 at the MLE), so multiplying by
          // N = 100 clusters also applies the usual N/(N-1) small-sample adjustment
          qui corr s_c_1 s_c_2 ,covar
          matrix M = r(C)* 100 // 100 is the number of clusters
          // sandwich formula
          matrix V_r_manual = D*M*D
          // Manually replicated matrix 
          . mat li V_r_manual
          
          symmetric V_r_manual[2,2]
                           alternative:  alternative:
                                    x1            x2
          alternative:x1     .03368752
          alternative:x2     .03214068     .10514594
          
          
          . mat li  V_r_cmclogit
          
          symmetric V_r_cmclogit[2,2]
                           alternative:  alternative:
                                    x1            x2
          alternative:x1     .03368752
          alternative:x2     .03214068     .10514594
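
          As a quick check (assuming both matrices are still in memory), the two can also be compared directly:

          Code:
          display mreldif(V_r_manual, V_r_cmclogit)   // should be (very close to) zero
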
          What confused me the most in my previous example was the syntax of _robust, which only takes the equation-level scores $\partial \ln L / \partial (x_j'\beta)$ and then multiplies internally by the covariates X as required, so when I tried to do it using $\partial \ln L / \partial \beta$ I was basically including the X's twice.

          Regarding my follow-up post, I will elaborate a little bit more about it in the upcoming days, but for now, I think I can tackle my problem after having solved this issue.

          Thank you a lot for your time. Also, thanks for your answer to my other post; your suggestion about using Mata's -deriv()- function was very useful, and I think I will soon post a self-contained example of its usage as a follow-up answer to this post.
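
          In the meantime, a minimal sketch of the -deriv()- workflow is below (not the self-contained example mentioned above; the toy function and evaluation point are made up purely for illustration):

          Code:
          mata:
          // toy objective, purely illustrative: f(b) = exp(b1) + b2^2
          void myeval(real rowvector b, real scalar v)
          {
              v = exp(b[1]) + b[2]^2
          }
          
          D = deriv_init()
          deriv_init_evaluator(D, &myeval())
          deriv_init_evaluatortype(D, "d")
          deriv_init_params(D, (0.5, 2))
          g = deriv(D, 1)    // numerical gradient at (0.5, 2)
          H = deriv(D, 2)    // numerical Hessian
          g
          H
          end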

          Best regards,

          Álvaro
          Stata 16.1 MP
          Win10/Linux Mint 19.1
          https://alvarogutyerrez.github.io/



          • #6
            Thanks a lot for your kind words; glad to hear that I've said something useful.

