Dear Statalist,
I have been working with the latest Discrete Choice Models commands released in Stata 16 and currently, I am having problems using option scores when using predict after cmclogit. To be concrete, what I am trying to replicate is the robust standard errors produced by cmclogit using developers function _robust. The problem is that the cmclogitis reporting scores for each regressor instead of the old version of clogit that performs the score function of the loglikelihood $\partial(LL)/\partial(xb)$. Below is an example of the replication of robust standard errors using clogit and _robust.
The main question is, is there any what to, using cmclogit produce the score function that was produced by clogit or, equivalently, using scores from cmclogit to replicate robust standard errors using _robust?
Below you may find an illustration of my question.
1.-Data Generation: This step is just to generate data to be used by the example. Not totally really relevant to the question, but to provide a bit of a context, it emulates 100 decision-makers that face 3 alternatives that are entirely described by two attributes (x1,x2) but real parameters 0.5 and 2, respectively.
2.- clogit + _robust:
Below I fit a clogit and show how to recover robust standard errors using _robust developers' function. Also I show the score prediction obtained after using predict with option scores (s_clog).
Now invoking _robust I replicate standard errors for clogit.
2.- cmclogit + _robust:
Below I use cmclogit to fit the same model, but when performing the score prediction I got partial derivatives with respect to each attribute (s_cmc1 and s_cmc2).
Thank you a lot for your time.
Álvaro
Stata 16.1 MP
Win10/Linux Mint 19.1
https://alvarogutyerrez.github.io/
I have been working with the latest Discrete Choice Models commands released in Stata 16 and currently, I am having problems using option scores when using predict after cmclogit. To be concrete, what I am trying to replicate is the robust standard errors produced by cmclogit using developers function _robust. The problem is that the cmclogitis reporting scores for each regressor instead of the old version of clogit that performs the score function of the loglikelihood $\partial(LL)/\partial(xb)$. Below is an example of the replication of robust standard errors using clogit and _robust.
The main question is, is there any what to, using cmclogit produce the score function that was produced by clogit or, equivalently, using scores from cmclogit to replicate robust standard errors using _robust?
Below you may find an illustration of my question.
1.-Data Generation: This step is just to generate data to be used by the example. Not totally really relevant to the question, but to provide a bit of a context, it emulates 100 decision-makers that face 3 alternatives that are entirely described by two attributes (x1,x2) but real parameters 0.5 and 2, respectively.
Code:
clear all set seed 157 set obs 100 gen id = _n local n_choices =3 expand `n_choices' bys id : gen alternative = _n gen x1 = runiform(-2,2) gen x2 = runiform(-2,2) matrix betas = (0.5 ,2) global individuals = "id" mata: // Calls from Stata the matrix "betas" betas =st_matrix("betas") // Generates a view of all attributes (M) x* st_view(X = ., ., "x*") // Generates a view individuals id st_view(panvar = ., ., st_global("individuals")) // Extemely usefull utilities for panel data paninfo = panelsetup(panvar, 1) npanels = panelstats(paninfo)[1] // Looping over individuals (n) for(n=1; n <= npanels; ++n) { // Extract only the submatrix of individual n x_n = panelsubmatrix(X, n, paninfo) // Linear utility util_n =betas :* x_n util_sum =rowsum(util_n) U_exp = exp(util_sum) // Probability of each alternative p_i = U_exp :/ colsum(U_exp) // Probability balance of alternatives cum_p_i =runningsum(p_i) rand_draws = J(rows(x_n),1,uniform(1,1)) pbb_balance = rand_draws:<cum_p_i cum_pbb_balance = runningsum(pbb_balance) choice_n = (cum_pbb_balance:== J(rows(x_n),1,1)) // Storing each individual choice. if (n==1) Y =choice_n else Y = Y \ choice_n } // Creates a new Stata variable called "choice" idx = st_addvar("float", "choice") // save the content of Y on "choice" Stata variable st_store(., idx, Y) end list id choice x* in 1/6 , sepby(id) +-------------------------------------+ | id choice x1 x2 | |-------------------------------------| 1. | 1 1 -1.74743 1.652354 | 2. | 1 0 1.185718 -.3220893 | 3. | 1 0 -.5604994 -1.191313 | |-------------------------------------| 4. | 2 1 .1203582 1.924556 | 5. | 2 0 .4377979 .5250021 | 6. | 2 0 -1.135918 -.0941735 | +-------------------------------------+
Code:
2.- clogit + _robust:
Below I fit a clogit and show how to recover robust standard errors using _robust developers' function. Also I show the score prediction obtained after using predict with option scores (s_clog).
Code:
. clogit choice x* , gr(id) nolog . predict double s_clog , scores . list id choice x* s_clog in 1/8 , sepby(id) +--------------------------------------------------+ | id choice x1 x2 s_clog | |--------------------------------------------------| 1. | 1 1 -1.74743 1.652354 .0640498 | 2. | 1 0 1.185718 -.3220893 -.059243 | 3. | 1 0 -.5604994 -1.191313 -.0048068 | |--------------------------------------------------| 4. | 2 1 .1203582 1.924556 .07091144 | 5. | 2 0 .4377979 .5250021 -.06192256 | 6. | 2 0 -1.135918 -.0941735 -.00898888 | +--------------------------------------------------+
Now invoking _robust I replicate standard errors for clogit.
Code:
. // _robust
. matrix D = e(V)
. matrix b = e(b)
. _robust s_clog , v(D) cluster(id)
. ereturn post b D, depn(choice)
. ereturn display
------------------------------------------------------------------------------
choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
choice |
x1 | .426864 .1835416 2.33 0.020 .067129 .7865989
x2 | 2.03195 .3242621 6.27 0.000 1.396407 2.667492
------------------------------------------------------------------------------
. clogit choice x* , gr(id) robust nolog
Conditional (fixed-effects) logistic regression
Number of obs = 300
Wald chi2(2) = 40.85
Prob > chi2 = 0.0000
Log pseudolikelihood = -50.698466 Pseudo R2 = 0.5385
(Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
| Robust
choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .426864 .1835416 2.33 0.020 .067129 .7865989
x2 | 2.03195 .3242621 6.27 0.000 1.396407 2.667492
------------------------------------------------------------------------------
2.- cmclogit + _robust:
Below I use cmclogit to fit the same model, but when performing the score prediction I got partial derivatives with respect to each attribute (s_cmc1 and s_cmc2).
Code:
. cmset id alternative
caseid variable: id
alternatives variable: alternative
. cmclogit choice x* , nocons nolog (NOT ROBUST!)
Conditional logit choice model Number of obs = 300
Case ID variable: id Number of cases = 100
Alternatives variable: alternative Alts per case: min = 3
avg = 3.0
max = 3
Wald chi2(2) = 41.32
Log likelihood = -50.698466 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
alternative |
x1 | .426864 .1799369 2.37 0.018 .0741941 .7795338
x2 | 2.03195 .3188147 6.37 0.000 1.407084 2.656815
------------------------------------------------------------------------------
. predict double s_cmc* , scores
. list id choice x* s_cmc* in 1/6 , sepby(id)
+---------------------------------------------------------------+
| id choice x1 x2 s_cmc1 s_cmc2 |
|---------------------------------------------------------------|
1. | 1 1 -1.74743 1.652354 -.11192254 .10583297 |
2. | 1 0 1.185718 -.3220893 -.0702455 .01908154 |
3. | 1 0 -.5604994 -1.191313 .00269421 .0057264 |
|---------------------------------------------------------------|
4. | 2 1 .1203582 1.924556 .00853477 .13647304 |
5. | 2 0 .4377979 .5250021 -.02710956 -.03250947 |
6. | 2 0 -1.135918 -.0941735 .01021063 .00084651 |
+---------------------------------------------------------------+
Thank you a lot for your time.
Álvaro
Stata 16.1 MP
Win10/Linux Mint 19.1
https://alvarogutyerrez.github.io/
Comment