Clustered Mlogit missing Wald Chi

Sindra Sharma

Join Date: Apr 2015

Posts: 4
#1

02 Dec 2015, 17:05

Dear Statalisters,
My name is: Sindra Sharma

This is my first question on Statalist so apologies if there are errors in the form.

I am running -mlogit- with the vce(cluster) option.

My sample size is small (205). I am trying to cluster my standard errors at the village level (8 clusters). When I add vce(cluster), no Wald chi2 value is reported. Why would this be? The problem does not occur without the cluster option.

Thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30155
#2

02 Dec 2015, 17:22

There are several possibilities, and had you shown us the exact command you gave and the exact output that Stata gave you in response, we could probably be more helpful.

But let me make a guess. You have only 8 clusters. So if you have 6 or more variables in your model (and if you have categorical variables represented by multiple indicators, then each indicator counts as a separate variable), you have exhausted your degrees of freedom and there will be no overall model test. (Depending on how many levels your outcome variable has, the maximum number of variables you can include before exhausting your degrees of freedom can be even smaller.)

That said, you will still see Wald tests of the individual variables in the regression table output, and you can test various combinations of the variables using the -test- command.
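As a rough illustration of that last point, here is a minimal sketch of joint Wald tests after a clustered -mlogit-. The names outcome, x1, x2, x3, and village are hypothetical placeholders, not the original poster's variables, and the code is meant to be run from a do-file.

```stata
* Hypothetical variable names throughout: outcome, x1-x3, village.
* Fit the multinomial logit with cluster-robust standard errors.
mlogit outcome x1 x2 x3, vce(cluster village)

* Joint Wald test that x1 and x2 have zero coefficients
* in every outcome equation.
test x1 x2

* Wald test of a single variable across all equations.
test x3
```

Keeping the number of tested constraints below the number of clusters matters here: a test that asks for more constraints than the cluster-robust VCE can support will run into the same degrees-of-freedom limit as the overall model test.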

If that isn't your problem, then please copy your command and Stata's output from the Stata Results window and paste them directly into a code block here on the forum. (See FAQ for how to create a code block.) You will get more specific advice then.

Also, the norm in this community is to use our real first and last names for our user names. So, please use the Contact Us button to ask the Forum administrator to change your user name to Sindra Sharma. Thanks!
Sindra Sharma

Join Date: Apr 2015

Posts: 4
#3

08 Dec 2015, 07:01

Dear Clyde,
Thanks for your reply. It was extremely useful.

Also, please accept my apologies for responding so late; I had thought the question had gone unanswered. Once again, thanks for your help.
Thomas Kean

Join Date: Mar 2021

Posts: 9
#4

12 Jun 2022, 09:23

Hello Sindra and Clyde,

I have a similar issue: I get a missing value for chi2 (Stata/SE 17.0). I have run two regressions. In Regression 1, I get a chi2 value and can check whether my model is statistically significant. When I rerun the model with the vce(cluster variable) option (Regression 2), I get a missing value for chi2 and cannot check whether my model is statistically significant. However, Regression 2 seems to be the better model, as more variables are significant.

My question is as follows: How do I know which model is better?

Thank you very much for your help!

. logit TERMINATED CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w MTB_Refinitiv_w LEVERAGE_w

Iteration 0: log likelihood = -63.815899
Iteration 1: log likelihood = -33.030832
Iteration 2: log likelihood = -31.019907
Iteration 3: log likelihood = -30.875754
Iteration 4: log likelihood = -30.875427
Iteration 5: log likelihood = -30.875427

Logistic regression Number of obs = 101
LR chi2(14) = 65.88
Prob > chi2 = 0.0000
Log likelihood = -30.875427 Pseudo R2 = 0.5162

---------------------------------------------------------------------------------
TERMINATED | Coefficient Std. err. z P>|z| [95% conf. interval]
----------------+----------------------------------------------------------------
CSRRQ | -.2842261 .4313439 -0.66 0.510 -1.129645 .5611925
FRQ_111_w | -6.437695 2.955846 -2.18 0.029 -12.23105 -.6443431
1.HOSTILE | 4.362367 .8989699 4.85 0.000 2.600419 6.124316
1.MULTIBID | 1.363963 .9316441 1.46 0.143 -.4620259 3.189952
TOEHOLD | -.0056767 .0302713 -0.19 0.851 -.0650075 .053654
1.TENDER | -1.955211 1.28334 -1.52 0.128 -4.470511 .560089
1.INDUSTRY | -.9345645 1.039687 -0.90 0.369 -2.972314 1.103185
1.STOCK | .4648789 .9889564 0.47 0.638 -1.47344 2.403198
SIZE_w | .2414528 .3883149 0.62 0.534 -.5196304 1.002536
sdCFO_w | -.0005694 .0040501 -0.14 0.888 -.0085074 .0073686
GROWTH_w | .7304614 .5897109 1.24 0.215 -.4253508 1.886274
RoE_w | -1.01383 1.041253 -0.97 0.330 -3.054649 1.026988
MTB_Refinitiv_w | -.0839447 .1699159 -0.49 0.621 -.4169738 .2490845
LEVERAGE_w | .3295281 .3912153 0.84 0.400 -.4372398 1.096296
_cons | -7.899894 8.08393 -0.98 0.328 -23.74411 7.944318
---------------------------------------------------------------------------------

. logit TERMINATED CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w MTB_Refinitiv_w LEVERAGE_w, vce(cluster YEAR)

Iteration 0: log pseudolikelihood = -63.815899
Iteration 1: log pseudolikelihood = -33.030832
Iteration 2: log pseudolikelihood = -31.019907
Iteration 3: log pseudolikelihood = -30.875754
Iteration 4: log pseudolikelihood = -30.875427
Iteration 5: log pseudolikelihood = -30.875427

Logistic regression Number of obs = 101
Wald chi2(4) = .
Prob > chi2 = .
Log pseudolikelihood = -30.875427 Pseudo R2 = 0.5162

(Std. err. adjusted for 6 clusters in YEAR)
---------------------------------------------------------------------------------
| Robust
TERMINATED | Coefficient std. err. z P>|z| [95% conf. interval]
----------------+----------------------------------------------------------------
CSRRQ | -.2842261 .5091825 -0.56 0.577 -1.282205 .7137533
FRQ_111_w | -6.437695 3.720289 -1.73 0.084 -13.72933 .8539377
1.HOSTILE | 4.362367 .8443637 5.17 0.000 2.707445 6.01729
1.MULTIBID | 1.363963 .9642846 1.41 0.157 -.5260001 3.253926
TOEHOLD | -.0056767 .0153461 -0.37 0.711 -.0357546 .0244011
1.TENDER | -1.955211 1.163918 -1.68 0.093 -4.236449 .3260267
1.INDUSTRY | -.9345645 1.173526 -0.80 0.426 -3.234634 1.365505
1.STOCK | .4648789 .6301557 0.74 0.461 -.7702034 1.699961
SIZE_w | .2414528 .4762599 0.51 0.612 -.6919995 1.174905
sdCFO_w | -.0005694 .0037698 -0.15 0.880 -.0079581 .0068193
GROWTH_w | .7304614 .3415955 2.14 0.032 .0609466 1.399976
RoE_w | -1.01383 .5258354 -1.93 0.054 -2.044449 .0167879
MTB_Refinitiv_w | -.0839447 .1709889 -0.49 0.623 -.4190767 .2511874
LEVERAGE_w | .3295281 .1735415 1.90 0.058 -.010607 .6696632
_cons | -7.899894 9.893131 -0.80 0.425 -27.29007 11.49029
---------------------------------------------------------------------------------
Thomas Kean

Join Date: Mar 2021

Posts: 9
#5

12 Jun 2022, 11:20

The first screenshot is regression 1 and the second screenshot is regression 2.
Thomas Kean

Join Date: Mar 2021

Posts: 9
#6

12 Jun 2022, 12:13

The lrtest command does not allow comparing models with a robust VCE.
Here is my Stata code:

. * compare models

quietly logit TERMINATED CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w MTB_Refinitiv_w LEVERAGE_w
est store M1

quietly logit TERMINATED CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w MTB_Refinitiv_w LEVERAGE_w, vce(cluster YEAR)
est store M2

lrtest M2 M1
LR test likely invalid for models with robust VCE
r(498);
Clyde Schechter

Join Date: Apr 2014

Posts: 30155
#7

12 Jun 2022, 16:08

Not getting a chi square test for the model with -vce(cluster year)- is precisely because you have only 6 clusters defined by year. Since the number of explanatory variables in the model greatly exceeds that, no model chi square test is possible. There aren't enough degrees of freedom.
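To see this in the stored results, a minimal sketch along the following lines may help. It assumes the data from post #4 are in memory and is meant to be run from a do-file; the local macro name rhs is just a convenience introduced here.

```stata
* Sketch only: assumes the dataset from post #4 is loaded.
* rhs is a hypothetical local macro holding the regressors from the thread.
local rhs CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER ///
    i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w            ///
    MTB_Refinitiv_w LEVERAGE_w

quietly logit TERMINATED `rhs', vce(cluster YEAR)

* Number of clusters used for the robust VCE.
display "clusters          = " e(N_clust)

* Degrees of freedom Stata reports for the model test.
display "model df reported = " e(df_m)

* Model chi2; expected to be missing (.) when the test cannot be computed.
display "model chi2        = " e(chi2)
```

The cluster-robust VCE has rank at most the number of clusters minus one, so with 6 clusters it cannot support a joint test of all 14 coefficients.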

Why do you care? Is it really part of your research goals to test the null hypothesis that all those coefficients are simultaneously zero? That's a pretty unusual situation to be in. If that hypothesis test isn't a specific part of your research goals, then just ignore the absence of the model chi square. It's a test of a hypothesis that nobody cares about. If it is a specific goal of your research to test that hypothesis, then you can't use the clustered standard errors. Either use the regular standard errors, or if that is unacceptable, go bootstrap.
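If the overall test really is needed, the bootstrap route could be sketched roughly as follows. This assumes the data from post #4 are in memory; reps(500) and the seed value are arbitrary illustrative choices, and rhs is a hypothetical local macro introduced here for brevity. Run it from a do-file.

```stata
* Sketch only: assumes the dataset from post #4 is loaded.
* reps(500) and seed(12345) are arbitrary choices for illustration.
local rhs CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER ///
    i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w            ///
    MTB_Refinitiv_w LEVERAGE_w

* Observation-level bootstrap standard errors instead of vce(cluster YEAR).
logit TERMINATED `rhs', vce(bootstrap, reps(500) seed(12345))
```

With only 101 observations and this many regressors, some bootstrap replications may fail to converge, so it is worth checking how many replications actually completed before relying on the results.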

As for comparing the models, as you have seen, you cannot do a likelihood ratio test with robust or clustered standard errors. I suggest using AIC or BIC. -estat ic- after each regression will give you those statistics. That said, any of these tests is, in my opinion, a second-rate way to compare logistic models even when all of them are available. I would instead compare the discrimination (-lroc-) and calibration (-estat gof, group(10)-) of the two models to see which is more suitable for purpose.
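In practice, those three checks could be run after a fitted model roughly like this. The sketch assumes the data from post #4 are in memory and is meant for a do-file; rhs is a hypothetical local macro introduced here for brevity.

```stata
* Sketch only: assumes the dataset from post #4 is loaded.
local rhs CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER ///
    i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w            ///
    MTB_Refinitiv_w LEVERAGE_w

quietly logit TERMINATED `rhs'

* Information criteria (AIC and BIC).
estat ic

* Discrimination: area under the ROC curve, without drawing the graph.
lroc, nograph

* Calibration: Hosmer-Lemeshow test on 10 groups of predicted probabilities.
estat gof, group(10)
```

Repeating the same three commands after each candidate model gives a side-by-side comparison.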
Thomas Kean

Join Date: Mar 2021

Posts: 9
#8

13 Jun 2022, 02:02

Dear Clyde,

Thank you for your very helpful comment. I used AIC and BIC, and the results clearly showed that regression 2 is better. However, when using the estat gof, group(10) command, I receive the exact same result for both regressions (see below). From this, I would conclude that both models are equally good at predicting the regression outcome for the deciles? However, as I have to reject H0, do I have to conclude that the regression is not a good fit? Or could I still go with regression 2, as the overall number of observations is low?

. quietly logit TERMINATED CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w MTB_Refinitiv_w LEVERAGE_w, vce(robust)

. est store M3

. estat gof, group(10)
note: obs collapsed on 10 quantiles of estimated probabilities.

Goodness-of-fit test after logistic model
Variable: TERMINATED

Number of observations = 101
Number of groups = 10
Hosmer–Lemeshow chi2(8) = 16.28
Prob > chi2 = 0.0385

Thank you again!
Clyde Schechter

Join Date: Apr 2014

Posts: 30155
#9

13 Jun 2022, 09:08

Sorry, I forgot while responding that your models differed only in the presence or absence of cluster-robust standard errors. Neither -lroc- nor -estat gof- will distinguish these because they reflect only the predicted values, not the standard errors, and those two models produce the same predicted values.
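A quick way to confirm that the two models give the same predicted values is sketched below. It assumes the data from post #4 are in memory and is meant for a do-file; rhs, p_plain, and p_clust are hypothetical names introduced here.

```stata
* Sketch only: assumes the dataset from post #4 is loaded.
local rhs CSRRQ FRQ_111_w i.HOSTILE i.MULTIBID TOEHOLD i.TENDER ///
    i.INDUSTRY i.STOCK SIZE_w sdCFO_w GROWTH_w RoE_w            ///
    MTB_Refinitiv_w LEVERAGE_w

* Predicted probabilities from the model with conventional standard errors.
quietly logit TERMINATED `rhs'
predict double p_plain if e(sample), pr

* Predicted probabilities from the model with cluster-robust standard errors.
quietly logit TERMINATED `rhs', vce(cluster YEAR)
predict double p_clust if e(sample), pr

* Summarize differences, then stop with an error if any prediction differs.
compare p_plain p_clust
assert reldif(p_plain, p_clust) < 1e-8
```

The vce() option changes only the estimated variance of the coefficients, not the coefficients themselves, which is why the predictions coincide.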

In fact, I overlooked the most important thing regarding choice of standard errors here: you do not have nearly enough clusters to use -vce(cluster)-, regardless of what any test comparing the two models says. Cluster-robust (and ordinary robust) standard errors are large-sample statistics. While there is no universal agreement on how many clusters are needed, nobody thinks that 6 would suffice to produce valid results. I apologize for wasting your time on -estat ic-, -lroc-, and -estat gof-, when your question has such a simple and direct answer. You simply can't use -vce(cluster)- here.
Thomas Kean

Join Date: Mar 2021

Posts: 9
#10

14 Jun 2022, 11:30

Thank you, Clyde! No apologies necessary. You helped me a lot, and I learned something new.