Use of vce(cluster) legitimate?

Ellen Sterk

Join Date: Dec 2022

Posts: 21
#1

14 Dec 2022, 06:42

Hi all,

I am analysing a discrete choice experiment with the cmxtmixlogit command. My sample contains three different types of construction clients, which is why I thought of using "vce(cluster type of client)". However, I have since come across posts saying that the number of clusters should be "sufficient", and three does not seem to meet this criterion. So should I just use "vce(robust)" instead? How can I control for the different groups I have?
Note: the experiment was unlabelled, so entering type of client as a case variable is meaningless.

Thank you!
George Ford

Join Date: Aug 2014

Posts: 3138
#2

14 Dec 2022, 09:16

Three clusters are too few. But you could try it and then run boottest afterwards.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17704
#3

14 Dec 2022, 09:31

Ellen:
welcome to this forum.
With three clusters only, -vce(cluster clusterid)- standard errors are surely misleading.
In addition, -vce(robust)- won't help either. Quoting the -cmxtmixlogit- entry in the Stata .pdf manual:

Specifying vce(robust) is equivalent to specifying vce(cluster panelvar), where panelvar is the variable that identifies the panels.

Kind regards,
Carlo
(Stata 19.0)
Ellen Sterk

Join Date: Dec 2022

Posts: 21
#4

20 Dec 2022, 07:04

Thank you, George and Carlo, for your answers! That already helps a lot.
@Carlo: the variable that identifies my panels is actually the individual (> 1000 of them), each of whom made 8 choices. In that case -vce(robust)- would be OK, right?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17704
#5

20 Dec 2022, 08:25

Yes, but -vce(cluster clusterid)- would work out fine, too.

Kind regards,
Carlo
(Stata 19.0)
Miguel Henry

Join Date: Oct 2015

Posts: 9
#6

20 Dec 2022, 16:53

Kezdi, Gabor. 2004. “Robust Standard Error Estimation in Fixed-Effects Panel Models.” Hungarian Statistical Review Special(9): 96-116 shows that 50 clusters (with roughly equal cluster sizes) are often close enough to infinity for accurate inference.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2149
#7

20 Dec 2022, 19:43

I'll just add to the helpful remarks: You need a good reason to cluster on a variable. I don't see how clustering on type of client makes sense. Why not cluster on, say, years of schooling? Race? Experience in the workforce?

You have a large number of individuals and, as best I can tell, the assignment of the scenarios was done at the individual level. If so, cluster at the individual level and no more.
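
A minimal sketch of what clustering at the individual level could look like with -cmxtmixlogit-. All variable names here (id, task, alt, choice, cost, quality) are hypothetical placeholders, not taken from the original data, and the choice of which attribute gets a random coefficient is only for illustration:

```stata
* Hypothetical variables: id = respondent, task = choice task (1-8),
* alt = alternative within a task, choice = 0/1 indicator of the chosen
* alternative, cost and quality = attributes from the experiment.
cmset id task alt

* Cluster at the individual (panel) level; noconstant drops the
* alternative-specific constants, as one might do in an unlabelled design.
cmxtmixlogit choice cost, random(quality) noconstant vce(cluster id)

* According to the manual passage quoted in #3, vce(robust) is equivalent
* to clustering on the panel variable, so the standard errors should match.
cmxtmixlogit choice cost, random(quality) noconstant vce(robust)
```

With more than 1000 individuals there are plenty of clusters, so the few-clusters concern raised about type of client does not arise here.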
Ellen Sterk

Join Date: Dec 2022

Posts: 21
#8

20 Dec 2022, 23:52

These are very helpful, thank you all!