alternative for clustered standard errors when having too few clusters

Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#1


09 Jul 2019, 10:09

Hey everyone,

I have observations on management scores from firms that are nested in countries, so the observations are clustered.
To account for this clustering, I first thought of using clustered standard errors. Unfortunately, I have only 18 countries and therefore only 18 clusters, which means that clustered standard errors would suffer from small-sample bias.
What is an alternative to use in this case?

I also thought about using a fixed effects model to account for the unobserved heterogeneity, but as my key explanatory variable varies only across countries and not across firms, this is not possible.
So my second question is whether I should use -xtreg, re- or simply -reg-.

My model looks as follows, where i indexes firms and j indexes countries. My key explanatory variable is PDI, which varies only across countries and not across firms within a country.

Management_ij = a + b₁*PDI_j + b₂*x_ij + e_ij
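A minimal sketch of this model as pooled OLS with cluster-robust standard errors might look as follows. The variable names are borrowed from the regression posted later in this thread; `country` is an assumed numeric country identifier.

```stata
* Pooled OLS; standard errors clustered at the country level
regress management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000, vce(cluster country)

* Number of clusters actually used in the variance estimate (18 expected here)
display e(N_clust)
```

With this few clusters, the reported standard errors are the ones the question is worried about, so this is only the baseline against which alternatives would be compared.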

Thanks for any help in advance!!

Best,
Hanna
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

10 Jul 2019, 02:37

Hanna:
you may want to try -bootstrap- standard errors.
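One way this suggestion could be sketched is with the -bootstrap- prefix, resampling whole countries rather than individual firms. The variable names come from the regression posted later in the thread; `country`, the number of replications, and the seed are assumptions chosen for illustration.

```stata
* Pairs cluster bootstrap: each replication draws countries with replacement
bootstrap, cluster(country) reps(999) seed(12345): ///
    regress management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000
```

Whether resampling only 18 countries gives reliable inference is not settled in this thread, so the result is best read as a robustness check rather than a fix.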

Kind regards,
Carlo
(Stata 19.0)

Hanna Lanzinger

Join Date: Jun 2019
Posts: 8
#3

10 Jul 2019, 14:00

Hi Carlo,

thank you for your answer.
I also thought about using bootstrapped standard errors; I just thought that there might be a special kind (e.g. a block bootstrap) to use for the case described above.

Moreover I've got another question:

I ran the following regression

Code:

reg management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000

Then I wanted to test for heteroskedasticity, and I get two different results from -estat hettest- and -estat imtest, white-:

Code:

estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity 
         Ho: Constant variance
         Variables: fitted values of management_mcs

         chi2(1)      =     1.43
         Prob > chi2  =   0.2323



estat imtest, white

White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(41)     =     69.87
         Prob > chi2  =    0.0033

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |      69.87     41    0.0033
            Skewness |      19.14     11    0.0587
            Kurtosis |       0.46      1    0.4977
---------------------+-----------------------------
               Total |      89.46     53    0.0013
---------------------------------------------------

So my question now is: which one is the appropriate test to use?
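The two tests above look for different forms of heteroskedasticity, which is visible in their degrees of freedom: chi2(1) for the fitted values alone versus chi2(41) for the regressors, their squares, and cross-products. A sketch of how the variants could be run side by side, assuming -estat hettest- accepts the `rhs` and `iid` options with this factor-variable specification:

```stata
regress management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000

* Default: variance as a function of the fitted values, assuming normal errors
estat hettest

* Variance as a function of the right-hand-side variables, dropping the normality assumption
estat hettest, rhs iid

* White's general test: regressors, their squares, and cross-products
estat imtest, white
```

Because the alternatives differ, the tests can legitimately disagree, as they do here.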

Best,
Hanna


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

11 Jul 2019, 01:26

Hanna:
as far as your last question is concerned, you may find https://www.statalist.org/forums/for...roskedasticity really helpful.

Kind regards,
Carlo
(Stata 19.0)
Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#5

16 Jul 2019, 23:55

I found several techniques that can be used as alternatives to clustered standard errors when there are (too) few clusters:

- wild cluster bootstrapped t-statistics
- block bootstrapped t-statistics
- cluster-adjusted t-statistics

Unfortunately, none of them works with a random effects model.
Because my key explanatory variable does not vary within countries (the analogue of a time-invariant regressor in panel data), I can't use a fixed effects model and need to use a random effects model.
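For the pooled OLS version of the model, the first technique in the list could be sketched with the community-contributed -boottest- command. This assumes -boottest- is installed from SSC and that `reps()` and `weighttype()` behave as described in its help file; `country`, the seed, and the number of replications are illustrative choices.

```stata
* Community-contributed command; install once with: ssc install boottest
regress management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000, vce(cluster country)

* Wild cluster bootstrap-t test of H0: coefficient on pdi_100 equals zero
set seed 12345
boottest pdi_100, reps(9999) weighttype(webb)
```

This sketch applies to -regress- only and does not address the random effects case raised here.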

So my Question is:

Is there any procedure implemented in Stata as an alternative to the procedures listed above for random effects models?

Best,
Hanna
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#6

17 Jul 2019, 00:46

Hanna:
I'm not sure I got you right.
In your earlier post you shared -regress- code, which implies one wave of data only.
Now I see that you are switching to -xtreg-, which implies >=2 waves of data.
What happened in between the two posts?

Kind regards,
Carlo
(Stata 19.0)
Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#7

26 Jul 2019, 00:33

Hello Carlo,

it's not that I have panel data in the sense of >=2 waves of data.
I have "hierarchical" data, which means that observations are at the firm level and these firms are nested within countries. Therefore it is also possible to use -xtreg-

in the sense of:

Code:

xtset country firm

and not like:

Code:

xtset country time
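For nested cross-sectional data like this, -xtset- can also be given the grouping variable alone, since -xtreg, re- does not need a time variable. A minimal sketch, assuming `country` is a numeric identifier and reusing the regressors from the earlier post:

```stata
* Declare country as the grouping variable; no time variable is required
xtset country

* Random-effects GLS with a country-level random intercept
xtreg management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000, re
```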

Best regards,
Hanna
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#8

26 Jul 2019, 03:00

Hanna:
if you have a nested design, the first choice would be -mixed-, whose results may overlap with those of -xtreg, re mle-.
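A sketch of that comparison, assuming `country` is a numeric identifier and reusing the variable names from the earlier regression:

```stata
* Random intercept for country; -mixed- fits by maximum likelihood by default
mixed management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000 || country:

* Random-effects ML estimator for comparison
xtset country
xtreg management_mcs pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000, mle
```

Both fit a random-intercept model by maximum likelihood, so the coefficient estimates would be expected to agree closely.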

Kind regards,
Carlo
(Stata 19.0)
Ole Petersen

Join Date: Jan 2020

Posts: 1
#9

31 Jan 2020, 07:10

Dear everyone,
This is my first post here, so I hope I am doing it right. Concerning alternatives to clustered standard errors when there are too few clusters, I appreciate Carlo's advice to use -mixed-, which makes sense for nested data. But my understanding is that -mixed- does not solve the underlying problem this thread is about, namely alternatives to clustered standard errors when there are very few clusters, say 5 or 10. Does nesting in a mixed model solve that problem?

I have a data set with about 300 observations (the dependent variable is companies' cost when bidding for government contracts as a fraction of total contract spending) and use fractional regression because the dependent variable is bounded between 0 and 1. I have 5 industry clusters. I have read a number of previous posts here, the excellent econometric articles linked by Statalist colleagues, and the packages offered in this forum. But I did not find a suitable command for estimating a fractional response regression with few clusters (five in my case). Can anyone help?

Many thanks in advance, and I hope I posted this message correctly.

Kind regards,
Ole
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#10

31 Jan 2020, 11:44

Ole:
welcome to this forum.
Obviously, you're correct in stating that -mixed- does not resolve the -cluster- issue. My advice focused, as you surmised, on the nested structure of Hanna's dataset.
That said, 5 clusters are probably too few to impose such non-default standard errors.
I would probably stick with default standard errors and add -i.industry- as a predictor.
Finally, I would check whether -bootstrap- standard errors differ from their default counterparts.
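A sketch of that check for a fractional logit model. All variable names here (`cost_share`, `x1`, `x2`, `industry`) are hypothetical placeholders, as are the number of replications and the seed.

```stata
* Hypothetical names: cost_share (outcome in [0,1]), x1 x2 (covariates), industry (5 groups)
fracreg logit cost_share x1 x2 i.industry
estimates store default_se

* Same model with bootstrap standard errors
fracreg logit cost_share x1 x2 i.industry, vce(bootstrap, reps(999) seed(12345))
estimates store boot_se

* Coefficients and standard errors side by side
estimates table default_se boot_se, b se
```

The coefficients are identical across the two fits by construction; only the standard errors can differ.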

Kind regards,
Carlo
(Stata 19.0)