Cluster standard errors with only 5 clusters

Luca Dondi

Join Date: Jun 2016
Posts: 13

16 Aug 2016, 09:37

Good morning,
I am currently working with a panel dataset defined as follows:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int year byte(ID n_events) float(time treated did)
2000 1  0 0 0 0
2001 1  0 0 0 0
2002 1  0 0 0 0
2003 1  0 0 0 0
2004 1  0 0 0 0
2005 1  0 0 0 0
2006 1  1 0 0 0
2007 1  1 0 0 0
2008 1  2 0 0 0
2009 1  1 0 0 0
2010 1  1 0 0 0
2011 1  2 1 0 0
2012 1  0 1 0 0
2013 1  0 1 0 0
2014 1  0 1 0 0
2015 1  1 1 0 0
2016 1  2 1 0 0
2000 2  0 0 0 0
2001 2  0 0 0 0
2002 2  0 0 0 0
2003 2  0 0 0 0
2004 2  3 0 0 0
2005 2  0 0 0 0
2006 2  0 0 0 0
2007 2  2 0 0 0
2008 2  6 0 0 0
2009 2  0 0 0 0
2010 2  1 0 0 0
2011 2  2 1 0 0
2012 2  0 1 0 0
2013 2  1 1 0 0
2014 2  1 1 0 0
2015 2  2 1 0 0
2016 2  1 1 0 0
2000 3  0 0 1 0
2001 3  0 0 1 0
2002 3  2 0 1 0
2003 3  3 0 1 0
2004 3  6 0 1 0
2005 3  2 0 1 0
2006 3  2 0 1 0
2007 3  7 0 1 0
2008 3  5 0 1 0
2009 3  3 0 1 0
2010 3  0 0 1 0
2011 3  5 1 1 1
2012 3  4 1 1 1
2013 3  0 1 1 1
2014 3 26 1 1 1
2015 3 10 1 1 1
2016 3 13 1 1 1
2000 4  0 0 0 0
2001 4  0 0 0 0
2002 4  0 0 0 0
2003 4  0 0 0 0
2004 4  0 0 0 0
2005 4  0 0 0 0
2006 4  1 0 0 0
2007 4  0 0 0 0
2008 4  0 0 0 0
2009 4  0 0 0 0
2010 4  0 0 0 0
2011 4  1 1 0 0
2012 4  0 1 0 0
2013 4  0 1 0 0
2014 4  1 1 0 0
2015 4  0 1 0 0
2016 4  0 1 0 0
2000 5  4 0 0 0
2001 5  2 0 0 0
2002 5  9 0 0 0
2003 5 16 0 0 0
2004 5 23 0 0 0
2005 5 13 0 0 0
2006 5 17 0 0 0
2007 5 31 0 0 0
2008 5 28 0 0 0
2009 5  5 0 0 0
2010 5  3 0 0 0
2011 5 31 1 0 0
2012 5  7 1 0 0
2013 5  9 1 0 0
2014 5  3 1 0 0
2015 5  5 1 0 0
2016 5 16 1 0 0
end

where ID identifies the country of departure, n_events is my dependent variable (the number of boats that departed from each country), and did is my interaction term (time x treated), since I am using a difference-in-differences strategy. My claim is that after the treatment took place in country 3, the number of events there increased relative to the other countries.
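
For readers following along, the indicator variables can be rebuilt from the listing as a sanity check. This is only a sketch based on what the data above show (treatment period starting in 2011, country 3 as the only treated unit). The variable names ending in _chk are hypothetical and not part of the original dataset.

```stata
* Assumes the -dataex- listing above has been loaded into memory.
* The *_chk names are illustrative only; they are not in the posted data.
generate byte time_chk    = year >= 2011
generate byte treated_chk = ID == 3
generate byte did_chk     = time_chk * treated_chk

* Each assert stops with an error if a rebuilt variable differs from the posted one
assert time_chk    == time
assert treated_chk == treated
assert did_chk     == did

* Declare the panel structure, which xtreg needs before it will run
xtset ID year
```

If the three assert commands pass silently, did is exactly the product of the post-2011 indicator and the country-3 indicator.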
I have run the following regression:

xtreg n_events did i.year, fe

Given all that, my question is: should I cluster the standard errors at the country level? If so, is running the following regression sufficient?

xtreg n_events did i.year, fe cluster(ID)

However, I have read Mostly Harmless Econometrics, and the authors suggest that this procedure should be used only when the number of clusters is larger than 42 (in this case I have only 5 clusters). What should I do?

Thanks for the help,

Kind regards,

Luca

Tags: fixed effects, panel data, regression, standard errors

Stephen Jenkins

Join Date: Apr 2014

Posts: 1388
#2

16 Aug 2016, 09:39

Read Cameron and Miller, 'A practitioner's guide to cluster-robust inference', Journal of Human Resources, 50(2), Spring 2015, 317-372
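
One of the few-cluster approaches discussed in that paper is the wild cluster bootstrap. A minimal sketch of how it might be tried on the data from post #1 follows, assuming the community-contributed boottest command is installed (ssc install boottest). The option names are given as I understand them from its help file, so check help boottest before relying on them. The seed and the number of replications are arbitrary choices, not recommendations.

```stata
* Assumes the -dataex- data from post #1 are in memory.
* boottest is community-contributed, not official Stata: ssc install boottest

* Same DiD coefficient as -xtreg, fe-, written with explicit country dummies
regress n_events did i.year i.ID, vce(cluster ID)

* Wild cluster bootstrap test of H0: coefficient on did equals 0.
* Webb six-point weights are chosen here because 5 clusters give
* very few distinct Rademacher draws (2^5 = 32).
boottest did, cluster(ID) weighttype(webb) reps(9999) seed(12345) nograph
```

With only one treated cluster, as here, even bootstrap p-values can be unreliable, so this is a starting point for reading the paper rather than a recommended final specification.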