  • Cluster standard errors with only 5 clusters

    Good Morning,
    I am currently working with a panel data set defined as follows:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year byte(ID n_events) float(time treated did)
    2000 1  0 0 0 0
    2001 1  0 0 0 0
    2002 1  0 0 0 0
    2003 1  0 0 0 0
    2004 1  0 0 0 0
    2005 1  0 0 0 0
    2006 1  1 0 0 0
    2007 1  1 0 0 0
    2008 1  2 0 0 0
    2009 1  1 0 0 0
    2010 1  1 0 0 0
    2011 1  2 1 0 0
    2012 1  0 1 0 0
    2013 1  0 1 0 0
    2014 1  0 1 0 0
    2015 1  1 1 0 0
    2016 1  2 1 0 0
    2000 2  0 0 0 0
    2001 2  0 0 0 0
    2002 2  0 0 0 0
    2003 2  0 0 0 0
    2004 2  3 0 0 0
    2005 2  0 0 0 0
    2006 2  0 0 0 0
    2007 2  2 0 0 0
    2008 2  6 0 0 0
    2009 2  0 0 0 0
    2010 2  1 0 0 0
    2011 2  2 1 0 0
    2012 2  0 1 0 0
    2013 2  1 1 0 0
    2014 2  1 1 0 0
    2015 2  2 1 0 0
    2016 2  1 1 0 0
    2000 3  0 0 1 0
    2001 3  0 0 1 0
    2002 3  2 0 1 0
    2003 3  3 0 1 0
    2004 3  6 0 1 0
    2005 3  2 0 1 0
    2006 3  2 0 1 0
    2007 3  7 0 1 0
    2008 3  5 0 1 0
    2009 3  3 0 1 0
    2010 3  0 0 1 0
    2011 3  5 1 1 1
    2012 3  4 1 1 1
    2013 3  0 1 1 1
    2014 3 26 1 1 1
    2015 3 10 1 1 1
    2016 3 13 1 1 1
    2000 4  0 0 0 0
    2001 4  0 0 0 0
    2002 4  0 0 0 0
    2003 4  0 0 0 0
    2004 4  0 0 0 0
    2005 4  0 0 0 0
    2006 4  1 0 0 0
    2007 4  0 0 0 0
    2008 4  0 0 0 0
    2009 4  0 0 0 0
    2010 4  0 0 0 0
    2011 4  1 1 0 0
    2012 4  0 1 0 0
    2013 4  0 1 0 0
    2014 4  1 1 0 0
    2015 4  0 1 0 0
    2016 4  0 1 0 0
    2000 5  4 0 0 0
    2001 5  2 0 0 0
    2002 5  9 0 0 0
    2003 5 16 0 0 0
    2004 5 23 0 0 0
    2005 5 13 0 0 0
    2006 5 17 0 0 0
    2007 5 31 0 0 0
    2008 5 28 0 0 0
    2009 5  5 0 0 0
    2010 5  3 0 0 0
    2011 5 31 1 0 0
    2012 5  7 1 0 0
    2013 5  9 1 0 0
    2014 5  3 1 0 0
    2015 5  5 1 0 0
    2016 5 16 1 0 0
    end
    where ID identifies the country of departure, n_events is my dependent variable (the number of boats that departed from each country in a given year), and did is my interaction term, since I am using a difference-in-differences strategy. My claim is that after the treatment took place in country 3, the number of events increased relative to the other countries.
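    The three indicators in the listing above are consistent with the construction sketched below. This is an assumption read off the posted data (time switches to 1 in 2011, treated is 1 only for ID 3), not code taken from the original do-file. The names time2, treated2 and did2 are hypothetical and serve only to check the existing columns.

    ```stata
    * Sketch: rebuild the DiD indicators and compare them with the posted columns
    * Post-treatment period (2011 onwards in the posted data)
    generate byte time2 = (year >= 2011)
    * Treated country (ID 3 in the posted data)
    generate byte treated2 = (ID == 3)
    * DiD interaction term
    generate byte did2 = time2 * treated2
    * Stops with an error if any rebuilt indicator differs from the original
    assert time2 == time & treated2 == treated & did2 == did
    ```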
    I have run the following regression:

    xtset ID year
    xtreg n_events did i.year, fe

    Given all that, my question is: should I cluster the standard errors at the country level? If so, is the following regression adequate?

    xtreg n_events did i.year, fe cluster(ID)

    However, I have read Mostly Harmless Econometrics, and the authors suggest that this procedure should be used only when the number of clusters is larger than 42 (in this case I have only 5 clusters). What should I do?

    Thanks for the help,

    Kind regards,

    Luca

  • #2
    Read Cameron and Miller, 'A practitioner's guide to cluster-robust inference', Journal of Human Resources, 50(2), Spring 2015, 317-372
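
    One of the remedies for few clusters discussed in that paper is the wild cluster bootstrap. A minimal sketch follows, assuming the community-contributed boottest command (available from SSC) and the variable names from post #1. The seed and the number of replications are arbitrary choices for illustration. With a single treated cluster out of five, even a bootstrap p-value should be read with caution.

    ```stata
    * Sketch: wild cluster bootstrap test of the DiD coefficient
    * One-time install of the community-contributed command:
    *   ssc install boottest
    set seed 12345
    * Country and year dummies give the same point estimate for did as xtreg, fe
    regress n_events did i.year i.ID, vce(cluster ID)
    * Webb six-point weights are often suggested when clusters are very few
    boottest did, weighttype(webb) reps(9999) nograph
    ```

    The test clusters on ID because boottest picks up the clustering variable from the preceding estimation command.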
