Firm fixed effects and Robust Standard Errors Clustered at the Country-Year Level

Daniela Fuji

Join Date: Jul 2017

Posts: 15
#1

03 Aug 2017, 11:08

Dear all, I have read many of the threads before posting this question but could not find an answer. I have panel data (firms and years). The firms are from different countries, and I want to run a regression with firm fixed effects and robust standard errors clustered at the country-year level. I tried xtreg with clustering at the country-year level, but it returns the error "panels are not nested within clusters". As I understand it, this is because the same country appears repeatedly within a panel. I also created a new panel ID for each firm/country combination and clustered on it, but I still get the same error. What would be an approach to do this? Please bear with me, I am a new Stata user.
Any insight would be greatly appreciated!

Code:

xtreg y x1 x2 x3 x4 x5 i.year, fe cluster(country year)

gives me error: "panels are not nested within clusters".

and:

Code:

egen X=group(id country)

where id is the firm id.

Code:

xtreg y x1 x2 x3 x4 x5 i.year, fe vce (cluster X year)

panels are not nested within clusters

Last edited by Daniela Fuji; 03 Aug 2017, 11:11.
Sergio Correia

Join Date: Apr 2014

Posts: 420
#2

03 Aug 2017, 12:31

This should work:

Code:

ssc install reghdfe
reghdfe y x1 x2 x3 x4 x5, absorb(id year) vce(cluster id#country)
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#3

03 Aug 2017, 12:53

Thank you very much for your reply!

I ran the code and I noticed that the output I get is exactly the same output that I get when running this code (coefficients, their significance and Standard Errors are the same):

Code:

xtreg y x1 x2 x3 x4 x5 i.year, fe vce(robust)

which as you can see, clusters at the firm level, because my panel variable is firm id.

Why would this be the case?

Thank you again for your help! Very much appreciated!
Comment
Sergio Correia

Join Date: Apr 2014

Posts: 420
#4

03 Aug 2017, 13:50

My bad, if you want to have "standard errors at the country-year level" (i.e. one cluster per country-year tuple), then you need to do "vce(cluster country#year)".

If you just do as now (cluster by id#country), it would be the same as clustering by id (because firms don't change country), and that explains why you got the same results
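
One way to check this point is to build the id#country grouping explicitly and confirm that it coincides with the firm identifier. The sketch below is only an illustration on the Grunfeld data that ships with Stata; the country variable is made up (two firms per country), and idcountry, V1 and V2 are hypothetical names.

```stata
* Illustrative sketch; "country" is an assumed assignment, not real data
webuse grunfeld, clear
generate country = ceil(company/2)      // firms 1-2 -> country 1, 3-4 -> 2, ...

* One group per firm/country pair, i.e. the id#country interaction
egen idcountry = group(company country)

* Firms never change country, so each firm maps to a single group
bysort company (idcountry): assert idcountry[1] == idcountry[_N]

xtset company year
xtreg invest mvalue kstock i.year, fe vce(cluster company)
matrix V1 = e(V)
xtreg invest mvalue kstock i.year, fe vce(cluster idcountry)
matrix V2 = e(V)

* Relative difference between the two variance matrices (expected to be zero)
display mreldif(V1, V2)
```

If the assert passes, the two cluster variables define the same groups, which is why the two sets of standard errors coincide.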
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#5

03 Aug 2017, 14:25

Dear Sergio,

Of course, yes! I understand now.

Thank you!

Last edited by Daniela Fuji; 03 Aug 2017, 14:59.
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#6

05 Aug 2017, 15:48

Dear Sergio,
I have been trying for the past two days to estimate the model, and I used your code as well; however, all of my coefficients are insignificant. Is there any other approach you could suggest that gives me firm fixed effects with robust standard errors clustered at the country-year level?

As always, I would really appreciate your help!
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10213
#7

07 Aug 2017, 03:28

First, significance should play no role in your decision on what level to cluster your standard errors. Second, it is unusual to have country-year clusters, and perhaps you can provide some more information about your sample and your reasoning behind wanting to cluster at this level.

For a panel consisting of several firms observed over a number of years, you usually cluster at the firm level because you (rightly) believe that observations of a given firm are not independent over time. For example, if my variable is the number of workers and General Electric (GE) is one of the firms in my sample, I would be justified in believing that there is a strong correlation between the number of workers in successive years, e.g., 2015, 2016 and 2017. However, for another firm in the sample, say Walmart, the number of workers in 2015 will depend little on how many workers GE has in the same year.

In a cross-country panel of firms, clustering at the country level implies that observations of firms belonging to the same country are not independent. Partly, this is true because you have country-level macro factors that affect firms located in a given country, e.g., the political environment, taxation, inflation, GDP growth expectations, etc. However, assume that some firms are in retail and others in mining. If my variable of interest is net income, it may be the case that for a large part of the sample period the retail sector was suffering from low consumer confidence, leading to lower margins on sales, whereas strong demand from abroad was boosting the earnings of the mining firms. If such trends persisted between different sectors over my sample period, I would generally not be justified in concluding that there is a strong interdependence between observations of firms in the same country but in different industries. Industry clusters would therefore be more appropriate.

The time interdependence is generally more compelling, hence the pervasiveness of clustering at the firm level in panel data.
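
To see how the choice of cluster level changes the reported standard errors while leaving the coefficients untouched, something like the following could be run. It is a rough sketch on the Grunfeld data with a made-up country variable (two firms per country); by_firm and by_country are hypothetical names, and with only 10 firms and 5 countries the cluster-robust standard errors here are not reliable, so this only shows the mechanics.

```stata
* Illustrative sketch; "country" is an assumed assignment, not real data
webuse grunfeld, clear
generate country = ceil(company/2)
xtset company year

* Firm-level clusters: allows correlation within a firm over time
xtreg invest mvalue kstock i.year, fe vce(cluster company)
estimates store by_firm

* Country-level clusters: allows correlation across firms and years within a country
xtreg invest mvalue kstock i.year, fe vce(cluster country)
estimates store by_country

* Same point estimates, different standard errors
estimates table by_firm by_country, se keep(mvalue kstock)
```

Both calls satisfy the nesting requirement of xtreg, fe because every firm belongs to exactly one firm cluster and one country cluster.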
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#8

07 Aug 2017, 06:42

Andrew, many thanks for your detailed explanation! It is really helpful!
My sample consists of 525,300 firms from 50 countries, observed over 20 years. However, since my dependent variable of interest is measured at the firm-level, and my independent main variable of interest is measured at the country level (and as you mentioned the country level macro factors affect the firms located in a given country), I believe standard errors should ideally be clustered at the country-year level to address within country correlation in residuals. Does this make sense?

Any suggestions of yours are very welcome and I greatly appreciate your help!
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10213
#9

07 Aug 2017, 08:25

Clustering at the country level is fine if you believe that the interdependence exists within countries. However, by clustering at the country-year level, you are constraining this interdependence to particular years: observations of firms in China in 2015 are not independent of each other, but they are treated as independent of the observations of firms in China in 2016. This is a very strong and precarious assumption since the observations mostly belong to the same firms, i.e., you are ruling out temporal interdependence. My suggestion is to just cluster at the country level. The following example illustrates how the cluster groupings are constructed:

Code:

*// The Grunfeld dataset contains 200 observations of 10 firms observed over a 20 year period.

webuse grunfeld

*\\ ASSUME THAT THE FIRMS ARE LOCATED IN 5 COUNTRIES (2 FIRMS PER COUNTRY)
gen country= 1 if inlist(company, 1,2)
replace country= 2 if inlist(company, 3,4)
replace country= 3 if inlist(company, 5,6)
replace country= 4 if inlist(company, 7,8)
replace country= 5 if inlist(company, 9,10)

*\\ COUNTRY-YEAR GROUPING
egen countryyear= group(country year)
sort countryyear
l in 1/10, sepby(countryyear)
l in -10/l, sepby(countryyear)

Firms in the same country and year are grouped together, but not in different years.

Code:

     +-----------------------------------------------------------------------+
     | company   year   invest   mvalue   kstock   time   country   countr~r |
     |-----------------------------------------------------------------------|
  1. |       1   1935    317.6   3078.5      2.8      1         1          1 |
  2. |       2   1935    209.9   1362.4     53.8      1         1          1 |
     |-----------------------------------------------------------------------|
  3. |       1   1936    391.8   4661.7     52.6      2         1          2 |
  4. |       2   1936    355.3   1807.1     50.5      2         1          2 |
     |-----------------------------------------------------------------------|
  5. |       1   1937    410.6   5387.1    156.9      3         1          3 |
  6. |       2   1937    469.9   2676.3    118.1      3         1          3 |
     |-----------------------------------------------------------------------|
  7. |       1   1938    257.7   2792.2    209.2      4         1          4 |
  8. |       2   1938    262.3   1801.9    260.2      4         1          4 |
     |-----------------------------------------------------------------------|
  9. |       1   1939    330.8   4313.2    203.4      5         1          5 |
 10. |       2   1939    230.4   1957.3    312.7      5         1          5 |
     +-----------------------------------------------------------------------+


     +-----------------------------------------------------------------------+
     | company   year   invest   mvalue   kstock   time   country   countr~r |
     |-----------------------------------------------------------------------|
191. |       9   1950    43.48    274.6      376     16         5         96 |
192. |      10   1950     3.42    69.05     8.74     16         5         96 |
     |-----------------------------------------------------------------------|
193. |       9   1951    56.49    339.9      391     17         5         97 |
194. |      10   1951     4.67    83.04     9.07     17         5         97 |
     |-----------------------------------------------------------------------|
195. |      10   1952        6    74.42     9.93     18         5         98 |
196. |       9   1952    65.98    474.8      414     18         5         98 |
     |-----------------------------------------------------------------------|
197. |       9   1953    66.11      496      443     19         5         99 |
198. |      10   1953     6.53    63.51    11.68     19         5         99 |
     |-----------------------------------------------------------------------|
199. |       9   1954    49.34    474.5      468     20         5        100 |
200. |      10   1954     5.12    58.12    14.33     20         5        100 |
     +-----------------------------------------------------------------------+

So, these two commands are equivalent

Code:

*TO INSTALL TYPE ssc install reghdfe
reghdfe invest mvalue kstock ,  absorb(company year) vce(cluster country#year)
reghdfe invest mvalue kstock ,  absorb(company year) vce(cluster countryyear)
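
The same grouping also shows why xtreg, fe rejects country-year clusters with "panels are not nested within clusters": each firm is spread over many country-year groups instead of sitting inside one. A minimal sketch of that check, again on the Grunfeld data with the assumed two-firms-per-country assignment (first and n_clusters are hypothetical helper names):

```stata
* Illustrative sketch; "country" is an assumed assignment, not real data
webuse grunfeld, clear
generate country = ceil(company/2)
egen countryyear = group(country year)

* Count the distinct country-year clusters each firm falls into
bysort company countryyear: generate byte first = (_n == 1)
bysort company: egen n_clusters = total(first)
summarize n_clusters        // a value above 1 means panels are not nested

* xtreg, fe requires every panel to lie within a single cluster,
* so this call is expected to stop with the nesting error
xtset company year
capture noisily xtreg invest mvalue kstock i.year, fe vce(cluster countryyear)
```

reghdfe does not impose this nesting requirement, which is why the two commands above run with country-year clusters.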

Comment

Daniela Fuji

Join Date: Jul 2017

Posts: 15
#10

07 Aug 2017, 11:46

Dear Andrew, this is really helpful! Can't thank you enough!
So, correct me if I am wrong, but this would be the code to only cluster at the country level:

Code:

xtreg y x1 x2 x3 x4 x5 i.year, fe vce(cluster country)

Thank you again!
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10213
#11

07 Aug 2017, 15:28

That is correct. You may also use the shorter cluster() option, which xtreg accepts as a synonym for vce(cluster ...):

Code:

xtreg y x1 x2 x3 x4 x5 i.year, fe cluster(country)
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#12

08 Aug 2017, 03:28

You've helped me a lot! Thank you!!
Comment
Daniela Fuji

Join Date: Jul 2017

Posts: 15
#13

12 Aug 2017, 14:32

I apologize for bringing this thread up again, however, I have a concern.
Since I am following previous literature on my analysis, I have come to notice that they do not only include Year fixed effects, but also Country fixed effects. As explained above, my dataset is panel, with firms operating in different countries and observed over a timespan of 20 years. I have set the panel as:

Code:

xtset id year

I run :

Code:

reghdfe y x1 x2 x3 x4 x5, absorb(id year) vce(cluster country#year)

Maybe this is too straightforward, but I am failing to understand it at the moment. Does this also capture country fixed effects?
If I also include "country":

Code:

reghdfe y x1 x2 x3 x4 x5, absorb(id year country) vce(cluster country#year)

I get the message that :

country | 0 158 158 ? |
-----------------------------------------------------------------------+
? = number of redundant parameters may be higher

And, If I only cluster at the country-level, by running:

Code:

xtreg y x1 x2 x3 x4 x5 i.year i.country, fe cluster(country)

The country dummies get omitted because of collinearity.
I would appreciate any insights on how to approach this, and I apologize again if this has a very straightforward answer that I am failing to see because of my limited Stata knowledge.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10213
#14

13 Aug 2017, 13:34

To include country and year fixed effects, you have to specify country as the panel variable

Code:

xtset, clear
xtset country year
xtreg y x1 x2 x3 x4 x5 i.year, fe cluster(country)

If your panel variable is firm, you cannot include country dummies since these will be collinear with the firm fixed effects (firms are within countries). See also

https://www.statalist.org/forums/for...arity-using-fe
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10213
#15

15 Aug 2017, 04:07

I realize that the advice in #14 was a bit hasty: it will throw an error of repeated time values within panel unless you have exactly one firm per country. The important thing to note is that the inclusion of firm dummies already takes into account heterogeneity among countries. Therefore, country dummies are not necessary in the presence of firm dummies.

Code:

xtset firm year
xtreg y x1 x2 x3 x4 x5 i.year, fe cluster(country)
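
The redundancy of country dummies can be checked directly: if country never varies within a firm, the within transformation removes it together with the firm effect. The sketch below is only an illustration on the Grunfeld data, with a made-up country variable (two firms per country).

```stata
* Illustrative sketch; "country" is an assumed assignment, not real data
webuse grunfeld, clear
generate country = ceil(company/2)

* Country is constant within each firm, so it is time-invariant in the panel
bysort company (country): assert country[1] == country[_N]

* With firm fixed effects, the i.country terms carry no extra information
* and should be reported as omitted because of collinearity
xtset company year
xtreg invest mvalue kstock i.year i.country, fe vce(cluster country)
```

Dropping i.country from the last command should leave the remaining coefficients unchanged, since the firm effects already span the country effects.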
Comment
