Running Fixed effects at different level to id

Krissy Philips

Join Date: Nov 2016

Posts: 63
#16

13 Feb 2017, 09:55

Thanks Clyde for your advice.

1) I totally agree; I would not want to use something that is incorrect. To clarify: when you say firm-level effects, I assume you mean unobserved heterogeneity that hasn't been controlled for, since book value, market cap, stock beta, lagged return and a few other firm variables have in fact been controlled for. Given this, how would I show that firm-level effects are 0?

These, I think, would be the main firm-level effects on stock return; other effects would pertain more to the country, hence the country dummies.

2) In this light I wanted to double check: are country dummies with RE essentially the same as country FE as per the reghdfe command? They yield almost the same results in Stata, so I guess so.

3) I also forgot to mention a few problems with company-specific FE that made me reluctant to use them; I would really appreciate your thoughts on:

a) In my regression there are variables of interest, such as a "sin" dummy for whether the company is a sin stock, along with other time-invariant company measures (beta and a beverage dummy), which are omitted under company FE. This means computation of the Hausman test would not be accurate, forcing me to use RE (using the Mundlak test to determine adequacy).

b) Using company fixed effects gives highly insignificant estimates under both RE and FE, which is understandable given that I have more than 11000 companies over 120 time periods; I therefore thought that at the company level almost all variation is absorbed, leaving nothing significant to analyse. Country FE (simulated or via reghdfe) yields significance.

I therefore was seeing country fixed effects as a nice way around the above problems in the research - would you agree?

Last edited by Krissy Philips; 13 Feb 2017, 10:00.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#17

13 Feb 2017, 10:25

1. I don't work in finance or economics, so I can't comment on things like book value, etc., or whether your list is comprehensive. The variables you mention look, to me, to change over time within firm. So if they are relevant to the outcome and likely to differ between sin-industries and non-sin-industries, then they do need to be included explicitly as covariates. But the other general principle is that any time-invariant firm-level attributes that might affect the outcome also need to be dealt with to avoid omitted variable bias. In addition, there is the issue of non-independence of observations on the same firm over time. So the standard errors require adjustment for that. The use of a cluster robust VCE takes care of the latter but not the former. For the former you need fixed-effects or random-effects at the firm level in the model, unless you can show they are zero. I think the simplest way to test whether they are zero is to look at the very last line of output from -xtreg, fe- where you have -xtset firm-. That line will read "F test that all u_i = 0:..." and gives you the F statistic and p-value. If you reject the hypothesis that all u_i = 0, then the firm-level effects are not zero and you need to retain them. If you do not reject that hypothesis, then you could consider leaving them out, assuming you have a large number of firms so it isn't just a matter of an under-powered test. (Note that this -xtset firm- -xtreg, fe- sequence I am recommending to determine whether firm level fixed effects are important or not is not the same as what I am recommending as your actual analysis. This would be a preliminary step before the analysis. If I had to make a bet, I would bet that you will reject that all u_i = 0 hypothesis.)

2. I don't understand this question. Perhaps if you showed the actual commands for each that you are considering...

3. a. Yes, effects of any time-invariant attribute of a firm (and it certainly seems like being a sin-stock would be one of those) are inherently not estimable in a firm-level fixed effects model. Since these effects are a key aspect of your research goals, that would rule out using a firm-level fixed-effects model for your analysis.

b. I would never make a decision about which model to use on the basis of what it does or does not declare statistically significant. The choice of model needs to be made on the basis of the model's inherent properties and ability to estimate the effects that are relevant to the research question. In an ideal world, the model is chosen before you even gather the data. In observational studies, that is seldom possible and rarely practical even when possible. But it should certainly be done before you actually do the analysis. You can change that decision if the predicted values from the model (not the model coefficients or p-values) show poor fit to the data. You can change that decision if you can't get the model to converge. But never pick a model on the basis of what comes out statistically significant! p-values obtained from that process are inherently meaningless.

Based on our dialog in this thread, my sense is that the model should probably be a random-effects model at the firm level, with firm-level cluster robust VCE, and fixed effects (implemented as indicator variables) for country. So something along the lines of:

Code:

xtset firm year
xtreg outcome i.sin_industry i.country /*other covariates*/, re vce(cluster firm)

If, contrary to my expectation, you do find that the firm-level fixed effects are ignorable (see 1 above), then you could do

Code:

xtset country year
xtreg outcome i.sin_industry /*other covariates*/, fe
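The two-step workflow described in this post (a preliminary F test on the firm effects, then the firm-level random-effects model with country indicators) can be sketched end to end on simulated data. Everything in the data-generation part is a made-up assumption for illustration: the sample sizes, the single covariate x standing in for the other covariates, and all coefficient values. Only the -xtset- and -xtreg- calls mirror the commands above.

```stata
* Hypothetical panel: 40 countries x 25 firms x 10 years (all values invented)
clear
set seed 12345
set obs 40
generate country = _n
generate u_country = rnormal()                  // country-level effect
expand 25
bysort country: generate firm = (country - 1)*25 + _n
generate byte sin_industry = runiform() < 0.2   // time-invariant firm attribute
generate u_firm = rnormal()                     // unobserved firm-level effect
expand 10
bysort firm: generate year = 2000 + _n
generate x = rnormal()                          // stand-in for other covariates
generate outcome = 1 + 0.5*x - 0.3*sin_industry + u_country + u_firm + rnormal()

* Step 1 (preliminary only): are the firm-level effects jointly zero?
xtset firm year
xtreg outcome x, fe
* The last line of the output reports "F test that all u_i=0";
* the same test can be rebuilt from the stored results of -xtreg, fe-
display "F = " e(F_f) "  p = " Ftail(e(df_a), e(df_r), e(F_f))

* Step 2 (main model if the F test rejects): firm random effects,
* country indicators, firm-clustered standard errors
xtreg outcome i.sin_industry i.country x, re vce(cluster firm)
```

Because the simulated outcome includes a firm-level component u_firm, the preliminary test should be expected to reject here; with real data the result is of course an empirical matter.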
Krissy Philips

Join Date: Nov 2016

Posts: 63
#18

13 Feb 2017, 14:25

Does i.sin_industry represent my sin dummy, "sin"?

1. Yes, indeed; I think what is needed is to ask which time-invariant firm-specific effects could affect net return and would not already be picked up by the country dummies. But, as you thought, it looks as if I will reject the null, so firm effects will have to be retained; see below: [attachment]

2. Sorry, I meant that this (Sergio's user-written command for fixed effects at a different level):

reghdfe netreturn sin religiositymean sinreligiositymean l.beta l.return l.lmarketcap l.lpb bev lgdp l.spread l.inflationrate open law year1 year2 year3 year4 year5 year6 year7 year8 year9 year10, absorb(i.country_c)

is essentially the same as this (simulated country FE):

xtreg netreturn sin religiositymode sinreligiositymode l.beta l.return l.lmarketcap l.lpb bev lgdp l.spread l.inflationrate open law year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 i.country_c, re

3. b. is duly noted. Point a. is now where my dilemma lies: the inability to estimate "sin" does, after all, rule out company fixed effects. So I am left with the two estimations above as options, one of which, if I'm correct, simulates the other.

If I use -reghdfe-, there is no random-effects counterpart at a different level (country) that I can use to compute the Hausman test, leaving me with no formal test for choosing these fixed effects. If I use the second method (RE with simulated country FE), I will have to use the Mundlak version of the Hausman test to determine adequacy. But given that I found in 1. that company FE are not ignorable, this leaves me with a choice between being able to estimate what is important to the research (going ahead with the RE model you suggest and using the Mundlak test to determine adequacy) and finding a way round the fact that firm fixed effects are significant. Of course, no way is perfect! So I think RE with simulated country dummies is the way to go. If the Mundlak test shows that RE is inadequate, I can use the FE -reghdfe- in 2., which (based on your answer) is simulating the same equation, albeit at a different level?
Attached Files

Last edited by Krissy Philips; 13 Feb 2017, 15:03.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#19

13 Feb 2017, 19:10

Yes sin_industry in my code was your sin indicator.

1. Yes, there is a clear rejection of the null hypothesis here, so you cannot ignore firm-level effects.

2. I'm still a little confused by #2. In one command you have religiositymean and sinreligiositymean, and in the other you have religiositymode and sinreligiositymode. What's that about? In any case, I don't think that -xtreg, re- and -reghdfe- can be equivalent because, as far as I'm aware, -reghdfe- supports random effects. It wouldn't necessarily surprise me to learn that the results of the two models were not all that different, but it definitely would surprise me if they turned out to be the same. By the way, don't calculate a separate variable sinreligiosity. Use factor variable notation: i.sin##c.religiosity

3b. Yes, that is your dilemma. But perhaps there is a bright side. While sin is a fixed firm-level attribute, religiosity is not. In fact it's a country level attribute, and it varies over time. So the sin#religiosity interaction is estimable, even though sin by itself is not. Might that be sufficient for your research goals? That is, with a firm fixed-effects model you could estimate the extent to which national religiosity modifies the sin "penalty" even though you would not be able to estimate the sin penalty at any given level of religiosity. If that's not sufficient and you need to estimate the sin "penalty" itself, then a firm-level fixed effects model is simply out of the question. Random-effects will accommodate this.

In terms of your overall conclusion, I will just reiterate that I do not see the -reghdfe- approach as being equivalent to the -xtreg, re- approach. But perhaps I am missing something here.
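The point in 3b, that the sin#religiosity interaction remains estimable under firm fixed effects even though sin itself does not, can be checked on simulated data. This is only a sketch: the panel dimensions, the data-generating coefficients, and the variable names netreturn, sin and religiosity are assumptions chosen to echo the thread, not the poster's actual data.

```stata
* Hypothetical panel: 30 countries x 20 firms x 10 years (all values invented)
clear
set seed 2017
set obs 30
generate country = _n
expand 20
bysort country: generate firm = (country - 1)*20 + _n
generate byte sin = runiform() < 0.2        // fixed firm-level attribute
generate u_firm = rnormal()
expand 10
bysort firm: generate year = 2000 + _n
* religiosity is constant within a country-year but varies over time
bysort country year (firm): generate religiosity = runiform() if _n == 1
by country year: replace religiosity = religiosity[1]
generate netreturn = 0.2 - 0.5*sin + 0.3*religiosity ///
    - 0.4*sin*religiosity + u_firm + rnormal()

* Firm fixed effects: 1.sin should be flagged as omitted (collinear with
* the firm effects), while 1.sin#c.religiosity still gets a coefficient
xtset firm year
xtreg netreturn i.sin##c.religiosity, fe vce(cluster firm)
```

The interaction coefficient answers how religiosity modifies the sin "penalty"; the level of the penalty itself is absorbed by the firm effects.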
Krissy Philips

Join Date: Nov 2016

Posts: 63
#20

14 Feb 2017, 07:12

2. Sorry, they should consistently be either mean or mode; I've just used two methods for constructing the religiosity index. When you say reghdfe supports random effects, what do you mean? It is an FE-specific command and therefore cannot be used to estimate RE at a different level from the id level. Yes, the results are essentially the same but not equal, as seen in the attachment.

I saw the -reghdfe- approach as equivalent to the -xtreg, re- approach in theory, as it uses country fixed effects and ignores firm-level effects, whereas -xtreg, re- with country dummies is essentially controlling for country effects as well and ignoring firm-level effects.
Attached Files
Krissy Philips

Join Date: Nov 2016

Posts: 63
#21

15 Feb 2017, 09:24

Originally posted by Clyde Schechter

By the way, don't calculate a separate variable sinreligiosity. Use factor variable notation: i.sin##c.religiosity

Also, to add: whether I use sin##c.religiositymean or calculate sin*rm shouldn't make a difference to the results, should it? Using the former causes both individual terms (not the interaction) to be omitted due to collinearity. Very odd; do you know why this is the case?
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#22

15 Feb 2017, 12:43

Re #20. Sorry, that was a typo on my part. I meant to say that -reghdfe- does not support random effects. And seeing the two models you are referring to, yes they are equivalent.

Re #21: Whether you use factor variable notation or a pre-calculated sin*rm variable will not make any difference in the immediate results. But only if you use factor variable notation will you be able to use the -margins- command afterward. Since it's difficult to make sense of interaction model results, particularly when one of the variables is continuous, without using -margins-, it is really very desirable to use factor variable notation.

If you have firm level fixed effects in the model (whether due to -xtset firm- followed by -xtreg, fe-, or the inclusion of i.firm in the regression model), your sin variable is constant within firm, and is omitted due to collinearity with the fixed effects. This is perfectly normal in fixed-effects models and is not a problem. As for the religiosity variable, I do not understand why it would be omitted. You said in one of the earlier posts of this thread that this variable changes over time, so it is not collinear with country or firm, and it should stay in the model. The use of factor variable notation wouldn't affect that either way. (Note: If you are using factor variable notation with a continuous variable, you have to use the c. prefix with the continuous variable. sin##religiosity is wrong, because with no specification, Stata assumes the components of an interaction variable are discrete. So it has to be i.sin##c.religiosity.)
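To illustrate why factor variable notation matters for -margins-, here is a minimal sketch on simulated data. The data-generating values and the religiosity grid (0.2, 0.5, 0.8) are arbitrary assumptions; the variable names simply echo the thread.

```stata
* Hypothetical panel: 30 countries x 20 firms x 10 years (all values invented)
clear
set seed 2017
set obs 30
generate country = _n
expand 20
bysort country: generate firm = (country - 1)*20 + _n
generate byte sin = runiform() < 0.2
generate u_firm = rnormal()
expand 10
bysort firm: generate year = 2000 + _n
bysort country year (firm): generate religiosity = runiform() if _n == 1
by country year: replace religiosity = religiosity[1]
generate netreturn = 0.2 - 0.5*sin + 0.3*religiosity ///
    - 0.4*sin*religiosity + u_firm + rnormal()

xtset firm year
xtreg netreturn i.sin##c.religiosity, re vce(cluster firm)

* Predicted net return for sin and non-sin firms at chosen religiosity levels
margins sin, at(religiosity = (0.2 0.5 0.8))

* The sin "penalty" (sin vs. non-sin contrast) at each of those levels
margins, dydx(sin) at(religiosity = (0.2 0.5 0.8))
```

With a hand-built product variable such as sin*rm, Stata has no way of knowing that the product moves with sin and religiosity, so neither -margins- call would give the intended answer.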
Krissy Philips

Join Date: Nov 2016

Posts: 63
#23

15 Feb 2017, 13:06

Many thanks for your help. Yes, but what is odd is that the omission still occurs:

1) xtreg netreturn sin religiositymean sinreligiositymean l.beta l.return l.lmarketcap l.lpb bev lgdp l.spread l.inflationrate open law year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 i.country_c, re

causes nothing to be omitted, whereas

2) xtreg netreturn sin religiositymean i.sin##c.religiositymean l.beta l.return l.lmarketcap l.lpb bev lgdp l.spread l.inflationrate open law year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 i.country_c, re

does, as seen in the attachment??

I have also run into an issue, where I cannot run the Mundlak test to determine the efficiency of the re model as means cannot be calculated on the lagged variables. Leaving me with no test to justify my use of estimation methods! Hausman was ruled out due to inability to use company fe/estimate re at country level.

Given also I just realised the equation is dynamic in nature (netreturn=return-riskfreerate, and I have lagged l.return on RHS), I realized I should use xtabond perhaps, adding another complication into the mix!
Attached Files
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#24

15 Feb 2017, 14:51

OK, now I see what you're talking about with the omitted variables. You need to read the factor variables help file (-help fvvarlist-) and the corresponding manual section. When you specify i.sin##c.religiositymean, Stata responds to that by trying to add three variables to the model: i.sin, c.religiositymean and i.sin#c.religiositymean. But you already have sin and religiositymean by themselves in the model. And i.sin and c.religiositymean are exactly the same thing as those two variables, so they drop. When you use the ## notation, you don't separately specify the components. So the command should be:

Code:

xtreg netreturn i.sin##c.religiositymean l.beta l.return l.lmarketcap ///
    l.lpb bev lgdp l.spread l.inflationrate open law ///
    year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 ///
    i.country_c, re
// NOTE THAT sin and religiositymean FROM PREVIOUS COMMAND ARE NOT LISTED
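The expansion of the ## operator can be inspected directly. The sketch below uses the auto dataset that ships with Stata purely as a convenient stand-in (foreign plays the role of sin, mpg the role of religiositymean); it is not related to the poster's data.

```stata
sysuse auto, clear

* Show the terms that i.foreign##c.mpg expands to
fvexpand i.foreign##c.mpg
return list

* These two commands specify the same model: ## already includes
* both main effects, so they must not be listed a second time
regress price i.foreign##c.mpg
regress price i.foreign c.mpg i.foreign#c.mpg
```

Listing foreign and mpg again alongside i.foreign##c.mpg would put each main effect in the model twice, which is what triggers the collinearity omissions described above.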

Trivial point on style: lower case l (ell) is a difficult letter to read: it can look like an upper case I (eye), depending on the typeface. So to make my code as readable as possible, I usually specify lag operators using upper case L. Stata doesn't care which you use, but L is just easier for human eyes.

As for the Mundlak test and dynamic estimation, you are getting into details of econometric analysis that I simply don't know about and can't help you with.
Krissy Philips

Join Date: Nov 2016

Posts: 63
#25

18 Feb 2017, 17:11

Originally posted by Clyde Schechter

1.

Based on our dialog in this thread, my sense is that the model should probably be a random-effects model at the firm level, with firm-level cluster robust VCE, and fixed effects (implemented as indicator variables) for country. So something along the lines of:

Code:

xtset firm year
xtreg outcome i.sin_industry i.country /*other covariates*/, re vce(cluster firm)

If, contrary to my expectation, you do find that the firm-level fixed effects are ignorable (see 1 above), then you could do

Code:

xtset country year
xtreg outcome i.sin_industry /*other covariates*/, fe

Hi Clyde, thank you for the help and the tips (duly noted). With regard to clustering, I agree with you that the level should be firm, but I wanted to clarify the assumption I am making in clustering at firm level.

1. By clustering at firm level, I am assuming that firms across countries are more similar than firms within the same country. For example, firms within the same country are very different because they are in different industries, etc., but firms across countries in a similar industry are similar: an Indian alcohol company is more similar to a Japanese alcohol company than to an Indian tobacco firm.

2. By clustering at country level, I am assuming that firms within countries are more similar than firms across countries: the unobservables that affect stocks are similar within countries. I have read that clustering of SEs should take place at the highest level (so in this case country), so I wasn't sure whether my rationale in 1. overrode this...
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#26

18 Feb 2017, 18:17

By clustering at firm level, I am assuming that firms across countries are more similar than firms within the same country. For example, firms within the same country are very different because they are in different industries, etc., but firms across countries in a similar industry are similar: an Indian alcohol company is more similar to a Japanese alcohol company than to an Indian tobacco firm.

I have read that clustering of SEs should take place at the highest level (so in this case country), so I wasn't sure whether my rationale in 1. overrode this...

There are two different technical aspects of clustering at play here. One of them is the panel variable you select with your -xtset- command. The other is what you specify in your -vce(cluster)- option. They do not have to be the same. You can specify these separately (with some limitations.)

First let's talk about the panel variable level specified in -xtset-. By clustering at the firm level, you are implicitly assuming that observations within a given firm may be more similar than observations from different firms. Or, actually, you are not so much assuming that this is true as allowing for, and accounting for, that possibility (which, in real life, is almost always true). This aspect of "clustering" (which I prefer to call nesting to avoid confusion with the other kind of clustering) is accomplished, in -xtreg- by, in effect (though this is not how the calculations are done) including an indicator variable for each panel. So, it is much like doing a regression analysis in which each panel has a customized constant term that characterizes its observations and distinguishes those from the observations in other panels. It is this process that corrects for omitted variable bias attributable to time-invariant attributes of panels. Nesting should be specified at the lowest level possible. In your case, that means you should -xtset firm-.

As for the clustering accomplished by specifying -vce(cluster varname)-, this works by adjusting the standard errors to account for the fact that within levels of variable varname (be it firm or country, or anything else) there is lower variance than there is across levels of variable varname. You can choose varname to be the same as the panel variable in -xtset-, or you can choose it to be a higher level variable. Since it is certainly plausible that there will be lower variance of outcome within countries than across countries, and since your firms are nested within countries, it makes sense here to specify -vce(cluster country)-.

So, in summary, I would recommend

Code:

xtset firm year
xtreg outcome i.sin_industry /*other covariates*/, fe vce(cluster country)

So, the -xtset- uses firm, which is the lowest level within which nesting occurs, and the -vce- uses country which is the highest level within which there is likely to be some reduction of variance due to similarity.
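The separation between the nesting level in -xtset- and the clustering level in -vce()- can be sketched on simulated data in which firms are nested within countries. The panel dimensions, the covariate x, and the coefficient values are all assumptions made up for illustration. The sketch uses -re- rather than -fe-, following the correction given in #28 of this thread, so that the time-invariant sin_industry indicator is estimable.

```stata
* Hypothetical panel: 40 countries x 25 firms x 10 years (all values invented)
clear
set seed 12345
set obs 40
generate country = _n
generate u_country = rnormal()
expand 25
bysort country: generate firm = (country - 1)*25 + _n
generate byte sin_industry = runiform() < 0.2
generate u_firm = rnormal()
expand 10
bysort firm: generate year = 2000 + _n
generate x = rnormal()                      // stand-in for other covariates
generate outcome = 1 + 0.5*x - 0.3*sin_industry + u_country + u_firm + rnormal()

* Nesting at the lowest level (firm); standard errors clustered at the
* higher level (country). Firms must be nested within countries for this.
xtset firm year
xtreg outcome i.sin_industry x, re vce(cluster country)

* Number of clusters actually used for the VCE
display "clusters: " e(N_clust)
```

Swapping vce(cluster country) for vce(cluster firm) leaves the coefficient estimates unchanged and alters only the standard errors.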
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#27

18 Feb 2017, 19:51

Added to above:

How many countries are there in your data set? The cluster robust standard error is not valid for small numbers of clusters. There is some disagreement about just how many is enough to use it. But if you only have, say 5 or 10 countries, then you should not use -vce(cluster country)-. If you have dozens of them you are fine. In between is a grey zone.
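A quick way to answer the question of how many clusters there are, before committing to a cluster variable, is to count its distinct values. The sketch below builds a toy dataset with a made-up number of countries; in practice only the last three lines would be run on the real data, with country replaced by the actual cluster variable.

```stata
* Toy data: 8 hypothetical countries with 100 observations each
clear
set obs 8
generate country = _n
expand 100

* Tag one observation per country, then count the tags
egen byte first = tag(country)
count if first
display "number of country clusters: " r(N)
```

If the count lands in the single digits or low teens, the caution above about -vce(cluster country)- applies.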
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#28

18 Feb 2017, 21:35

Oh, and in #26, I mean -xtreg, re-, not -fe- in the code. Because, of course, you won't get an estimate of the industry level variable sin_industry if you use -fe-.
Krissy Philips

Join Date: Nov 2016

Posts: 63
#29

20 Feb 2017, 05:36

Originally posted by Clyde Schechter

Added to above:

How many countries are there in your data set? The cluster robust standard error is not valid for small numbers of clusters. There is some disagreement about just how many is enough to use it. But if you only have, say 5 or 10 countries, then you should not use -vce(cluster country)-. If you have dozens of them you are fine. In between is a grey zone.

Thanks for the very helpful advice, Clyde. Indeed, that is the estimation I have settled on. Re: clustering, robust standard errors are the same as those clustered by company, which is to be expected. However, as you state, I only have 8 countries, so I should stick to clustering by company (11000 companies!).
Krissy Philips

Join Date: Nov 2016

Posts: 63
#30

20 Feb 2017, 08:27

Forgot to mention: can I ask what the econometric rationale is for not clustering when there are only 5-10 clusters?