xthtaylor error &-xtreg, fe- doesn't work due to important time-invariant variable

Victoria Rogers

Join Date: Oct 2014

Posts: 138
#1


30 Oct 2014, 11:03

Based on a Hausman test, I have to choose a fixed-effects model instead of a random-effects model. However, I cannot use -xtreg, fe- because I have the time-invariant variable Male (and, in another regression, MBA; a solution for Male will probably also work for MBA). When I use -xtreg, re- I do not get a significant result for Male (nor for MBA).

Therefore, I decided to use -xthtaylor- because that seems to be the best solution. However, there's only 1 not so clear example on the Internet (I searched for about 30-45 minutes for another example).

Based on the example of https://kb.iu.edu/d/bcfo the variable -ed- is the endogenous time-invariant regressor but when I look at that data, -ed- seems to change over time.

In my case, Male (gender) (and MBA) do not change over time and my other independent variables do change over time (I assume they're endogenous time-varying variables based on the error below)

Code:

xthtaylor alpha MRP SMB HML MOM Male, endog(Male) constant(Male)
xthtaylor alpha MRP SMB HML MOM Male, endog(MRP SMB HML MOM) constant(Male)

"There are no time-varying exogeneous variables in the model.
If you have those variables specified, they may have been removed because of collinearity."

Hopefully someone can help me.

Kind regards,

Victoria

Last edited by Victoria Rogers; 30 Oct 2014, 11:15.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

30 Oct 2014, 11:23

Victoria wrote:

When I use -xtreg, re- I do not get a significant result for Male ...

It is not clear to me why Victoria is worried about not getting a (statistically?) significant result for the predictor Male under -xtreg, re-.
A long-quoted statistics note, Altman DG, Bland M. Absence of evidence is not evidence of absence. BMJ 1995; 311: 485, may provide some relief.
Besides, is the impossibility of getting an estimate for a time-invariant predictor such as Male under -xtreg, fe- a good theoretical reason for switching to -xtreg, re-?
Finally, I suspect that with -xthtaylor- things are going to be much more difficult to manage methodologically, unless Victoria has a strong background in econometrics.

Kind regards,
Carlo

(Stata 19.0)
daniel klein

Join Date: Mar 2014

Posts: 3860
#3

30 Oct 2014, 11:42

Based on the example of https://kb.iu.edu/d/bcfo the variable -ed- is the endogenous time-invariant regressor but when I look at that data, -ed- seems to change over time.

This claim is not true. Victoria should not "look" at the data, but let Stata do this for her.

Code:

webuse psidextract
bys id (t) : assert ed[_n] == ed[1]

indicates that indeed ed does not change over time within each individual.
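The same check can be extended to several variables at once. The following is a minimal sketch on the psidextract example data; the choice of fem, blk, and ed is only for illustration, and the explicit -xtset- call is added as a precaution in case the panel settings are not stored with the data.

```stata
* Sketch: verify that several variables are constant within each panel unit
webuse psidextract, clear
xtset id t

foreach v of varlist fem blk ed {
    * assert stops with an error if any observation differs from
    * the first observation of the same individual
    bysort id (t): assert `v' == `v'[1]
}

* Cross-check: for time-invariant variables the "within" standard
* deviation reported by xtsum should be zero
xtsum fem blk ed
```

If a variable did change over time, -assert- would exit with an error and report how many observations contradict the condition.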

There should also be no need to be searching the internet for examples, as following the link in

Code:

help xthtaylor

to the manual entry provides illustrating examples along with the formal statistical background.

Best
Daniel
Sebastian Kripfganz

Join Date: May 2014

Posts: 2595
#4

30 Oct 2014, 11:51

There are two issues here, one regarding your model specification and another one regarding Stata:

1. The two regression commands in your example are based on almost opposite assumptions. In the first case, all time-varying regressors (alpha MRP SMB HML MOM) are treated as exogenous, and your only time-invariant regressor (Male) is treated as endogenous. In the second case, all time-varying regressors (MRP SMB HML MOM) but alpha, are now treated as endogenous, while Male is now considered to be exogenous. Which specification do you actually want to estimate?

2. The unfortunate issue with the xthtaylor command is that it requires at least one variable in each of the four categories: endogenous time-varying, exogenous time-varying, endogenous time-invariant, and exogenous time-invariant. This is too restrictive because the Hausman-Taylor approach actually allows for cases where some of these subsets are empty, as long as there are enough exogenous time-varying regressors to instrument the endogenous time-invariant regressors. It would be nice if this could be fixed in a future Stata version!

It should be possible to circumvent this problem by using the xtivreg command and manually constructing the respective instruments, I suppose, because the Hausman-Taylor estimator is essentially an instrumental variables estimator with specific internal instruments. However, I quickly tried it but failed to get equivalent results with some test data. I would need to spend more time to figure out what is going on here.
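For reference, here is a sketch of how the four variable categories map onto the xthtaylor syntax, using the psidextract example data. The assignment of variables to categories follows the usual textbook treatment of this dataset and is an assumption for illustration, not a recommendation for any other data.

```stata
* Sketch: Hausman-Taylor with all four categories populated
webuse psidextract, clear
xtset id t

* exogenous  time-varying   : wks south smsa ms
* endogenous time-varying   : exp exp2 occ ind union
* exogenous  time-invariant : fem blk
* endogenous time-invariant : ed
xthtaylor lwage wks south smsa ms exp exp2 occ ind union fem blk ed, ///
    endog(exp exp2 occ ind union ed) constant(fem blk ed)
```

Every regressor not listed in endog() is treated as exogenous, and constant() names the regressors that do not vary within a panel unit. This is why a model whose only time-invariant regressor is placed in endog() leaves the exogenous time-invariant category empty.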

https://www.kripfganz.de/stata/
Victoria Rogers

Join Date: Oct 2014

Posts: 138
#5

30 Oct 2014, 11:52

Originally posted by Carlo Lazzaro View Post

Victoria wrote:

It is not clear to me why Victoria is worried about not getting a (statistically?) significant result for the predictor Male under -xtreg, re-.
A long-quoted statistics note, Altman DG, Bland M. Absence of evidence is not evidence of absence. BMJ 1995; 311: 485, may provide some relief.
Besides, is the impossibility of getting an estimate for a time-invariant predictor such as Male under -xtreg, fe- a good theoretical reason for switching to -xtreg, re-?
Finally, I suspect that with -xthtaylor- things are going to be much more difficult to manage methodologically, unless Victoria has a strong background in econometrics.

Kind regards,
Carlo

That's a problem because it's well known that almost all researchers find a significant effect when it comes to "gender <-> stock returns". Besides that, my boss expects a significant result.

Carlo, I don't have a background in econometrics, only finance. So one of my biggest problems is the difference between endogenous and exogenous variables, which isn't clear to me at all. (Unfortunately, I couldn't find a list of typical endogenous and exogenous variables, which would make it a lot easier for me to use -xthtaylor- if I add control variables.) By the way, can I use i.year as a time-varying exogenous variable in this case?

I just searched the Internet again for more than 30 minutes. So far my only option seems to be -xtreg, re-, even though the Hausman test recommends a fixed-effects model (I know that you cannot base the choice of random versus fixed effects 100% on the Hausman test).

Hopefully someone can help me.

Kind regards,

Victoria

EDIT: I added two commands to my first post because I'm not sure which one is correct in this case.

@Daniel, after re-checking the data I see that -ed- is indeed time-invariant, but my problem remains. As I said, I've already checked almost all publicly available information, including -help xthtaylor-, which gives exactly the same example as the website linked in my first post.

Last edited by Victoria Rogers; 30 Oct 2014, 12:23.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

30 Oct 2014, 12:18

Victoria:
- you can include i.year as a predictor if year is not the time identifier in -xtset-;
- it is true that you cannot base the choice of random- vs fixed-effects models 100% on the Hausman specification test. However, if -xtreg, fe- is the right specification and you go with -xtreg, re-, your estimates will be inconsistent;
- the fact that the male predictor is not significant may be due to different causes related to the sample you're analyzing or your model specification.
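The fixed- versus random-effects comparison discussed in this thread is typically run along the following lines. This is a minimal sketch on the psidextract example data; the regressors are chosen only for illustration, and the stored-estimate names fe and re are arbitrary.

```stata
* Sketch: Hausman specification test, fixed vs random effects
webuse psidextract, clear
xtset id t

* consistent-under-both estimator first (time-invariant fem cannot be estimated here)
xtreg lwage wks union ms, fe
estimates store fe

* efficient-under-the-null estimator second
xtreg lwage wks union ms fem, re
estimates store re

* compares the coefficients that appear in both models
hausman fe re
```

A rejection suggests that the unobserved effects are correlated with at least some regressors, which is the situation in which -xtreg, re- is inconsistent.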

Kind regards,
Carlo

(Stata 19.0)
Victoria Rogers

Join Date: Oct 2014

Posts: 138
#7

30 Oct 2014, 12:33

Originally posted by Sebastian Kripfganz View Post

There are two issues here, one regarding your model specification and another one regarding Stata:

1. The two regression commands in your example are based on almost opposite assumptions. In the first case, all time-varying regressors (alpha MRP SMB HML MOM) are treated as exogenous, and your only time-invariant regressor (Male) is treated as endogenous. In the second case, all time-varying regressors (MRP SMB HML MOM) but alpha, are now treated as endogenous, while Male is now considered to be exogenous. Which specification do you actually want to estimate?

2. The unfortunate issue with the xthtaylor command is that it requires at least one variable in each of the four categories: endogenous time-varying, exogenous time-varying, endogenous time-invariant, and exogenous time-invariant. This is too restrictive because the Hausman-Taylor approach actually allows for cases where some of these subsets are empty, as long as there are enough exogenous time-varying regressors to instrument the endogenous time-invariant regressors. It would be nice if this could be fixed in a future Stata version!

It should be possible to circumvent this problem by using the xtivreg command and manually constructing the respective instruments, I suppose, because the Hausman-Taylor estimator is essentially an instrumental variables estimator with specific internal instruments. However, I quickly tried it but failed to get equivalent results with some test data. I would need to spend more time to figure out what is going on here.

I'm not sure about how -xthtaylor- works, therefore I tried those 2 regression commands but both commands result in the error: "There are no time-invariant exogenous variables in the model.
If you have those variables specified, they may have been removed because of collinearity. r(198);" So, in the second case/command Male cannot be considered to be exogenous unless Stata sees Male as a time-varying exogenous variable in the 2nd case.

Thank you for the advice and the suggestion. However, if you cannot get it to work with -xtivreg-, then I'm 99% sure that I cannot either, because my Stata knowledge is limited to the basics, like this Hausman test.

So, more suggestions, to solve my problem, are welcome

EDIT: Thank you Carlo.
- I'm using a combined month-year variable (monthyear) as the time variable for -xtset-. Can I still use i.month and i.year in all my regressions, including with -xthtaylor-? By the way, are i.year and i.month endogenous or exogenous?
- My important time-invariant variable Male would be dropped if I use -xtreg, fe-, and -xthtaylor- doesn't seem to be possible in my case, so which fixed-effects model would you advise?
- You've got me there. That's indeed true and probably the case, because when I use a different method (with t-tests) I get a significant result. However, I still want to get the best possible result, and I have to control for omitted variable bias. A fixed-effects model would do that a lot better than a random-effects model (at least in my case, because I'm a beginner with a low R-squared of 0.1).

Last edited by Victoria Rogers; 30 Oct 2014, 13:08.
Victoria Rogers

Join Date: Oct 2014

Posts: 138
#8

30 Oct 2014, 14:25

Everyone may answer my few questions, of course.

I'm probably not the first and not the last person with this problem. So, it would also help other people in the future.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2595
#9

30 Oct 2014, 15:22

Victoria: If you want to allow all time-varying regressors to be correlated with the unobserved effects (the fixed-effects case), but are willing to assume that the time-invariant regressor is uncorrelated with the effects, you can do the following (assuming that the variable ID is your panel identifier):

Code:

global varying "alpha MRP SMB HML MOM"
global invariant "Male"

// generate time averages of all time-varying regressors
foreach variable of varlist $varying {
    by ID: egen `variable'_mean = mean(`variable')
}

// standard fixed-effects estimation ($invariant will be dropped)
xtreg $varying $invariant, fe

// random-effects estimation augmented by time-averages of time-varying regressors
xtreg $varying $invariant *_mean, re

You will observe that the coefficients of the time-varying regressors will be exactly the same in both regressions. That means, even though you estimate the second specification with the random-effects estimator, you will obtain the fixed-effects estimates for them. In addition, you will get a random-effects estimate for the time-invariant regressor.

This is essentially the Hausman-Taylor principle if all time-varying regressors are endogenous (with respect to the unobserved effects) and all time-invariant regressors are exogenous.
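The equivalence described here can be checked on a public dataset. Below is a minimal sketch using the balanced psidextract panel; the dependent variable lwage, the regressors, and the local macro names are illustrative assumptions. Note that only regressors are averaged: a variable list handed to -xtreg- is read with its first variable as the dependent variable, so the dependent variable is kept out of the list that gets panel means.

```stata
* Sketch: random effects augmented with panel means reproduces the
* fixed-effects coefficients of the time-varying regressors
webuse psidextract, clear
xtset id t

local varying   "wks union ms"
local invariant "fem"

* panel means of the time-varying regressors only
foreach v of local varying {
    bysort id (t): egen `v'_mean = mean(`v')
}

* standard fixed-effects estimation
xtreg lwage `varying', fe
estimates store fe

* random effects plus panel means; fem now gets a coefficient
xtreg lwage `varying' `invariant' *_mean, re
estimates store cre

* side-by-side comparison of the time-varying coefficients
estimates table fe cre, keep(`varying') b(%9.6f)

* joint test of the mean terms (a regression-based alternative
* to the Hausman test)
test wks_mean union_mean ms_mean
```

The exact match of the time-varying coefficients relies on the panel being balanced; in an unbalanced panel the two sets of estimates can differ slightly.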

https://www.kripfganz.de/stata/
Victoria Rogers

Join Date: Oct 2014

Posts: 138
#10

30 Oct 2014, 16:25

Thank you very much Sebastian!

Just to be extra sure: even though I'll get a random-effects estimate for the time-invariant regressor (my most important variable), does that still mean that I control for all fixed effects with your code? Then I could say in my report that I used a fixed-effects model instead of a random-effects model.

One extra question: I'm using a combined month-year variable (monthyear) as the time variable for -xtset-. Can I still use i.month and i.year in all my regressions? (By the way, are i.year and i.month endogenous or exogenous?)
Sebastian Kripfganz

Join Date: May 2014

Posts: 2595
#11

30 Oct 2014, 17:44

Thanks for asking about the time effects. I should have mentioned them before. You can simply add i.month and i.year to your regression but without their averages (even though they are time-varying). Their averages would be all the same and collinear with the regression constant.
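A sketch of how time dummies enter both regressions, again on the psidextract example data, where t plays the role of the time variable. The regressors are illustrative assumptions; the point is only that i.t is added as it is, with no mean terms for the dummies.

```stata
* Sketch: adding time effects without their panel means
webuse psidextract, clear
xtset id t

foreach v of varlist wks union ms {
    bysort id (t): egen `v'_mean = mean(`v')
}

* fixed effects with time dummies
xtreg lwage wks union ms i.t, fe

* random effects with panel means of the regressors and time dummies;
* no *_mean terms are created for the dummies themselves
xtreg lwage wks union ms fem wks_mean union_mean ms_mean i.t, re
```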

I would not just say that you estimated a fixed effects model because people would wonder how you obtained an estimate for the coefficient of the time-invariant regressor. This would lead to confusion. Rather say that you are using a mixed fixed-effects/random-effects model, where the effects are allowed to be correlated with the time-varying regressors but are uncorrelated with the time-invariant regressors.

https://www.kripfganz.de/stata/
Victoria Rogers

Join Date: Oct 2014

Posts: 138
#12

30 Oct 2014, 18:45

Thank you again for the great advice. I really appreciate it! Could you rephrase this part please: "where the effects are allowed to be correlated with the time-varying regressors but are uncorrelated with the time-invariant regressors." I think that I understand it but I probably cannot explain that to my boss. Is it correct to say that with your code I'm controlling for all fixed effects while still being able to get the coefficient and significance of the time-invariant variable called Male (by using a mixed fixed-effects/random-effects model)? And that I included i.month and i.year to also control a little bit for the random effect?

By the way, why are you talking about the averages? In my case monthyear is just 1990m7, 1990m8, etc., so there is nothing about averages.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2595
#13

30 Oct 2014, 21:24

In a strict econometric sense, the "fixed effects" are no longer fixed effects here. (I would maybe call them "individual-specific unobserved effects" instead.) But that is just terminology. I think you will be fine by explaining it to your boss the way you suggested.

The month-year dummy equals one in that particular month and zero otherwise. Its average is therefore 1/T (where T is the time dimension). But never mind, just do not worry about the averages here. With the month-year dummies you are controlling for time-fixed effects.
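The 1/T point can be verified directly. This is a small sketch on the balanced psidextract panel, assuming its time variable t runs over seven periods for every individual; the dummy name d3 is hypothetical.

```stata
* Sketch: the panel mean of a time dummy is the same for everyone
webuse psidextract, clear

* dummy for one particular period
generate byte d3 = (t == 3)

* its average within each individual
bysort id (t): egen d3_mean = mean(d3)

* no variation across individuals, so it is collinear with the constant
summarize d3_mean
assert abs(d3_mean - 1/7) < 1e-6
```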

Last edited by Sebastian Kripfganz; 30 Oct 2014, 21:33.

https://www.kripfganz.de/stata/
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#14

31 Oct 2014, 00:56

Victoria wrote:

...when I use a different method (with t-tests) I get a significant result...

This is not surprising, because -ttest- results are not adjusted for other covariates (as estimates from multiple regression models are).
I would stress this difference in the report to be provided to your boss.
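The difference between an unadjusted and an adjusted comparison can be illustrated as follows. This is a sketch on the psidextract example data; the covariates are chosen only for illustration, and clustering the standard errors on id is an assumption made here because the observations are repeated within individuals.

```stata
* Sketch: unadjusted group comparison vs covariate-adjusted estimate
webuse psidextract, clear

* raw difference in mean log wage between the two groups
ttest lwage, by(fem)

* the same contrast, holding education, experience, and weeks worked fixed
regress lwage i.fem ed exp wks, vce(cluster id)
```

The coefficient on fem in the regression answers a different question from the t-test, so the two need not agree in size or significance.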

Kind regards,
Carlo

(Stata 19.0)
daniel klein

Join Date: Mar 2014

Posts: 3860
#15

31 Oct 2014, 03:08

Sebastian's suggestion is basically what is called a correlated random-effects model, and Victoria can read more about these models in Schunck (2013).
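A closely related variant is the hybrid model, which splits each time-varying regressor into its panel mean and the deviation from that mean. The following is a minimal sketch on the psidextract example data; the variable names with the _mean and _dev suffixes are hypothetical, and the regressors are illustrative.

```stata
* Sketch: hybrid (within-between) specification
webuse psidextract, clear
xtset id t

foreach v of varlist wks union ms {
    bysort id (t): egen `v'_mean = mean(`v')   // between component
    generate `v'_dev = `v' - `v'_mean          // within component
}

* deviations carry the within effects, means carry the between effects
xtreg lwage wks_dev union_dev ms_dev wks_mean union_mean ms_mean fem, re
```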

Best
Daniel

Schunck, Reinhard (2013). Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. The Stata Journal, 13(1): pp. 65-76.
(http://www.stata-journal.com/article...article=st0283)
