Correct regress code very likely changed into a wrong xtreg code

Diante Fielding

Join Date: Jul 2014

Posts: 29
#1

31 Jul 2014, 16:58

Height predicting weight across 3 age groups

Code: regress weight age1 age2 height age1ht age2ht
1 output table: _cons (5.601677)

sort age
by age: regress weight height
3 output tables, one for every age group; the 3rd output table (age group 3): _cons (5.601677)

Source of the above codes: http://www.ats.ucla.edu/stat/stata/faq/compreg3.htm
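
For readers without the UCLA data, the equivalence described above can be sketched with Stata's bundled auto dataset. This is only an illustration under assumptions: foreign stands in for the age groups (two groups instead of three), and dom, domwt, b_pooled and b_group are hypothetical names created here, not part of the original example.

```stata
* Sketch: a fully interacted -regress- reproduces the by-group fit.
sysuse auto, clear

* Dummy for the non-reference group and its interaction with the slope variable
generate byte dom = (foreign == 0)
generate domwt = dom * weight

* Pooled model: _cons and the weight coefficient describe the omitted group (foreign == 1)
regress mpg dom weight domwt
scalar b_pooled = _b[_cons]

* Separate model for the omitted group only
regress mpg weight if foreign == 1
scalar b_group = _b[_cons]

* The two constants should agree up to floating-point error
display b_pooled - b_group
```

Because every coefficient is allowed to differ by group, the pooled model has exactly as many free parameters as the separate models combined, which is why the constants coincide with -regress-.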

I was able to replicate this method with my own data. However, when I switched to xtreg to account for fixed effects, using the following two commands, the _cons were no longer equal.

xtreg weight age1 age2 height age1ht age2ht, fe (1 output table)

sort age
by age: xtreg weight height, fe (3 output tables)

The _cons reported by these two commands are not equal.
Could you tell me why the _cons differ between the two -xtreg- commands?

Even without the -fe- option, the _cons from the two -xtreg- commands differ.

Last edited by Diante Fielding; 31 Jul 2014, 17:46.
Diante Fielding

Join Date: Jul 2014

Posts: 29
#2

31 Jul 2014, 18:38

To put it more simply:

Why do the two -xtreg- commands give different _cons?
I would like to know because I want to check the single output table from the pooled -xtreg- command against another command.

I have the feeling that there is an easy explanation for the difference, but I cannot find it online.
Richard Williams

Join Date: Apr 2014

Posts: 5008
#3

31 Jul 2014, 19:05

Wild guess (hopefully somebody who knows the real answer will chime in):

With an fe model, you are controlling for the effects of time-invariant variables with time-invariant effects. When you run separate models, those time-invariant effects can be different for each age group. But when you run a single model with interaction effects, the effects of the time-invariant omitted variables have to be the same for each age group.

So, with the non-xt data, it would be like you had compared

Code:

regress weight age1 age2 height age1ht age2ht gender
bysort age: regress weight height gender

In the first regression command, the effect of gender is constrained to be the same for each age group, even though the effects of other variables can differ by age. But in the 2nd command, the effect of gender can differ by age, so the two sets of models are not equivalent to each other.
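
The point about a non-interacted predictor can be sketched on the bundled auto dataset. The data choice is an assumption made purely for illustration: foreign plays the role of age group and length plays the role of gender; dom, domwt, b_pooled and b_group are hypothetical names.

```stata
* Sketch: one predictor left out of the interactions breaks the equivalence.
sysuse auto, clear
generate byte dom = (foreign == 0)
generate domwt = dom * weight

* length is NOT interacted with dom, so its effect is forced to be equal in both groups
regress mpg dom weight domwt length
scalar b_pooled = _b[_cons]

* Here the effect of length is estimated from foreign cars only
regress mpg weight length if foreign == 1
scalar b_group = _b[_cons]

* In general this difference is no longer zero
display b_pooled - b_group
```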

Again, just wild guess. I'd be curious to hear what others think.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Diante Fielding

Join Date: Jul 2014

Posts: 29
#4

31 Jul 2014, 19:41

Nice suggestion, Richard. I was thinking along the same lines. Your explanation could explain why the _cons of the two -xtreg, fe- commands don't match each other.

However, even without the -fe- option, the two -xtreg- commands give different _cons, whereas the two ordinary -regress- commands give the same _cons.
daniel klein

Join Date: Mar 2014

Posts: 3862
#5

01 Aug 2014, 02:02

Also waiting for the full answer, but thinking about it this way might add further insight.

The random intercept model fit by xtreg (with the default option re) can be viewed as a kind of 'optimal' weighted average of the within estimator (fe option) and the between estimator (be option).

The simple linear model for cross-sectional data estimated by OLS is conceptually equivalent to the between estimator. After all, in a cross-sectional dataset we only observe between (individual) variance, so we cannot possibly use something else in the estimation.

If, in the panel data case, the within variance were 0, then OLS and the between estimator would give the same answers.

Applying the between estimator to Diante's dataset (and therefore assuming the within variance to be 0), we will find the constant terms to be equivalent - just as in the OLS example.
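
One way to check this claim is a small sketch on the nlswork example data rather than Diante's data, which are not available here. It assumes that race is constant within idcode, so that the interaction of a group dummy with a regressor survives averaging over time; black, black_ten, b_pooled and b_group are hypothetical names.

```stata
* Sketch: between estimator, pooled with full interactions vs. a single group.
webuse nlswork, clear
xtset idcode

* Keep two groups only to keep the example short
drop if race == 3
generate byte black = (race == 2)
generate black_ten = black * tenure

* Pooled between regression: _cons describes the omitted group (black == 0)
xtreg ln_wage black tenure black_ten, be
scalar b_pooled = _b[_cons]

* Between regression for the omitted group only
xtreg ln_wage tenure if black == 0, be
scalar b_group = _b[_cons]

* If the reasoning above holds, this difference should be essentially zero
display b_pooled - b_group
```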

I believe Richard's guess also applies to the random coefficient model. We might not fully control for the unit effects there, but they stay part of the models we estimate, and in fact we decompose the estimated residual variance into a within (i.e. unit-specific) and between part.

Best
Daniel
daniel klein

Join Date: Mar 2014

Posts: 3862
#6

01 Aug 2014, 03:50

I have two more things to point out.

First, note that Richard's comment applies to any additional predictor in the model, not just the unit-specific ones. If the model contains predictors beyond those that are interacted, the two approaches will estimate different constant terms, because in the model with interaction terms every non-interacted predictor is constrained to have the same effect across groups.

Second, refreshing our memories by reading the manual entry on xt we can elaborate more on the answer. As mentioned, the random intercept model is a weighted average of the within and between estimates. It is estimated by running OLS on a transformed dataset. The transformation is

\[
x_{it}^{\ast} = x_{it} - \theta \bar{x}_i
\]

where \(x_{it}\) includes the constant, so the transformed constant is \(1 - \theta\). If \(\theta\) differs among groups, the transformed constant does too. Therefore, we should not expect the constants from the models to be the same.
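
To spell out why \(\theta\) can differ among groups, here is the standard expression for the balanced-panel case, as I recall it from the xtreg documentation. The symbols \(\sigma_u^2\) (variance of the unit effect), \(\sigma_e^2\) (variance of the idiosyncratic error) and \(T\) (number of periods per unit) are not defined in this thread and are introduced here only for illustration.

\[
\theta = 1 - \sqrt{\frac{\sigma_e^2}{T\sigma_u^2 + \sigma_e^2}}, \qquad 1^{\ast} = 1 - \theta \cdot 1 = 1 - \theta
\]

A pooled model estimates a single pair \((\sigma_u^2, \sigma_e^2)\) and hence a single \(\theta\), whereas separate models estimate one pair per group. The constant is therefore transformed differently in each case, even when all slopes are interacted.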

Best
Daniel

Last edited by daniel klein; 01 Aug 2014, 04:42.
Richard Williams

Join Date: Apr 2014

Posts: 5008
#7

01 Aug 2014, 06:12

Thanks Daniel. I am glad to see that my wild intuitive guess can apparently be backed up with a bit of math.

Part of what made me think of this -- there are other instances where a model (e.g. ologit) with, say, everything interacted with gender, does not give the same results as separate models for each gender.

Code:

webuse nhanes2f, clear
ologit health weight i.female i.female#c.weight, nolog
bysort female: ologit health weight, nolog

When you run separate models, the cutpoints can differ by gender. But, when you run a model with interactions, the cutpoints have to be the same for each gender.

I think xtreg is similar, but it is less obvious what across-group constraints are still being applied even when you allow for interactions.
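
One candidate for such a constraint is the variance components: a pooled random-effects model estimates a single sigma_u and sigma_e (and hence a single theta), while separate models estimate them per group. A rough sketch on the nlswork example data, using the theta option of -xtreg, re- to display the estimated theta. The dataset and the names black and black_ten are assumptions chosen for illustration, not Diante's setup.

```stata
* Sketch: compare theta from a pooled RE model with theta from by-group RE models.
webuse nlswork, clear
xtset idcode
drop if race == 3
generate byte black = (race == 2)
generate black_ten = black * tenure

* Pooled model: one set of variance components shared by both groups
xtreg ln_wage black tenure black_ten, re theta

* Separate models: variance components (and theta) estimated per group
xtreg ln_wage tenure if black == 0, re theta
xtreg ln_wage tenure if black == 1, re theta
```

Because this panel is unbalanced, Stata reports a distribution of theta rather than a single value. Comparing those summaries and the sigma_u and sigma_e lines across the three outputs shows whether the groups are being transformed differently.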

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
