  • Correct regress code very likely changed into a wrong xtreg code


    Height predicting weight across 3 age groups

    Code: regress weight age1 age2 height age1ht age2ht
    1 output table: _cons (5.601677)

    sort age
    by age: regress weight height
    3 output tables, one per age group; the 3rd output table (age3) gives: _cons (5.601677)

    Source of the above code: http://www.ats.ucla.edu/stat/stata/faq/compreg3.htm

    I was able to replicate this method with my own data. However, when I switched to xtreg to account for fixed effects, using the following 2 commands, the _cons estimates were no longer equal.

    xtreg weight age1 age2 height age1ht age2ht, fe (1 output table)

    sort age
    by age: xtreg weight height, fe (3 output tables)

    The _cons estimates from these 2 commands are not equal.
    Could you tell me why the _cons differs between the 2 -xtreg- commands?

    Even without the -fe- option, the _cons differs between the 2 -xtreg- commands.
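
    For anyone who wants to reproduce the comparison without my data, here is a minimal sketch. It assumes the nlswork example dataset as a stand-in, with race playing the role of the age group and tenure the role of height; none of these names come from my own data.

    ```stata
    * Sketch only: nlswork and its variables are stand-ins (an assumption),
    * not the data described in this post.
    webuse nlswork, clear
    xtset idcode year

    * One pooled model: the tenure slope may differ by race
    xtreg ln_wage i.race##c.tenure, fe

    * Separate models, one output table per race group
    bysort race: xtreg ln_wage tenure, fe
    ```

    The pooled _cons should be compared with the _cons from the by-group table for the base category (the lowest value of race under Stata's default factor-variable coding).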
    Last edited by Diante Fielding; 31 Jul 2014, 17:46.

  • #2
    To put the question more simply:

    Why do the 2 -xtreg- commands give different _cons estimates?
    I would like to know because I want to check the single output table from the pooled -xtreg- command against another command.

    I have a feeling that there is an easy explanation for the difference, but I cannot find it online.



    • #3
      Wild guess (hopefully somebody who knows the real answer will chime in):

      With an fe model, you are controlling for the effects of time-invariant variables with time-invariant effects. When you run separate models, those time-invariant effects can differ across age groups. But when you run a single model with interaction effects, the effects of the omitted time-invariant variables have to be the same for each age group.

      So, with the non-xt data, it would be like you had compared

      Code:
      regress weight age1 age2 height age1ht age2ht gender
      bysort age: regress weight height gender
      In the first regression command, the effect of gender is constrained to be the same for each age group, even though the effects of other variables can differ by age. But in the 2nd command, the effect of gender can differ by age, so the two sets of models are not equivalent to each other.
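
      A runnable sketch of the same point, assuming the auto example dataset as a stand-in (foreign takes the place of the age group, weight the place of height, and mpg the place of gender; these are illustrative choices, not the original poster's variables):

      ```stata
      * Sketch only: auto data and variable choices are assumptions for illustration.
      sysuse auto, clear

      * Every predictor interacted with the group: pooled and by-group
      * point estimates imply the same intercept and slope per group
      regress price i.foreign##c.weight
      bysort foreign: regress price weight

      * mpg enters without an interaction, so its effect is constrained
      * to be equal across groups in the pooled model only
      regress price i.foreign##c.weight mpg
      bysort foreign: regress price weight mpg
      ```

      In the first pair, _cons from the pooled model is the intercept for the base group (foreign==0); in the second pair, the pooled and by-group estimates need not agree.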

      Again, just a wild guess. I'd be curious to hear what others think.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      Stata Version: 17.0 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Nice suggestion, Richard. I was thinking along the same lines. Your explanation could account for why the 2 _cons estimates from the 2 -xtreg, fe- commands don't match.

        However, even without the -fe- option, the 2 -xtreg- commands give different _cons estimates, whereas the 2 plain -regress- commands give the same _cons.



        • #5
          Also waiting for the full answer, but thinking about it this way might add further insight.

          The random intercept model fit by xtreg (with the default option re) can be viewed as, in some sense, an 'optimal weighted average' of the within estimator (fe option) and the between estimator (be option).

          The simple linear model for cross-sectional data estimated by OLS is conceptually equivalent to the between estimator. After all, in a cross-sectional dataset we only observe between (individual) variance, so we cannot possibly use something else in the estimation.

          If, in the panel data case, the within variance were 0, then OLS and the between estimator would give the same answers.

          Applying the between estimator to Diante's dataset (and therefore assuming the within variance to be 0), we will find the constant terms to be equivalent - just as in the OLS example.
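
          One way to check this on public data, as a sketch that assumes the nlswork example dataset as a stand-in for Diante's data (race in place of the age group, tenure in place of height):

          ```stata
          * Sketch only: nlswork and its variables are assumptions, not Diante's data.
          webuse nlswork, clear
          xtset idcode year

          * Between estimator, pooled with interactions
          xtreg ln_wage i.race##c.tenure, be

          * Between estimator, one model per race group
          bysort race: xtreg ln_wage tenure, be
          ```

          The pooled _cons corresponds to the base category of race, so it is the by-group table for that category that it should be compared with.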

          I believe Richard's guess also applies to the random intercept model. We might not fully control for the unit effects there, but they remain part of the models we estimate, and in fact we decompose the estimated residual variance into a within (i.e. unit-specific) part and a between part.

          Best
          Daniel



          • #6
            I have two more things to point out.

            First, note that Richard's comment applies to any additional predictor in the model, not just the unit-specific ones. If the model contains predictors beyond those that are interacted, the two approaches will estimate different constant terms, since every non-interacted predictor is constrained to have the same effect across groups in the model that includes the interaction terms.

            Second, refreshing our memories by reading the manual entry on xt, we can elaborate on the answer. As mentioned, the random intercept model is a weighted average of the within and between estimates. It is estimated by running OLS on a transformed dataset. The transformation is

            \[
            x_{it}^\ast = x_{it} - \theta \bar{x}_i
            \]


            where \(x_{it}\) includes the constant. If \(\theta\) differs among groups, the transformed constant also does. Therefore, we should not expect the constants from the models to be the same.
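
            To make the last step explicit: the constant is a column of ones whose unit mean is also 1, so after the transformation it becomes \(1 - \theta\). The estimated \(\theta\) can be displayed with the theta option of xtreg, re. A sketch, assuming the nlswork example dataset as a stand-in for Diante's data:

            ```stata
            * Sketch only: nlswork and its variables are assumptions, not Diante's data.
            webuse nlswork, clear
            xtset idcode year

            * Pooled random intercept model, reporting theta
            xtreg ln_wage i.race##c.tenure, re theta

            * Separate models per group, each with its own theta
            bysort race: xtreg ln_wage tenure, re theta
            ```

            With unbalanced panels such as this one, \(\theta\) varies across units and the output shows a summary of its distribution rather than a single value.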

            Best
            Daniel
            Last edited by daniel klein; 01 Aug 2014, 04:42.



            • #7
              Thanks Daniel. I am glad to see that my wild intuitive guess can apparently be backed up with a bit of math.

              Part of what made me think of this -- there are other instances where a model (e.g. ologit) with, say, everything interacted with gender, does not give the same results as separate models for each gender.

              Code:
              webuse nhanes2f, clear
              ologit health weight i.female i.female#c.weight, nolog
              bysort female: ologit health weight, nolog
              When you run separate models, the cutpoints can differ by gender. But, when you run a model with interactions, the cutpoints have to be the same for each gender.

              I think xtreg is similar, but it is less obvious what across-group constraints are still being applied even when you allow for interactions.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              Stata Version: 17.0 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://www3.nd.edu/~rwilliam
