  • What is the meaning of the constant using "reg y i.id i.jd" and "xtreg y i.jd ,fe"?

    Hello everyone,

    I would like to understand what the constant means in -xtreg y i.jd, fe- and in -reg y i.id i.jd-.

    With a single grouping variable (id, with levels 1-3):
    Using reg y i.id, the constant 3.5 seems to be the average of y for id==1.
    [Screenshot: output of -reg y i.id-]

    Using xtreg y, fe, the constant 3.433333 seems to be the grand average of y
    [Screenshot: output of -xtreg y, fe-]

    With two grouping variables (id with levels 1-3, jd with levels 1-5):
    What is the meaning of the constant 3.566667 using reg y i.id i.jd?
    [Screenshot: output of -reg y i.id i.jd-]

    What is the meaning of the constant 3.5 using xtreg y i.jd, fe?
    [Screenshot: output of -xtreg y i.jd, fe-]

    Below is the average of y in different combinations of id and jd. Thank you.
    [Screenshot: table of the average of y for each combination of id and jd]

    Last edited by Ruby Yao; 10 Apr 2023, 01:18.

  • #2
    Ruby:
    the short answer is that -_cons- in -regress- and -_cons- in -xtreg,fe- are totally different beasts.
    See: https://stats.oarc.ucla.edu/stata/fa...d-by-xtreg-fe/.
    In addition, skimming through your post, the reason why screenshots are deprecated by the FAQ becomes apparent.
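    For readers who do not follow the link, here is a rough sketch of the algebra, as far as I understand the documented behaviour of -xtreg, fe-. The notation is introduced here for illustration and is not taken from the thread; group 1 is assumed to be the base level in -regress-.

    ```latex
    % Model fitted by -xtreg, fe-; the u_i are normalized to average zero
    % over the observations in the estimation sample
    \[
      y_{it} = \alpha + x_{it}\beta + u_i + e_{it},
      \qquad \sum_{i}\sum_{t} \hat u_i = 0
      \;\Longrightarrow\;
      \hat\alpha_{\text{xtreg}} = \bar{y} - \bar{x}\,\hat\beta .
    \]
    % -regress y i.id x- instead sets the base group's effect to zero,
    % so its constant is the intercept of the base group (group 1 here)
    \[
      \hat\alpha_{\text{regress}} = \hat\alpha_{\text{xtreg}} + \hat u_{1} .
    \]
    ```

    Under this reading the slopes agree across the two commands, and only the normalization of the group effects (and hence the constant) differs.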
    Last edited by Carlo Lazzaro; 10 Apr 2023, 01:30.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thank you very much for your answer and your reference. Sorry for the screenshots. I did not check the FAQ carefully and will revise them.

      I have read the page you recommended. I am still slightly confused about the -_cons- with two grouping variables, e.g. -reg y i.id i.jd- and -xtreg y i.jd, fe-. Is there any way to interpret the -_cons- in that case?



        • #5
          Ruby:
          1) the -regress- constant is the fitted value for the base levels of both factors (id==1 and jd==1). You can see this by calculating -predict- by hand.
          The following toy example shows that fitted + u_i from -xtreg,fe- equals the fitted value from -regress-.
          Again, the _cons from -xtreg,fe- has no relevance of its own.
          Code:
          . use "https://www.stata-press.com/data/r17/nlswork.dta"
          (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
          
          . reg ln_wage i.idcode i.year if idcode<=3
          
                Source |       SS           df       MS      Number of obs   =        39
          -------------+----------------------------------   F(16, 22)       =      2.84
                 Model |  3.48635949        16  .217897468   Prob > F        =    0.0122
              Residual |  1.68937946        22  .076789976   R-squared       =    0.6736
          -------------+----------------------------------   Adj R-squared   =    0.4362
                 Total |  5.17573896        38  .136203657   Root MSE        =    .27711
          
          ------------------------------------------------------------------------------
               ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                idcode |
                    2  |  -.3898423   .1155629    -3.37   0.003     -.629505   -.1501795
                    3  |  -.4648596   .1120424    -4.15   0.000    -.6972212   -.2324979
                       |
                  year |
                   69  |    .208967   .3918928     0.53   0.599    -.6037689    1.021703
                   70  |  -.2747772   .3439816    -0.80   0.433    -.9881514    .4385969
                   71  |  -.3613911    .326316    -1.11   0.280    -1.038129    .3153467
                   72  |  -.2056973    .326316    -0.63   0.535    -.8824352    .4710406
                   73  |  -.0310461    .326316    -0.10   0.925     -.707784    .6456917
                   75  |   .0416271    .326316     0.13   0.900    -.6351107     .718365
                   77  |   .0358937    .326316     0.11   0.913    -.6408441    .7126316
                   78  |   .2433199    .326316     0.75   0.464    -.4334179    .9200578
                   80  |   .2726139    .326316     0.84   0.412    -.4041239    .9493518
                   82  |   .1747839   .3439816     0.51   0.616    -.5385903    .8881581
                   83  |   .2924489    .326316     0.90   0.380    -.3842889    .9691868
                   85  |   .3712589    .326316     1.14   0.267     -.305479    1.047997
                   87  |   .2960361    .326316     0.91   0.374    -.3807017     .972774
                   88  |   .3038639    .326316     0.93   0.362    -.3728739    .9806018
                       |
                 _cons |   1.958421   .2989038     6.55   0.000     1.338532    2.578309
          ------------------------------------------------------------------------------
          
          
          . predict fitted_reg, xb
          
          . list fitted_reg if idcode==3 & year==68
          
                 +----------+
                 | fitted~g |
                 |----------|
              1. | 1.493561 |
                 +----------+
          
          . di 1.958421-.4648596
          1.4935614
          
          
          . xtreg ln_wage i.year if idcode<=3, fe
          
          Fixed-effects (within) regression               Number of obs     =         39
          Group variable: idcode                          Number of groups  =          3
          
          R-squared:                                      Obs per group:
               Within  = 0.5446                                         min =         12
               Between = 0.2670                                         avg =       13.0
               Overall = 0.3678                                         max =         15
          
                                                          F(14,22)          =       1.88
          corr(u_i, Xb) = -0.0356                         Prob > F          =     0.0897
          
          ------------------------------------------------------------------------------
               ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                  year |
                   69  |    .208967   .3918928     0.53   0.599    -.6037689    1.021703
                   70  |  -.2747772   .3439816    -0.80   0.433    -.9881514    .4385969
                   71  |  -.3613911    .326316    -1.11   0.280    -1.038129    .3153467
                   72  |  -.2056973    .326316    -0.63   0.535    -.8824352    .4710406
                   73  |  -.0310461    .326316    -0.10   0.925     -.707784    .6456917
                   75  |   .0416271    .326316     0.13   0.900    -.6351107     .718365
                   77  |   .0358937    .326316     0.11   0.913    -.6408441    .7126316
                   78  |   .2433199    .326316     0.75   0.464    -.4334179    .9200578
                   80  |   .2726139    .326316     0.84   0.412    -.4041239    .9493518
                   82  |   .1747839   .3439816     0.51   0.616    -.5385903    .8881581
                   83  |   .2924489    .326316     0.90   0.380    -.3842889    .9691868
                   85  |   .3712589    .326316     1.14   0.267     -.305479    1.047997
                   87  |   .2960361    .326316     0.91   0.374    -.3807017     .972774
                   88  |   .3038639    .326316     0.93   0.362    -.3728739    .9806018
                       |
                 _cons |   1.659677   .2833366     5.86   0.000     1.072073    2.247281
          -------------+----------------------------------------------------------------
               sigma_u |  .24956596
               sigma_e |  .27711004
                   rho |  .44784468   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(2, 22) = 9.64                       Prob > F = 0.0010
          
          . predict fitted_xtreg, xb
          
          . predict fe_xtreg, u
          
          . list fitted_reg fitted_xtreg fe_xtreg if idcode==3 & year==68
          
                 +--------------------------------+
                 | fit~_reg   fit~treg   fe_xtreg |
                 |--------------------------------|
             25. | 1.493561   1.659677   -.166116 |
                 +--------------------------------+
          
          . di 1.659677 -.166116
          1.493561
          
          .
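          The same identity can be checked for every observation in the estimation sample rather than for a single row. This is only a minimal sketch: it assumes the session above is still in memory (so fitted_reg, fitted_xtreg and fe_xtreg exist) and that -xtreg- is the most recent estimation command, so that e(sample) refers to it. The variable name sum_xtreg is introduced here purely for illustration.
          Code:
          ```stata
          * fitted + u_i from -xtreg, fe- should reproduce the -regress- fitted values
          generate double sum_xtreg = fitted_xtreg + fe_xtreg if e(sample)
          assert reldif(sum_xtreg, fitted_reg) < 1e-6 if e(sample)

          * the u_i average to zero over the estimation sample; this normalization
          * is what makes the -xtreg, fe- constant differ from the -regress- one
          summarize fe_xtreg if e(sample)
          ```
          The tolerance is loose on purpose, because -predict- stores its results as floats by default.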
          Last edited by Carlo Lazzaro; 10 Apr 2023, 08:52.
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Dear Carlo,

            Thanks a lot for your help.

            Does it mean that the _cons from -reg y i.id i.jd- and -xtreg y i.jd , fe- both refer to the average fitted value of y when id==1 and jd==1?

            And why is the _cons the average fitted value of y when id==1 and jd==1 rather than the average true value of y when id==1 and jd==1?

            There is no observation with idcode==1 & year==68 in the nlswork.dta example above, so I post my own data here to illustrate my question.

            The dataset is as below.
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(id jd y)
            1 1 3
            1 1 4
            1 2 4
            1 2 3
            1 3 6
            1 3 5
            1 4 3
            1 4 2
            1 5 2
            1 5 3
            2 1 5
            2 1 4
            2 2 3
            2 2 4
            2 3 2
            2 3 3
            2 4 5
            2 4 3
            2 5 4
            2 5 3
            3 1 2
            3 1 3
            3 2 4
            3 2 3
            3 3 3
            3 3 4
            3 4 2
            3 4 6
            3 5 3
            3 5 2
            end
            The average true value of y when id==1 and jd==1 is 3.5.
            Code:
            sum y if id==1&jd==1
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
                       y |          2         3.5    .7071068          3          4

            The average fitted value of y when id==1 and jd==1 from -reg y i.id i.jd- is 3.566667, different from the average true value. Why is the _cons from reg the average fitted value of y when id==1 and jd==1 rather than the average true value of y when id==1 and jd==1?
            Code:
            reg y i.id i.jd
            
                  Source |       SS           df       MS      Number of obs   =        30
            -------------+----------------------------------   F(6, 23)        =      0.47
                   Model |  4.06666667         6  .677777778   Prob > F        =    0.8247
                Residual |        33.3        23  1.44782609   R-squared       =    0.1088
            -------------+----------------------------------   Adj R-squared   =   -0.1236
                   Total |  37.3666667        29  1.28850575   Root MSE        =    1.2033
            
            ------------------------------------------------------------------------------
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                      id |
                      2  |         .1   .5381126     0.19   0.854    -1.013171    1.213171
                      3  |        -.3   .5381126    -0.56   0.583    -1.413171    .8131708
                         |
                      jd |
                      2  |   9.45e-16   .6947004     0.00   1.000    -1.437097    1.437097
                      3  |   .3333333   .6947004     0.48   0.636    -1.103764    1.770431
                      4  |   9.44e-16   .6947004     0.00   1.000    -1.437097    1.437097
                      5  |  -.6666667   .6947004    -0.96   0.347    -2.103764    .7704307
                         |
                   _cons |   3.566667   .5812281     6.14   0.000     2.364305    4.769029
            ------------------------------------------------------------------------------
            
            predict fitted_reg,xb
            
            sum fitted_reg if id==1&jd==1
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
              fitted_reg |          2    3.566667           0   3.566667   3.566667
            The average fitted value of y when id==1 and jd==1 from -xtreg y i.jd, fe- is 3.5, the same as the average true value. Does it mean that the _cons from xtreg is not only the average fitted value but also the average true value of y when id==1 and jd==1?
            Code:
            xtset id
                   panel variable:  id (balanced)
            xtreg y i.jd,fe
            
            Fixed-effects (within) regression               Number of obs     =         30
            Group variable: id                              Number of groups  =          3
            
            R-sq:                                           Obs per group:
                 within  = 0.0877                                         min =         10
                 between = 0.0000                                         avg =       10.0
                 overall = 0.0856                                         max =         10
            
                                                            F(4,23)           =       0.55
            corr(u_i, Xb)  = 0.0000                         Prob > F          =     0.6991
            
            ------------------------------------------------------------------------------
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                      jd |
                      2  |  -4.28e-16   .6947004    -0.00   1.000    -1.437097    1.437097
                      3  |   .3333333   .6947004     0.48   0.636    -1.103764    1.770431
                      4  |  -4.44e-16   .6947004    -0.00   1.000    -1.437097    1.437097
                      5  |  -.6666667   .6947004    -0.96   0.347    -2.103764    .7704307
                         |
                   _cons |        3.5   .4912274     7.13   0.000     2.483819    4.516181
            -------------+----------------------------------------------------------------
                 sigma_u |   .2081666
                 sigma_e |  1.2032565
                     rho |  .02906016   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            F test that all u_i=0: F(2, 23) = 0.30                       Prob > F = 0.7442
            
            predict fitted_xtreg,xb
            
            sum fitted_xtreg if id==1&jd==1
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
            fitted_xtreg |          2         3.5           0        3.5        3.5

            Thank you.



            • #7
              Ruby:
              you seem to be overlooking the role of the residuals:
              Code:
              . reg y i.id i.jd
              
                    Source |       SS           df       MS      Number of obs   =        30
              -------------+----------------------------------   F(6, 23)        =      0.47
                     Model |  4.06666667         6  .677777778   Prob > F        =    0.8247
                  Residual |        33.3        23  1.44782609   R-squared       =    0.1088
              -------------+----------------------------------   Adj R-squared   =   -0.1236
                     Total |  37.3666667        29  1.28850575   Root MSE        =    1.2033
              
              ------------------------------------------------------------------------------
                         y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                        id |
                        2  |         .1   .5381126     0.19   0.854    -1.013171    1.213171
                        3  |        -.3   .5381126    -0.56   0.583    -1.413171    .8131708
                           |
                        jd |
                        2  |   9.45e-16   .6947004     0.00   1.000    -1.437097    1.437097
                        3  |   .3333333   .6947004     0.48   0.636    -1.103764    1.770431
                        4  |   9.44e-16   .6947004     0.00   1.000    -1.437097    1.437097
                        5  |  -.6666667   .6947004    -0.96   0.347    -2.103764    .7704307
                           |
                     _cons |   3.566667   .5812281     6.14   0.000     2.364305    4.769029
              ------------------------------------------------------------------------------
              
              . predict fitted, xb
              
              . predict residual, res
              
              . list id jd y fitted residual if id==1&jd==1
              
                   +------------------------------------+
                   | id   jd   y     fitted    residual |
                   |------------------------------------|
                1. |  1    1   3   3.566667   -.5666667 |
                2. |  1    1   4   3.566667    .4333333 |
                   +------------------------------------+
              
              . xtset id
              
              Panel variable: id (balanced)
              
              . xtreg y i.jd,fe
              
              Fixed-effects (within) regression               Number of obs     =         30
              Group variable: id                              Number of groups  =          3
              
              R-squared:                                      Obs per group:
                   Within  = 0.0877                                         min =         10
                   Between = 0.0000                                         avg =       10.0
                   Overall = 0.0856                                         max =         10
              
                                                              F(4,23)           =       0.55
              corr(u_i, Xb) = 0.0000                          Prob > F          =     0.6991
              
              ------------------------------------------------------------------------------
                         y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                        jd |
                        2  |  -4.28e-16   .6947004    -0.00   1.000    -1.437097    1.437097
                        3  |   .3333333   .6947004     0.48   0.636    -1.103764    1.770431
                        4  |  -4.44e-16   .6947004    -0.00   1.000    -1.437097    1.437097
                        5  |  -.6666667   .6947004    -0.96   0.347    -2.103764    .7704307
                           |
                     _cons |        3.5   .4912274     7.13   0.000     2.483819    4.516181
              -------------+----------------------------------------------------------------
                   sigma_u |   .2081666
                   sigma_e |  1.2032565
                       rho |  .02906016   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              F test that all u_i=0: F(2, 23) = 0.30                       Prob > F = 0.7442
              
              . predict fitted_xtreg, xb
              
              . predict u_error_xtreg, u
              
              . predict e_error_xtreg, e
              
              
              . list id jd y fitted_xtreg u_error_xtreg e_error_xtreg if id==1&jd==1
              
                   +-----------------------------------------------+
                   | id   jd   y   fitted~g   u_erro~g   e_error~g |
                   |-----------------------------------------------|
                1. |  1    1   3        3.5   .0666667   -.5666667 |
                2. |  1    1   4        3.5   .0666667    .4333333 |
                   +-----------------------------------------------+
              
              .
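              If you take a look at the observed, fitted and residual values, the results of -regress- with -i.id- among the predictors and those of -xtreg,fe- overlap after a bit of algebra (as expected): fitted_xtreg plus u_error_xtreg (3.5 + .0666667) equals the -regress- fitted value 3.566667, and the residuals are identical in the two listings.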
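              To make the algebra concrete, here is a minimal sketch on the data posted in #6. It relies on two facts that hold for this balanced, additive example and not necessarily in general: the -regress- constant equals the id==1 mean plus the jd==1 mean minus the grand mean, and the -xtreg, fe- constant equals the -regress- constant plus the simple average of the three id effects (the base effect being zero). The scalar names are introduced here only for illustration.
              Code:
              ```stata
              clear
              input float(id jd y)
              1 1 3
              1 1 4
              1 2 4
              1 2 3
              1 3 6
              1 3 5
              1 4 3
              1 4 2
              1 5 2
              1 5 3
              2 1 5
              2 1 4
              2 2 3
              2 2 4
              2 3 2
              2 3 3
              2 4 5
              2 4 3
              2 5 4
              2 5 3
              3 1 2
              3 1 3
              3 2 4
              3 2 3
              3 3 3
              3 3 4
              3 4 2
              3 4 6
              3 5 3
              3 5 2
              end

              * additive model: _cons is the fitted value for id==1 & jd==1
              regress y i.id i.jd
              scalar b0_reg = _b[_cons]
              scalar d2 = _b[2.id]
              scalar d3 = _b[3.id]

              * balanced design: id==1 mean + jd==1 mean - grand mean
              summarize y if id == 1, meanonly
              scalar m_id1 = r(mean)
              summarize y if jd == 1, meanonly
              scalar m_jd1 = r(mean)
              summarize y, meanonly
              scalar m_all = r(mean)
              scalar by_hand = m_id1 + m_jd1 - m_all
              assert reldif(b0_reg, by_hand) < 1e-6

              * xtreg, fe: same slopes, id effects re-centred to average zero
              xtset id
              xtreg y i.jd, fe
              assert reldif(_b[_cons], b0_reg + (0 + d2 + d3)/3) < 1e-6

              * only a model with the interaction reproduces the observed cell mean
              regress y i.id##i.jd
              summarize y if id == 1 & jd == 1, meanonly
              assert reldif(_b[_cons], r(mean)) < 1e-6
              ```
              The last step is why the additive -regress- constant (3.566667) differs from the observed mean of the id==1 & jd==1 cell (3.5): without the id-by-jd interaction, the model is not required to fit each cell mean exactly.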
              Kind regards,
              Carlo
              (Stata 19.0)
