Variance of Fixed effect residuals

Federico Nutarelli

Join Date: Sep 2018
Posts: 430

29 Oct 2019, 15:54

Hi all, I am running the following fixed-effects regression on an unbalanced panel:

Code:

quietly xtreg tasso_crescita_sales_prod L.log_sales L.dummy_2 L2.dummy_2 L3.dummy_2 mean_gr_rate_atc2 recalls_sales ageprodcat1 ageprodcat2 ageprodcat3 ageprodcat4 newmolfirm newmolmarket i.Year, fe vce(cluster idpr)

I would like to compute the variance of all the residuals and store it in a variable.

Code:

e(sigma_e) // or e(sigma_u)

do not seem to do what I want. Specifically, I constructed a variable holding the residuals from the estimation (variable av_it):

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(idproduct Year av_it)
  8 2004          .
  8 2015          .
 22 2004          .
 22 2005          .
 22 2006          .
 22 2007          0
 22 2008          .
 22 2009          .
 22 2010          .
 22 2011          .
 41 2004          .
 41 2005          .
 41 2006          .
 41 2007    .303154
 41 2008 -.14690971
 41 2009 .006593704
 41 2010   .2429447
 41 2011  -.1774826
 41 2012  -.8590889
 41 2013  .12342644
 41 2014  .50736046
 41 2015          .
 44 2005          .
 44 2006          .
 44 2007          .
 44 2008  -.9983988
 44 2009  .12073898
 44 2010  -.5993109
 44 2011          .
 44 2012 -1.4308786
 44 2013    1.45887
 44 2014  1.4698343
 44 2015 -.02085495
 62 2004          .
 62 2005          .
 62 2006          .
 62 2007          .
 62 2008          .
 62 2009          .
 62 2010          .
 62 2011          .
 62 2012          .
 62 2013          .
 62 2014          .
 62 2015          0
 99 2004          .
 99 2005          .
 99 2006          .
 99 2007          .
 99 2008          .
 99 2009          .
 99 2010          .
107 2004          .
107 2005          .
107 2006          .
107 2007          .
107 2008          .
107 2009   2.132491
107 2010          .
107 2011          .
107 2012          .
107 2013          .
107 2014          .
107 2015  -2.132491
108 2004          .
108 2005          .
108 2006          .
108 2007          .
108 2008          .
108 2009          .
108 2010          .
108 2011          .
108 2012          .
108 2013          .
108 2014          .
108 2015          .
114 2004          .
114 2005          .
114 2006          .
114 2007  -.7792645
114 2008  .26324368
114 2009   .1756668
114 2010   .4246168
114 2011  -.0842619
114 2012          .
114 2013          .
114 2014          .
114 2015          .
130 2004          .
130 2005          .
130 2006          .
130 2007   .6940765
130 2008  -.6289883
130 2009   .6858149
130 2010          .
130 2011  -.4151192
130 2012   -.335784
130 2013          .
130 2014          .
130 2015          .
end

What I would like to do is compute Var(av_it) and then average it over time.
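One possible reading of this request, sketched under assumptions: the nlswork example data stand in for the panel above, "Var(av_it)" is taken to be the cross-sectional variance of the residuals within each year, and the names sd_t, var_t, one_per_year, var_all and mean_var_t are hypothetical.

```stata
* Sketch only: nlswork stands in for the poster's data
webuse nlswork, clear
xtset idcode year
quietly xtreg ln_wage union c.age, fe

* e_it: idiosyncratic residuals, restricted to the estimation sample
predict double av_it if e(sample), e

* Overall sample variance of the residuals, kept as a scalar
quietly summarize av_it
scalar var_all = r(Var)

* Cross-sectional variance of the residuals within each year
bysort year: egen double sd_t = sd(av_it)
generate double var_t = sd_t^2

* Average of the yearly variances, counting each year once
egen byte one_per_year = tag(year)
quietly summarize var_t if one_per_year
scalar mean_var_t = r(mean)

scalar list var_all mean_var_t
```

If the variance is instead meant per product, the same pattern applies with the panel identifier in place of year.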

Last edited by Federico Nutarelli; 29 Oct 2019, 16:39.

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

29 Oct 2019, 16:11

You may use - scalar - to save both values, then use them wherever you wish. I fail to understand the reason for saving a variable with just one (repeated) value in the whole dataset.

I don't have Stata at hand right now, hence this is untested, but I gather - gen myvar = e(sigma_u) - would do the trick as well.
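A minimal sketch of the scalar route, assuming the nlswork example data rather than the original poster's panel; the scalar names sd_e and var_e are hypothetical:

```stata
webuse nlswork, clear
xtset idcode
quietly xtreg ln_wage union c.age, fe

* e() results are overwritten by the next estimation command,
* so copy what is needed into scalars right away
scalar sd_e  = e(sigma_e)
scalar var_e = e(sigma_e)^2

display "sd(e_it) = " sd_e "   var(e_it) = " var_e
```

Scalars hold a single number each, which avoids filling a whole column of the dataset with one repeated value.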

Best regards,

Marcos
Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#3

29 Oct 2019, 16:15

Thank you, Marcos, for the reply.

"I fail to understand the reason for saving a variable with just one (repeated) value in the whole dataset."

Indeed, I would like to store a value for each one of the residuals. That is what I am struggling to do.

Marcos Almeida

Join Date: Apr 2014
Posts: 4047
#4

30 Oct 2019, 05:19

I do understand "what" you want to get. But I do not understand "why" you wish to create a variable with all observations having the same value. That is the reason I recommended using - scalar - instead.

That said, keep in mind that, if you need the variance, you just have to square the SDs.

Last but not least, should you wish to focus on the variance of the residuals, please keep in mind that it is a sort of "nuisance" parameter under xtreg, whereas it takes center stage in hierarchical "mixed" models as well as structural equation models.

Please take a look at the example below, where I show how to create both variables, how to use scalars, and the estimates under both xtreg and mixed.

Hopefully that helps.

Code:

 

. webuse nlswork
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtset idcode
       panel variable:  idcode (unbalanced)

. xtreg ln_wage union c.age

Random-effects GLS regression                   Number of obs     =     19,229
Group variable: idcode                          Number of groups  =      4,150

R-sq:                                           Obs per group:
     within  = 0.0955                                         min =          1
     between = 0.0504                                         avg =        4.6
     overall = 0.0610                                         max =         12

                                                Wald chi2(2)      =    1808.17
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       union |   .1305272   .0066639    19.59   0.000     .1174661    .1435883
         age |   .0147793    .000396    37.32   0.000     .0140031    .0155556
       _cons |    1.22913   .0139745    87.96   0.000      1.20174    1.256519
-------------+----------------------------------------------------------------
     sigma_u |  .38505558
     sigma_e |  .26213464
         rho |  .68331727   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. gen sig_u = e(sigma_u)

. gen sig_e = e(sigma_e)

. gen sig2_u = sig_u^2

. gen sig2_e = sig_e^2

. list sig_u sig_e sig2_u sig2_e in 1

     +-------------------------------------------+
     |    sig_u      sig_e     sig2_u     sig2_e |
     |-------------------------------------------|
  1. | .3850556   .2621346   .1482678   .0687146 |
     +-------------------------------------------+



. mixed ln_wage union c.age  || idcode:,

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -6157.0931  
Iteration 1:   log likelihood = -6157.0931  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =     19,229
Group variable: idcode                          Number of groups  =      4,150

                                                Obs per group:
                                                              min =          1
                                                              avg =        4.6
                                                              max =         12

                                                Wald chi2(2)      =    1806.54
Log likelihood = -6157.0931                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       union |   .1307709   .0066669    19.61   0.000     .1177039    .1438379
         age |   .0147737   .0003963    37.28   0.000      .013997    .0155504
       _cons |   1.229306   .0139682    88.01   0.000     1.201929    1.256683
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
idcode: Identity             |
                  var(_cons) |   .1471143   .0037929      .1398649    .1547393
-----------------------------+------------------------------------------------
               var(Residual) |   .0690614   .0007978      .0675153     .070643
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 11681.89      Prob >= chibar2 = 0.0000

. mixed ln_wage union c.age  || idcode:, stddev

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -6157.0931  
Iteration 1:   log likelihood = -6157.0931  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =     19,229
Group variable: idcode                          Number of groups  =      4,150

                                                Obs per group:
                                                              min =          1
                                                              avg =        4.6
                                                              max =         12

                                                Wald chi2(2)      =    1806.54
Log likelihood = -6157.0931                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       union |   .1307709   .0066669    19.61   0.000     .1177039    .1438379
         age |   .0147737   .0003963    37.28   0.000      .013997    .0155504
       _cons |   1.229306   .0139682    88.01   0.000     1.201929    1.256683
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
idcode: Identity             |
                   sd(_cons) |   .3835548   .0049445      .3739852    .3933692
-----------------------------+------------------------------------------------
                sd(Residual) |   .2627954    .001518       .259837    .2657875
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 11681.89      Prob >= chibar2 = 0.0000

. scalar var_u = sig2_u

. scalar var_e = sig2_e

. scalar list
     var_e =  .06871457
     var_u =  .14826779

Best regards,

Marcos

Marcos Almeida

Join Date: Apr 2014
Posts: 4047
#5

30 Oct 2019, 05:28

The example above is a random-effects model. This message is just to add that the commands are the same for a fixed-effects model (as you wanted):

Code:

. xtreg ln_wage union c.age, fe

Fixed-effects (within) regression               Number of obs     =     19,229
Group variable: idcode                          Number of groups  =      4,150

R-sq:                                           Obs per group:
     within  = 0.0963                                         min =          1
     between = 0.0433                                         avg =        4.6
     overall = 0.0562                                         max =         12

                                                F(2,15077)        =     803.76
corr(u_i, Xb)  = 0.0127                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       union |   .1055274   .0070879    14.89   0.000     .0916342    .1194205
         age |   .0153507   .0004157    36.92   0.000     .0145358    .0161656
       _cons |   1.248435   .0132474    94.24   0.000     1.222468    1.274401
-------------+----------------------------------------------------------------
     sigma_u |  .42353003
     sigma_e |  .26213464
         rho |  .72302816   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4149, 15077) = 10.12                Prob > F = 0.0000

. gen sig_u = e(sigma_u)

. gen sig_e = e(sigma_e)

. gen sig2_u = sig_u^2

. gen sig2_e = sig_e^2

. list sig_u sig_e sig2_u sig2_e in 1

     +-----------------------------------------+
     |  sig_u      sig_e     sig2_u     sig2_e |
     |-----------------------------------------|
  1. | .42353   .2621346   .1793777   .0687146 |
     +-----------------------------------------+

. scalar var_u = sig2_u

. scalar var_e = sig2_e
 
. scalar list var_u var_e
     var_u =  .17937769
     var_e =  .06871457

Best regards,

Marcos

Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#6

30 Oct 2019, 08:02

That is a very detailed explanation. Thanks a lot.
Well, the point is that I am trying to get something like this:

Code:

gen var_avt=(av_t-r(mean))^2/(r(N)-1)

So, a residual-specific variance (the specificity comes from the presence of "av_t"). That is why I am trying to store it in a variable. I think that

Code:

gen sig_e = e(sigma_e)

gives the overall residual standard deviation. I guess it is the standard deviation of the mean residual or something like that. My doubt regarding

Code:

gen var_avt=(av_t-r(mean))^2/(r(N)-1)

is whether I need to do it by(id) or by(time variable), or whether it is fine to leave it as is.

Thanks again!
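A hedged sketch of how the expression above could be made to run, assuming the nlswork example data in place of the original panel and av_it as the residual variable; contrib, v_all, sd_i and var_i are hypothetical names. Note that r(mean) and r(N) are only defined after a - summarize - call, and that each value of the expression is one observation's contribution to the sample variance rather than a variance in its own right.

```stata
webuse nlswork, clear
xtset idcode
quietly xtreg ln_wage union c.age, fe
predict double av_it if e(sample), e

* r(mean), r(N) and r(Var) are left behind by -summarize-
quietly summarize av_it
scalar v_all = r(Var)
generate double contrib = (av_it - r(mean))^2 / (r(N) - 1)

* The contributions add up to the overall sample variance
quietly summarize contrib
display "sum of contributions = " r(sum) "   r(Var) = " v_all

* One variance per panel unit (missing for single-observation units)
bysort idcode: egen double sd_i = sd(av_it)
generate double var_i = sd_i^2
```

Whether to group by the panel identifier, by the time variable, or not at all depends on which variance is wanted; the by-group version simply swaps the variable after - bysort -.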

Last edited by Federico Nutarelli; 30 Oct 2019, 08:07.
Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#7

30 Oct 2019, 09:22

To put my previous post better, with reference to "Econometric Analysis of Cross Section and Panel Data" (https://jrvargas.files.wordpress.com › 2011/01 › wooldridge_j-_2002_eco..., pp. 271-272): what I would like to obtain are the squared residuals (u_hat_it)^2, not sigma_u.
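A minimal sketch of getting one squared residual per observation, assuming the nlswork example data and taking u_hat_it to be the idiosyncratic residual that - predict, e - returns after xtreg, fe; uhat and uhat2 are hypothetical names:

```stata
webuse nlswork, clear
xtset idcode
quietly xtreg ln_wage union c.age, fe

* Idiosyncratic residual for each observation in the estimation sample
predict double uhat if e(sample), e

* Squared residual, one value per observation
generate double uhat2 = uhat^2

* The mean of uhat2 should be close to, but not identical to,
* e(sigma_e)^2, which applies a degrees-of-freedom correction
summarize uhat2
display "e(sigma_e)^2 = " e(sigma_e)^2
```

If the book's u_hat_it is meant as the composite error (unit effect plus idiosyncratic part), - predict, ue - would be the statistic to look at instead.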