Include omitted variables in estimates

Charlie Joyez

#1

25 Sep 2018, 00:42

Dear Stata users,

I'm running several sub-sample regressions on a factor variable and would like to build a matrix containing the coefficient vector from each regression.
However, when a sub-sample doesn't include enough observations for a category of the explanatory variable, that category doesn't appear in the estimates.
The vectors I recover are therefore not directly comparable, since the xth row doesn't necessarily correspond to the same variable in each.

Here's an example.

Code:

sysuse auto2.dta, clear
reg price ib5.rep78 if foreign==0
matrix b0=e(b)'
reg price ib5.rep78 if foreign==1
matrix b1=e(b)'
matrix list b0
matrix list b1
matrix M=b0,b1

As you can see, the two matrices don't have the same dimensions, because categories 1 and 2 of rep78 are omitted from the second regression.
Therefore the final line causes an error, and the coefficient associated with category 3 is in the third row of b0 but in the first row of b1.

I'd like to add two rows of missing values at the top of b1, to make the matrices comparable and of the same dimension.

Thanks a lot,
Charlie

Andrew Musau

25 Sep 2018, 05:27

Code:

matrix t = J(2,1,.)
mat b2= t\b1
mat M= b0, b2

Code:

. matrix M=b0,b2

. mat list M

M[6,2]
                  y1          c1
 1.rep78         360           .
 2.rep78    1763.125           .
 3.rep78   2402.5741       -1464
 4.rep78   1677.0556  -31.222222
5b.rep78           0           0
   _cons      4204.5   6292.6667

If missing rows are not consecutive, you will need to append several matrices.

Last edited by Andrew Musau; 25 Sep 2018, 05:34.

Charlie Joyez

#3

25 Sep 2018, 05:55

Thanks Andrew,
Pretty smart and easy fix indeed!

However, that's not entirely satisfying: my matrix will be 402*402 and nothing guarantees that the missing rows are consecutive, so it still implies a lot of work that I'd like to automate.

I'm thinking about recovering each _b[X] after each regression and assigning it to the Xth row of a vector, then appending the 402 vectors, but I'm still struggling with missing values (when _b[X] doesn't exist).
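
A minimal sketch of that idea on the auto2 example, not a tested solution for the 402-category case. It assumes that _b[X] raises an error when X is not in e(b), so that capture can skip the assignment and leave the cell missing. The local `terms` and the column names `domestic` and `foreign` are hypothetical choices for illustration.

```stata
sysuse auto2, clear

* Hypothetical template: every term that can appear in any sub-sample
local terms 1.rep78 2.rep78 3.rep78 4.rep78 5.rep78 _cons
local n : word count `terms'

* One column per sub-sample, pre-filled with missing values
matrix M = J(`n', 2, .)
matrix rownames M = `terms'
matrix colnames M = domestic foreign

forvalues f = 0/1 {
    quietly regress price ib5.rep78 if foreign == `f'
    local col = `f' + 1
    local row = 0
    foreach t of local terms {
        local ++row
        * Assumption: _b[] fails for a term absent from e(b);
        * capture then leaves the cell at its missing default
        capture matrix M[`row', `col'] = _b[`t']
    }
}
matrix list M
```

Because every coefficient is written to a row fixed by the template, the missing rows need not be consecutive.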

Thanks anyway
Best,
Charlie

Andrew Musau

25 Sep 2018, 06:12

I would use esttab (from the estout package by Ben Jann, Stata Journal) and retrieve the coefficient matrix from r(coefs).

Code:

sysuse auto, clear
eststo: qui reg price mpg i.rep78 weight
eststo: qui reg price mpg weight
qui esttab
mat list r(coefs)

Code:


. mat list r(coefs)

r(coefs)[8,6]
               est1:       est1:       est1:       est2:       est2:       est2:
                  b           t           p           b           t           p
    mpg  -63.097096  -.72149918   .47331463  -49.512221  -.57468079   .56732373
1.rep78           0           .           .          .z          .z          .z
2.rep78   753.70237   .39260175   .69596018          .z          .z          .z
3.rep78    1349.361   .76118707   .44943089          .z          .z          .z
4.rep78     2030.47   1.1217511   .26629404          .z          .z          .z
5.rep78   3376.9103   1.7771624   .08044658          .z          .z          .z
 weight   2.0930663    3.286329   .00167346   1.7465592   2.7232382   .00812981
  _cons   -598.9665  -.15121963   .88029342   1946.0687   .54101802   .59018863

Here, you can just keep the estimates if you don't need the t-statistics and p-values.
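
One possible way to keep only the estimates is sketched below. It assumes the default esttab layout shown above, where r(coefs) holds three columns (b, t, p) per stored model, so that every third column is a coefficient column. The matrix names `C` and `B` are hypothetical.

```stata
* Requires the estout package (ssc install estout)
sysuse auto, clear
eststo clear
eststo: quietly regress price mpg i.rep78 weight
eststo: quietly regress price mpg weight
quietly esttab

* Copy r(coefs) right away, before another command overwrites r()
matrix C = r(coefs)

* Assumption: three columns (b, t, p) per model, as in the listing above
local k = colsof(C) / 3

* Start with the first model's b column, then append the others
matrix B = C[1..., 1]
forvalues j = 2/`k' {
    local c = 3 * (`j' - 1) + 1
    matrix B = B, C[1..., `c']
}
matrix list B
```

Terms that are absent from a model keep the .z missing code that esttab stores in r(coefs), so every column of `B` has the same rows in the same order.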

Charlie Joyez

#5

25 Sep 2018, 06:36

Thanks for the suggestion, it seems to work (with an extraction afterwards).