Suest and r-squared?

Maria Ventura

Join Date: Jun 2018

Posts: 40
#1

30 Aug 2021, 03:13

Hi,

Does anybody know whether there is an option to show the R-squared with suest? If not, is there a reason why this is not possible?

Thanks,

Maria
Tags: postestimation, R-squared, suest, sur, sureg
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#2

30 Aug 2021, 04:05

Probably the easiest thing to do is to estimate the system using -sureg- (which will give you an R-squared) rather than -suest-.

Otherwise the R-squared here is non-standard, and I guess this is why -suest- does not report any R-squared.

I do not see a fundamental problem in defining an R-squared from the unrestricted residuals of the whole stacked system and the restricted residuals from the stacked system with only constants.
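
A minimal sketch of that idea, using the auto data shipped with Stata as a stand-in for the real data (the model specifications and scalar names are illustrative only). Each regress call leaves e(r2), e(rss) and e(mss) behind, so the per-equation R-squared is available before suest combines the models. A stacked-system R-squared can then be formed as one minus the pooled residual sum of squares over the pooled total sum of squares, the latter being the residual sum of squares of the constants-only system.

```stata
sysuse auto, clear

* Equation 1: save the sums of squares before the next regress overwrites e()
regress price weight length
estimates store m1
scalar r2_1  = e(r2)
scalar rss_1 = e(rss)
* Centered total sum of squares (the model includes a constant)
scalar tss_1 = e(rss) + e(mss)

* Equation 2
regress mpg weight displacement turn
estimates store m2
scalar r2_2  = e(r2)
scalar rss_2 = e(rss)
scalar tss_2 = e(rss) + e(mss)

* Joint inference with suest, which does not report an R-squared itself
suest m1 m2

* Pooled (stacked-system) R-squared built from the saved sums of squares
scalar r2_sys = 1 - (rss_1 + rss_2)/(tss_1 + tss_2)
display "R-squared, price equation: " r2_1
display "R-squared, mpg equation:   " r2_2
display "Stacked-system R-squared:  " r2_sys
```

Because the dependent variables are on different scales, the pooled figure is dominated by the equation with the larger variance, so it is best read as a descriptive summary. With noconstant, regress bases e(mss) on the uncentered total sum of squares, so the calculation would need adapting.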

Maria Ventura

Join Date: Jun 2018
Posts: 40
#3

30 Aug 2021, 07:05

Hi Joro, thanks! I was actually trying to use sureg + suregr, following our previous conversation in a different thread, but I ran into something odd that made me prefer suest, which seems to run smoothly. I am not sure what I am doing wrong, but sureg seems to give me exactly the same results even when I change part of the sample. In particular:

Code:

qui  sureg (SOf female i.year i.doby, nocons) (SOm female i.year i.doby, nocons) [aw=wt]
esttab, b se keep(female)


----------------------------
                      (1)   
                      SOf   
----------------------------
SOf                         
female            -0.0457***
                (0.00225)   
----------------------------
SOm                         
female             0.0737***
                (0.00209)   
----------------------------
N                   68157   
----------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001


qui sureg (SOf female i.year i.doby if fa_occ_fixed!=100, nocons) (SOm female i.year i.doby if mo_occ_fixed!=100, nocons) [aw=wt]
esttab, b se keep(female)


----------------------------
                      (1)   
                      SOf   
----------------------------
SOf                         
female            -0.0457***
                (0.00225)   
----------------------------
SOm                         
female             0.0737***
                (0.00209)   
----------------------------
N                   68157   
----------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Yet I know for sure that the samples excluding fa_occ_fixed==100 and mo_occ_fixed==100 should be different. Any thoughts?

Andrew Musau

Join Date: Oct 2014

Posts: 10285
#4

30 Aug 2021, 08:41

You may be running into precision problems specifying a particular value. See

Code:

help precision

Try specifying a range that includes that value, e.g.,

Code:

if !inrange(fa_occ_fixed, 99.99, 100.01)
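
A short sketch of the kind of precision problem meant here, using a made-up variable x (not from the data in this thread). Stata stores newly generated variables as float by default but evaluates literals in double precision, so an equality test on a fractional value can fail unexpectedly. Integer values such as 100 are stored exactly in a float, so this only matters for non-integer values.

```stata
clear
set obs 1
* x is a hypothetical variable, stored as float under the default settings
generate x = 0.1
* The literal 0.1 is evaluated in double precision, so this comparison can fail
count if x == 0.1
* Rounding the literal to float precision makes the comparison consistent
count if x == float(0.1)
* A tolerance band, as in the inrange() suggestion, is another option
count if inrange(x, 0.09, 0.11)
```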
Maria Ventura

Join Date: Jun 2018

Posts: 40
#5

30 Aug 2021, 08:48

Hi Andrew, thanks for this. It doesn't seem to solve the problem. I should also say that fa_occ_fixed and mo_occ_fixed take discrete (integer) values. Also, running separate models with reg does not run into the same issue.

Andrew Musau

Join Date: Oct 2014
Posts: 10285
#6

30 Aug 2021, 08:55

Then you may be mistaken in believing that the original sample includes observations with fa_occ_fixed==100 and/or mo_occ_fixed==100.

Code:

qui sureg (SOf female i.year i.doby, nocons) (SOm female i.year i.doby, nocons) [aw=wt]
tab fa_occ_fixed if inrange(fa_occ_fixed, 90, 110) & e(sample)
tab mo_occ_fixed if inrange(mo_occ_fixed, 90, 110) & e(sample)

Maria Ventura

Join Date: Jun 2018
Posts: 40
#7

30 Aug 2021, 09:59

Here's what I get, so those observations seem to be there

Code:

. qui sureg (SOf female i.year i.doby, nocons) (SOm female i.year i.doby, nocons) [aw=wt]

. tab fa_occ_fixed if inrange(fa_occ_fixed, 90, 110) & e(sample)

fa_occ_fixe |
          d |      Freq.     Percent        Cum.
------------+-----------------------------------
         91 |      1,933       25.91       25.91
         92 |        415        5.56       31.47
         93 |      3,618       48.49       79.96
        100 |      1,495       20.04      100.00
------------+-----------------------------------
      Total |      7,461      100.00

. tab mo_occ_fixed if inrange(mo_occ_fixed, 90, 110) & e(sample)

mo_occ_fixe |
          d |      Freq.     Percent        Cum.
------------+-----------------------------------
         91 |      4,740       14.42       14.42
         92 |        327        0.99       15.41
         93 |      1,332        4.05       19.47
        100 |     26,472       80.53      100.00
------------+-----------------------------------
      Total |     32,871      100.00

Andrew Musau

Join Date: Oct 2014
Posts: 10285
#8

30 Aug 2021, 10:33

EDITED: I see what is going on. sureg does not allow you to specify restrictions within equations; you need to do this outside the equations. Here is an example that replicates your issue and a suggested solution.

Code:

sysuse auto, clear
sureg (price weight length) (mpg weight disp turn)
*WITHIN EQUATION RESTRICTIONS NOT APPLIED
sureg (price weight length if !foreign) (mpg weight disp turn if rep78!=3)
*DO IT OUTSIDE THE EQUATIONS
sureg (price weight length) (mpg weight disp turn) if !foreign&rep78!=3

Res.:

Code:

. sureg (price weight length) (mpg weight disp turn)

Seemingly unrelated regression
--------------------------------------------------------------------------
Equation             Obs   Parms        RMSE    "R-sq"       chi2        P
--------------------------------------------------------------------------
price                 74       2    2366.643    0.3474      40.38   0.0000
mpg                   74       3    3.373535    0.6553     141.74   0.0000
--------------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
price        |
      weight |   4.859748   1.089747     4.46   0.000     2.723883    6.995612
      length |   -103.889   37.99741    -2.73   0.006    -178.3625    -29.4154
       _cons |   11015.55   4181.568     2.63   0.008     2819.826    19211.27
-------------+----------------------------------------------------------------
mpg          |
      weight |  -.0056691   .0013804    -4.11   0.000    -.0083747   -.0029636
displacement |   .0058645   .0095299     0.62   0.538    -.0128138    .0245428
        turn |  -.1977258   .1727109    -1.14   0.252     -.536233    .1407815
       _cons |   45.09757   4.786098     9.42   0.000     35.71699    54.47815
------------------------------------------------------------------------------

.
. *WITHIN EQUATION RESTRICTIONS NOT APPLIED

.
. sureg (price weight length if !foreign) (mpg weight disp turn if rep78!=3)

Seemingly unrelated regression
--------------------------------------------------------------------------
Equation             Obs   Parms        RMSE    "R-sq"       chi2        P
--------------------------------------------------------------------------
price                 74       2    2366.643    0.3474      40.38   0.0000
mpg                   74       3    3.373535    0.6553     141.74   0.0000
--------------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
price        |
      weight |   4.859748   1.089747     4.46   0.000     2.723883    6.995612
      length |   -103.889   37.99741    -2.73   0.006    -178.3625    -29.4154
       _cons |   11015.55   4181.568     2.63   0.008     2819.826    19211.27
-------------+----------------------------------------------------------------
mpg          |
      weight |  -.0056691   .0013804    -4.11   0.000    -.0083747   -.0029636
displacement |   .0058645   .0095299     0.62   0.538    -.0128138    .0245428
        turn |  -.1977258   .1727109    -1.14   0.252     -.536233    .1407815
       _cons |   45.09757   4.786098     9.42   0.000     35.71699    54.47815
------------------------------------------------------------------------------

.
. *DO IT OUTSIDE THE EQUATIONS

.
. sureg (price weight length) (mpg weight disp turn) if !foreign&rep78!=3

Seemingly unrelated regression
--------------------------------------------------------------------------
Equation             Obs   Parms        RMSE    "R-sq"       chi2        P
--------------------------------------------------------------------------
price                 25       2    1847.739    0.3151      13.74   0.0010
mpg                   25       3    1.951532    0.8590     153.70   0.0000
--------------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
price        |
      weight |   5.829179   2.121084     2.75   0.006     1.671931    9.986427
      length |  -141.1601   68.26321    -2.07   0.039    -274.9535   -7.366617
       _cons |   14365.48   6913.129     2.08   0.038     815.9994    27914.97
-------------+----------------------------------------------------------------
mpg          |
      weight |  -.0062347   .0014385    -4.33   0.000    -.0090542   -.0034153
displacement |  -.0044553   .0079334    -0.56   0.574    -.0200044    .0110939
        turn |  -.0051424   .2027157    -0.03   0.980    -.4024579     .392173
       _cons |   41.75658   5.477418     7.62   0.000     31.02104    52.49212
------------------------------------------------------------------------------

.

Last edited by Andrew Musau; 30 Aug 2021, 10:46.

Maria Ventura

Join Date: Jun 2018

Posts: 40
#9

30 Aug 2021, 10:41

Thanks! This makes sense! But then I guess I'd necessarily have to impose the same restrictions on both?

Last edited by Maria Ventura; 30 Aug 2021, 10:50.
Andrew Musau

Join Date: Oct 2014

Posts: 10285
#10

30 Aug 2021, 10:49

See my edit in #8.
Maria Ventura

Join Date: Jun 2018

Posts: 40
#11

30 Aug 2021, 10:51

Sorry, I should have added a reply instead of editing my own post! I thought I was going to be faster.
Andrew Musau

Join Date: Oct 2014

Posts: 10285
#12

30 Aug 2021, 11:07

Yes, imposing separate restrictions implies that the estimation samples will differ across equations. sureg works with one estimation sample. If your reason for using sureg is just to combine estimates and you are dealing with different estimation samples, then don't use it. Going to your question in #1, you can stack the samples and jointly estimate your model.
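
A quick way to check which observations sureg actually used is to count e(sample) after each call. The sketch below reuses the auto-data example from #8; it illustrates the check and makes no statement about the data discussed in this thread.

```stata
sysuse auto, clear

* Restrictions written inside the equations
quietly sureg (price weight length if !foreign) (mpg weight displacement turn if rep78 != 3)
count if e(sample)

* The same restrictions written once, after the equations
quietly sureg (price weight length) (mpg weight displacement turn) if !foreign & rep78 != 3
count if e(sample)
```

In the output posted in #8, the sureg header shows 74 observations per equation for the first form and 25 for the second, so comparing the two counts is a simple guard against a restriction being silently dropped.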
Maria Ventura

Join Date: Jun 2018

Posts: 40
#13

31 Aug 2021, 03:52

Thanks! Do you mean something like the estimation part of this https://www.stata.com/support/faqs/s...-coefficients/ ? I am not sure that is ideal either as this method seems to assume the samples are separate while mine still partly overlap.

Andrew Musau

Join Date: Oct 2014
Posts: 10285

#14

31 Aug 2021, 05:43

That is exactly what suest assumes too, regardless of what number of observations it reports. Not convinced? Look at this example where one sample is a subsample of another.

Code:

sysuse auto, clear
regress mpg weight displacement if !foreign
estimates store m1
regress price weight displacement
estimates store m2
suest m1 m2
*STACKING
rename (mpg price) (depvar1 depvar2)
reshape long depvar, i(make) j(which)
replace depvar=. if foreign & which==1
gen cons=1
regress depvar i.which#(c.weight c.displacement c.cons), nocons robust
glm depvar i.which#(c.weight c.displacement c.cons), nocons robust

regress reports t-statistics and suest reports z-statistics. You can obtain output with z-statistics using maximum likelihood estimation (glm). As you can see, the coefficients are identical and the standard errors nearly so.

Res.:

Code:

. suest m1 m2

Simultaneous results for m1, m2

                                                Number of obs     =         74

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
m1_mean      |
      weight |   -.005813   .0007952    -7.31   0.000    -.0073715   -.0042545
displacement |  -.0015662   .0068044    -0.23   0.818    -.0149025    .0117701
       _cons |   39.47539   1.785188    22.11   0.000     35.97649     42.9743
-------------+----------------------------------------------------------------
m1_lnvar     |
       _cons |    1.69459   .2940434     5.76   0.000     1.118275    2.270904
-------------+----------------------------------------------------------------
m2_mean      |
      weight |   1.823366   .7701043     2.37   0.018     .3139893    3.332743
displacement |   2.087054   7.334384     0.28   0.776    -12.28807    16.46218
       _cons |    247.907    1114.02     0.22   0.824    -1935.532    2431.346
-------------+----------------------------------------------------------------
m2_lnvar     |
       _cons |   15.66274   .1788432    87.58   0.000     15.31221    16.01327
------------------------------------------------------------------------------


. regress depvar i.which#(c.weight c.displacement c.cons), nocons robust

Linear regression                               Number of obs     =        126
                                                F(6, 120)         =     776.22
                                                Prob > F          =     0.0000
                                                R-squared         =     0.8694
                                                Root MSE          =     1937.1

--------------------------------------------------------------------------------------
                     |               Robust
              depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
      which#c.weight |
                  1  |   -.005813   .0008093    -7.18   0.000    -.0074153   -.0042107
                  2  |   1.823366    .783772     2.33   0.022     .2715519     3.37518
                     |
which#c.displacement |
                  1  |  -.0015662   .0069251    -0.23   0.821    -.0152775    .0121451
                  2  |   2.087054   7.464554     0.28   0.780    -12.69224    16.86635
                     |
        which#c.cons |
                  1  |   39.47539   1.816872    21.73   0.000     35.87811    43.07267
                  2  |    247.907   1133.792     0.22   0.827    -1996.922    2492.736
--------------------------------------------------------------------------------------

.
. glm depvar i.which#(c.weight c.displacement c.cons), nocons robust

Iteration 0:   log pseudolikelihood = -1129.4019  

Generalized linear models                         Number of obs   =        126
Optimization     : ML                             Residual df     =        120
                                                  Scale parameter =    3752480
Deviance         =  450297612.9                   (1/df) Deviance =    3752480
Pearson          =  450297612.9                   (1/df) Pearson  =    3752480

Variance function: V(u) = 1                       [Gaussian]
Link function    : g(u) = u                       [Identity]

                                                  AIC             =   18.02225
Log pseudolikelihood =  -1129.40191               BIC             =   4.50e+08

--------------------------------------------------------------------------------------
                     |               Robust
              depvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
      which#c.weight |
                  1  |   -.005813   .0007929    -7.33   0.000    -.0073671   -.0042589
                  2  |   1.823366   .7679366     2.37   0.018     .3182379    3.328494
                     |
which#c.displacement |
                  1  |  -.0015662   .0067852    -0.23   0.817     -.014865    .0117326
                  2  |   2.087054   7.313739     0.29   0.775    -12.24761    16.42172
                     |
        which#c.cons |
                  1  |   39.47539   1.780163    22.18   0.000     35.98634    42.96445
                  2  |    247.907   1110.884     0.22   0.823    -1929.386      2425.2
--------------------------------------------------------------------------------------

.

Maria Ventura

Join Date: Jun 2018

Posts: 40
#15

31 Aug 2021, 07:45

I see Andrew, thanks! I guess I thought introducing a stacked sample with similar observations would introduce other types of correlation, but I imagine that this is exactly the correlation we'd want to account for when estimating a sur model?
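
A sketch of one way to let the standard errors reflect that the same car contributes a row to both equations: cluster on the original observation in the stacked regression. It reuses the stacking steps from #14; carid is a hypothetical identifier introduced here only for illustration.

```stata
sysuse auto, clear

* Stack the two dependent variables, as in #14
rename (mpg price) (depvar1 depvar2)
reshape long depvar, i(make) j(which)
replace depvar = . if foreign & which == 1
generate cons = 1

* carid (hypothetical name) identifies the original car in the stacked data
egen carid = group(make)

* Same point estimates as the stacked regression in #14, with standard
* errors clustered on the original observation
regress depvar i.which#(c.weight c.displacement c.cons), nocons vce(cluster carid)
```

The point estimates do not depend on the choice of vce(); only the standard errors change, and comparing them with the robust ones gives a sense of how much the overlap matters in a given data set.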

On a more general note, would you say it matters to test for correlation of the residuals before using this type of estimation?
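
One way to look at this, when all equations share a single estimation sample, is the corr option of sureg, which displays the correlation matrix of the residuals across equations together with a Breusch-Pagan test of independence. A minimal sketch on the auto data (a stand-in, not the data discussed in this thread):

```stata
sysuse auto, clear

* corr adds the cross-equation residual correlation matrix and the
* Breusch-Pagan test of independence to the sureg output
sureg (price weight length) (mpg weight displacement turn), corr
```

A small test statistic suggests that joint estimation gains little efficiency over equation-by-equation OLS, although a joint covariance matrix may still be wanted for cross-equation tests. When every equation has the same regressors, as in the sureg calls earlier in this thread, SUR point estimates in general coincide with equation-by-equation OLS whatever the residual correlation.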