Problem with Prob > F at OLS regression for baseline model

Paul Olz

Join Date: Nov 2022

Posts: 27
#1

04 Jan 2023, 03:50

Hello everyone,

Unfortunately, I discovered a surprising problem today when looking at the final output of my analysis. I am testing the direct effect of X on Y as well as two moderators (m1, m2). The first regression model includes only the DV and its controls, the second adds the direct effect, the third adds the first moderator, and so on.
All my R-squared values and F statistics are fine except for the first Prob > F, which at 0.35 is not satisfactory. Has anybody encountered a similar problem, or does anybody know how to handle it?

Because of autocorrelation and heteroscedasticity, I added fe and vce(robust) to my -xtreg- command. When I specify only the fixed effects (without vce(robust)) in the first "baseline" model, the Prob > F value changes to 0.000. Can I drop vce(robust) for the first model?

Thanks for any support regarding this topic!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

04 Jan 2023, 04:02

Paul:
as per FAQ, please post what you typed and what Stata gave you back. Thanks.

Last edited by Carlo Lazzaro; 04 Jan 2023, 04:24.

Kind regards,
Carlo
(Stata 19.0)
Paul Olz

Join Date: Nov 2022

Posts: 27
#3

04 Jan 2023, 04:17

Input:

xtreg Y c1 c2 c3 c4 c5, fe vce(robust)

Output:

Fixed-effects (within) regression               Number of obs     =      1,449
Group variable: id                              Number of groups  =        484

R-squared:                                      Obs per group:
     Within  = 0.0305                                         min =          1
     Between = 0.0413                                         avg =        3.8
     Overall = 0.0306                                         max =          9

                                                F(5,381)          =       1.11
corr(u_i, Xb) = -0.0030                         Prob > F          =     0.3566

                                   (Std. err. adjusted for 484 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           Y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          c1 |  -.0007456   .0007229    -1.03   0.303     -.002167    .0006757
          c2 |   .8998507   .5137411     1.75   0.081    -.1102722    1.909974
          c3 |  -.0649551   .0473791    -1.37   0.171    -.1581124    .0282023
          c4 |  -1.40e-07   3.30e-06    -0.04   0.966    -6.62e-06    6.34e-06
          c5 |  -.1907711    .094021    -2.03   0.043    -.3756361   -.0059061
       _cons |   3.584927   .7621762     4.70   0.000     2.086329    5.083526
-------------+----------------------------------------------------------------
     sigma_u |  1.5882512
     sigma_e |  1.1893466
         rho |  .64071273   (fraction of variance due to u_i)
------------------------------------------------------------------------------
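
A side note on the "(Std. err. adjusted for ... clusters in id)" line: with -xtreg, fe-, vce(robust) is treated as clustering on the panel variable, so the denominator degrees of freedom of the overall F follow the number of clusters rather than the number of observations. A minimal sketch of that equivalence, assuming the nlswork example data can be loaded with -webuse- and using age and tenure as arbitrary stand-ins (they are not the variables above):

```stata
* Sketch on the nlswork example data (an assumption; not the poster's data)
webuse nlswork, clear

* vce(robust) after -xtreg, fe-
quietly xtreg ln_wage age tenure, fe vce(robust)
matrix V_robust = e(V)
display "clusters: " e(N_clust) "   denominator df: " e(df_r)

* explicit clustering on the panel identifier
quietly xtreg ln_wage age tenure, fe vce(cluster idcode)
matrix V_cluster = e(V)

* relative difference between the two variance matrices; expected to be (near) zero
display mreldif(V_robust, V_cluster)
```

If the two variance matrices coincide, the robust run is a clustered run, and its F statistic should be read with that in mind.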

Paul Olz

Join Date: Nov 2022

Posts: 27
#4

04 Jan 2023, 04:21

Sorry for the bad formatting, I do not know why it changed when I posted it ...

The Prob > F = 0.3566 is my main problem.

Here is the output if I remove vce(robust):

. xtreg Y c1 c2 c3 c4 c5, fe

                                                F(5,1080)         =       6.79
corr(u_i, Xb) = -0.0030                         Prob > F          =     0.0000

------------------------------------------------------------------------------
           Y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          c1 |  -.0007456   .0003232    -2.31   0.021    -.0013799   -.0001114
          c2 |   .8998507   .2292307     3.93   0.000     .4500628    1.349639
          c3 |  -.0649551   .0230962    -2.81   0.005    -.1102736   -.0196365
          c4 |  -1.40e-07   6.92e-06    -0.02   0.984    -.0000137    .0000134
          c5 |  -.1907711   .0801047    -2.38   0.017    -.3479496   -.0335926
       _cons |   3.584927   .3353477    10.69   0.000     2.926921    4.242934
-------------+----------------------------------------------------------------
     sigma_u |  1.5882512
     sigma_e |  1.1893466
         rho |  .64071273   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(381, 1080) = 6.59                   Prob > F = 0.0000
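
Note that the coefficients are identical in the two runs; only the standard errors, and with them the overall F and its denominator degrees of freedom, change. The sketch below illustrates this on the nlswork example data (an assumption, as are the regressors age and tenure, which merely stand in for c1-c5) and checks that the F reported under vce(robust) should match a joint Wald test of the slopes from -testparm-:

```stata
* Sketch on the nlswork example data (an assumption; not the poster's data)
webuse nlswork, clear

* conventional standard errors
quietly xtreg ln_wage age tenure, fe
matrix b_default = e(b)
display "default: F = " e(F) ",  df_r = " e(df_r)

* cluster-robust standard errors
quietly xtreg ln_wage age tenure, fe vce(robust)
matrix b_robust = e(b)
display "robust:  F = " e(F) ",  df_r = " e(df_r)

* point estimates should coincide across the two runs
display mreldif(b_default, b_robust)

* the overall F should match a joint Wald test of all slopes
testparm age tenure
display "testparm F = " r(F) "  vs  e(F) = " e(F)
```

Seen this way, Prob > F in the baseline model is simply the p-value of the joint test that all five controls have zero coefficients, evaluated with whichever variance estimator was requested.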

Last edited by Paul Olz; 04 Jan 2023, 04:23.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#5

04 Jan 2023, 04:28

Paul:
I would not be worried about a non-significant F-test; it's the very low within R-squared (0.0305) that stands out here.
In all likelihood, despite the evidence of a panel-wise effect, your model needs more predictors and/or interactions.
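
The link between the within R-squared and the conventional (non-robust) F can be checked by hand, since F = (R2_w/k) / ((1 - R2_w)/df_r). A rough sketch using only the numbers reported in #3 and #4 (the scalar names are made up for illustration):

```stata
* numbers taken from the outputs in #3 and #4
scalar r2_w = 0.0305      // within R-squared
scalar k    = 5           // number of slopes tested
scalar df_r = 1080        // denominator df from F(5,1080)

* conventional F implied by the within R-squared
scalar F_implied = (r2_w/k) / ((1 - r2_w)/df_r)
display "implied F = " %5.2f F_implied

* p-values implied by the two reported F statistics
display "p (default) = " %6.4f Ftail(5, 1080, 6.79)
display "p (robust)  = " %6.4f Ftail(5, 381, 1.11)
```

The implied F comes out at roughly 6.8, in line with the reported 6.79 once the rounding of the within R-squared is taken into account. The robust F of 1.11 has no such shortcut, because it is a Wald test built from the cluster-robust variance matrix.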

Kind regards,
Carlo
(Stata 19.0)
Paul Olz

Join Date: Nov 2022

Posts: 27
#6

04 Jan 2023, 04:33

First of all, thank you very much Carlo for your really fast reply!!

When I add more variables and moderating effects, the F-test becomes significant. Therefore, I was concerned about how to describe my findings, because I am unsure how much it matters to have one insignificant model among 7 models.

But to summarize your suggestion: I should not worry about one non-significant F-test if the others are significant and the R-squared increases steadily. Am I right?
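
One way to lay out such a sequence of nested models is to store each fit, tabulate them side by side, and test the added block directly rather than relying on the overall F of each model. A minimal sketch, assuming the nlswork example data, with age standing in for the controls and tenure for the added regressor (both are placeholders, not the variables in this thread):

```stata
* Sketch on the nlswork example data (an assumption; not the poster's data)
webuse nlswork, clear

* keep one common estimation sample so the models are comparable
keep if !missing(ln_wage, age, tenure)

* model 1: controls only (age is a stand-in for the controls)
quietly xtreg ln_wage age, fe vce(cluster idcode)
estimates store m1

* model 2: adds the regressor of interest (tenure is a stand-in)
quietly xtreg ln_wage age tenure, fe vce(cluster idcode)
estimates store m2

* side-by-side table with N, within R-squared, and overall F
estimates table m1 m2, b(%9.4f) se(%9.4f) stats(N r2_w F)

* Wald test of the block added in model 2 (m2 is still the active model)
testparm tenure
```

The -testparm- line answers the more targeted question of whether the newly added terms matter, which is usually what a stepwise table of models is meant to show.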

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#7

04 Jan 2023, 04:56

Paul:
generally speaking, your interpretation is correct.
Basically, the F-test tells you whether the mean of the dependent variable is equally (p>0.05) or less informative (p<0.05) than the regression.
However, in this case, the F-test warns about a possible misspecification of the functional form of the regressand (which, by extension, can be read as misspecification of the entire model); this should be checked as per the following toy-example:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1162                                         min =          1
     Between = 0.1078                                         avg =        6.1
     Overall = 0.0932                                         max =         15

                                                F(16,4709)        =      79.11
corr(u_i, Xb) = 0.0613                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
             |
 c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
             |
        year |
         69  |   .0647054   .0155249     4.17   0.000     .0342693    .0951415
         70  |   .0284423   .0264639     1.07   0.283    -.0234395     .080324
         71  |   .0579959   .0384111     1.51   0.131    -.0173078    .1332996
         72  |   .0510671   .0502675     1.02   0.310    -.0474808     .149615
         73  |   .0424104   .0624924     0.68   0.497    -.0801038    .1649247
         75  |   .0151376    .086228     0.18   0.861    -.1539096    .1841848
         77  |   .0340933   .1106841     0.31   0.758    -.1828994     .251086
         78  |   .0537334   .1232232     0.44   0.663    -.1878417    .2953084
         80  |   .0369475   .1473725     0.25   0.802    -.2519716    .3258667
         82  |   .0391687   .1715621     0.23   0.819    -.2971733    .3755108
         83  |    .058766   .1836086     0.32   0.749    -.3011928    .4187249
         85  |   .1042758   .2080199     0.50   0.616    -.3035406    .5120922
         87  |   .1242272   .2327328     0.53   0.594    -.3320379    .5804922
         88  |   .1904977   .2486083     0.77   0.444    -.2968909    .6778863
             |
       _cons |   .3937532   .2469015     1.59   0.111    -.0902893    .8777957
-------------+----------------------------------------------------------------
     sigma_u |  .40275174
     sigma_e |  .30127563
         rho |  .64120306   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. predict fitted, xb
(24 missing values generated)

. g sq_fitted=fitted^2
(24 missing values generated)

. xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1164                                         min =          1
     Between = 0.1094                                         avg =        6.1
     Overall = 0.0941                                         max =         15

                                                F(2,4709)         =     586.29
corr(u_i, Xb) = 0.0619                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.012332   .5365254     3.75   0.000     .9604909    3.064172
   sq_fitted |  -.3040363   .1616996    -1.88   0.060    -.6210431    .0129706
       _cons |  -.8379964    .443929    -1.89   0.059    -1.708305    .0323122
-------------+----------------------------------------------------------------
     sigma_u |  .40239556
     sigma_e |  .30114591
         rho |  .64099409   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

       F(  1,  4709) =    3.54
            Prob > F =    0.0601

.

As the -test- outcome does not reject the null at the 5% level (p = 0.0601), there's no evidence of model misspecification.

Last edited by Carlo Lazzaro; 04 Jan 2023, 04:59.

Kind regards,
Carlo
(Stata 19.0)
