OLS regression - Statalist

HADYSYAM JUNAIDI

Join Date: Aug 2016

Posts: 10
#16

01 Sep 2016, 04:06

Dear Carlo,

The -depvar- is a dichotomous score of 0 and 1, which has been converted into a continuous score (i.e. a ratio).

With regard to -xtreg-, what is the sequence of commands in Stata to run it?

Do I need to conduct diagnostic checks (multicollinearity, heteroskedasticity and serial correlation) before deciding on OLS, for example if the results indicate serial correlation problems?

Thank you.

Hadysyam
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#17

01 Sep 2016, 04:27

Hadysyam:
thanks for providing further details.
The code for a linear panel data regression is (for further details, please take a look at -xtreg- entry in Stata .pdf manual):

Code:

xtset <panelid> <datawaveid>
xtreg <depvar> <indepvars>, fe  // if -hausman- test suggests fixed effect specification
xtreg <depvar> <indepvars>, re  // if -hausman- test suggests random effect specification

If you suspect heteroskedasticity and/or serial correlation, use -robust- standard errors (please note that under -xt- commands the -robust- option works the same as the -cluster- option, whereas this is not true for -regress-, where you should use the -cluster- option if you want to run a pooled OLS).
Finally, please note that:
- the -hausman- test does not work with robustified or clustered standard errors;
- Stata omits variables when (extremely) collinear.
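
As a rough illustration of the -hausman- step mentioned above, the usual sequence is to fit both specifications with default standard errors, store each set of estimates, and then compare them. This is a minimal sketch only: the nlswork example dataset and the variables ln_wage, age, tenure and ttl_exp are stand-ins that do not come from this thread, and the stored-estimate names fixed and random are arbitrary.

```stata
* Stand-in data: nlswork is one of Stata's online example datasets (needs internet access)
webuse nlswork, clear
xtset idcode year

* Fit both specifications with default (non-robust) standard errors
xtreg ln_wage age tenure ttl_exp, fe
estimates store fixed
xtreg ln_wage age tenure ttl_exp, re
estimates store random

* Hausman test: consistent estimator (fe) first, efficient estimator (re) second
hausman fixed random
```

A small p-value would point towards the fixed effect specification and a large one towards random effects; the robust standard errors can then be requested when re-fitting the chosen model.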

Kind regards,
Carlo
(Stata 19.0)
Jimmy Yang

Join Date: May 2015

Posts: 54
#18

01 Sep 2016, 04:58

Hi, from my point of view the model is overfitted. You can use

Code:

simulate

to determine whether it is overfitting.
A Bayesian approach is probably more pertinent for this research.
HADYSYAM JUNAIDI

Join Date: Aug 2016

Posts: 10
#19

02 Sep 2016, 01:56

Dear Statalist members,

- I have run the appropriate tests to choose among pooled OLS, random effects (RE) and fixed effects (FE) models, with the following results:

1. Pooled OLS vs. RE: the Breusch and Pagan Lagrangian multiplier test supports RE

Code:

Prob > chibar2 = 0.0000

2. RE vs. FE: the Hausman test opts for RE

Code:

Prob>chi2 = 0.7298

- Also, I have performed diagnostic checks, with the following results:
1. Multicollinearity

Code:

Mean VIF | 2.92

2. Serial correlation

Code:

Prob > F = 0.0341

- However, the Breusch-Pagan / Cook-Weisberg test and White's test give two different results for heteroskedasticity:

Code:

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of lsladi

         chi2(1)      =     3.02
         Prob > chi2  =   0.0823

Code:

White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(14)     =     47.54
         Prob > chi2  =    0.0000

- I'm not sure if I can use the result from White's test, which indicates heteroskedasticity problems.

- Based on the earlier tests, which suggest RE, and assuming that the diagnostic checks reveal serial correlation and heteroskedasticity problems, what would be the appropriate command to run the OLS regression?
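
For readers who want to reproduce checks of this kind, the sketch below lists commands whose output headers match the results quoted above. The thread does not say which commands were actually run, so the choice is an assumption; in particular, the serial correlation result with "Prob > F" looks like the Wooldridge test from the community-contributed -xtserial-, but that is a guess. The nlswork dataset and its variables are stand-ins, not the data discussed here.

```stata
* Stand-in data: nlswork is one of Stata's online example datasets (needs internet access)
webuse nlswork, clear
xtset idcode year

* Pooled OLS vs. RE: Breusch and Pagan LM test (reports Prob > chibar2)
xtreg ln_wage age tenure ttl_exp, re
xttest0

* Multicollinearity and heteroskedasticity checks after pooled OLS
regress ln_wage age tenure ttl_exp
estat vif
estat hettest
estat imtest, white

* Serial correlation: Wooldridge test (reports Prob > F)
* -xtserial- is community-contributed; locate it with -search xtserial-
xtserial ln_wage age tenure ttl_exp
```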

Regards,

Hadysyam
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#20

02 Sep 2016, 02:05

Hadysyam:
after a bit of debate, I still can't see why you do not want to use -xtreg- and stick with pooled OLS instead (which does not seem to be indicated in your case).
That said, you can accommodate heteroskedasticity and/or serial correlation by imposing -robust- (or, equivalently, -cluster-) standard errors:

Code:

xtset <panelid> <datawaveid>
xtreg <depvar> <indepvars>, re vce(robust)

Kind regards,
Carlo
(Stata 19.0)
HADYSYAM JUNAIDI

Join Date: Aug 2016

Posts: 10
#21

02 Sep 2016, 03:16

Dear Carlo,

Now, I'm much clearer about the right method in my analysis. Perhaps, I was a bit influenced by the previous disclosure studies which seem to use OLS linear regression in their analysis.

Regarding heteroskedasticity, for which the Breusch-Pagan / Cook-Weisberg test and White's test give two different results, do I need to run another check to confirm it, such as plotting the residuals versus fitted (predicted) values?
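
A residual-versus-fitted plot of the kind mentioned here can be drawn with -rvfplot- after -regress-. The lines below are a minimal sketch on stand-in data (the nlswork example dataset and its variables are assumptions, not the data from this thread); a fan or funnel shape in the scatter would hint at non-constant variance.

```stata
* Stand-in data: nlswork is one of Stata's online example datasets (needs internet access)
webuse nlswork, clear

* Pooled OLS, then residuals plotted against fitted values with a reference line at zero
regress ln_wage age tenure ttl_exp
rvfplot, yline(0)
```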

Thank you so much for your guidance.

Hadysyam
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#22

02 Sep 2016, 03:22

Although I suppose these things vary by field, most people simply use "robust" standard errors always. I haven't seen a heteroskedasticity test in an economic journal in ages.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#23

02 Sep 2016, 04:00

Hadysyam:
just go -robust- with -xtreg- and make your life simpler!

Kind regards,
Carlo
(Stata 19.0)

HADYSYAM JUNAIDI

Join Date: Aug 2016
Posts: 10

#24

06 Sep 2016, 07:47

Dear Statalist members,

As my data have heteroskedasticity and serial correlation problems, I have tried -robust- with the -xtreg- command for the random effects model (chosen on the basis of the -hausman- test). The result is as follows:

Code:

. xtreg sladi ti lsz qp ai rg, re vce(robust)

Random-effects GLS regression                   Number of obs     =        104
Group variable: code                            Number of groups  =         26

R-sq:                                           Obs per group:
     within  = 0.0019                                         min =          4
     between = 0.9975                                         avg =        4.0
     overall = 0.9969                                         max =          4

                                                Wald chi2(4)      =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .

                                  (Std. Err. adjusted for 26 clusters in code)
------------------------------------------------------------------------------
             |               Robust
       sladi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ti |   .0096984    .000045   215.40   0.000     .0096102    .0097867
         lsz |  -.0013554   .0008318    -1.63   0.103    -.0029856    .0002748
          qp |   .0176582   .0107297     1.65   0.100    -.0033717    .0386881
          ai |  -.0002205   .0007222    -0.31   0.760    -.0016361     .001195
          rg |   .5499816   .0099783    55.12   0.000     .5304245    .5695388
       _cons |   .3376315   .0115353    29.27   0.000     .3150228    .3602402
-------------+----------------------------------------------------------------
     sigma_u |  .00820891
     sigma_e |  .00454183
         rho |  .76562666   (fraction of variance due to u_i)
------------------------------------------------------------------------------

So far I haven't found any accounting disclosure literature using a random effects model (GLS). Because of this, I have also tried the Stata -regress- command with the -robust- option for estimating the standard errors, and the result is as below:

Code:

. regress sladi ti lsz qp ai rg, robust

Linear regression                               Number of obs     =        104
                                                F(5, 98)          >   99999.00
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9969
                                                Root MSE          =     .00868

------------------------------------------------------------------------------
             |               Robust
       sladi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ti |   .0096911   .0006034    16.06   0.000     .0084937    .0108885
         lsz |  -.0012211   .0006602    -1.85   0.067    -.0025312     .000089
          qp |   .0173349   .0056821     3.05   0.003     .0060589    .0286109
          ai |  -.0012297   .0022064    -0.56   0.579    -.0056081    .0031488
          rg |   .5496299   .0055649    98.77   0.000     .5385865    .5606733
       _cons |   .3359687   .0092213    36.43   0.000     .3176694    .3542681
------------------------------------------------------------------------------

- May I get some opinions and interpretation on the above results?

Regards,

Hadysyam


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#25

06 Sep 2016, 08:15

Hadysyam:
- as Jimmy noted, your regression models suffer from overfitting (too many predictors for a quite small sample size). You should also note the sky-rocketing R2 despite not all the coefficients reaching statistical significance: I would suspect a quasi-multicollinearity issue with your data.
- you should have used a clustered SE in the pooled OLS, as your observations are not independent.
Please note that, unlike -xtreg-, the -robust- option for -regress- accommodates heteroskedasticity only.
To wrap up, you're asking too much of your data: you need a more parsimonious regression model, no matter if you go pooled OLS or -xtreg-.
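
To make the clustered-SE point concrete, a minimal sketch follows. It assumes the dataset from post #24 is in memory, with -code- as the panel identifier (as reported under "Group variable" in the -xtreg- output); the -correlate- line is only one informal way to look at near-collinearity among the predictors and is not a formal test.

```stata
* Assumes the dataset from post #24 is in memory; -code- is the panel identifier
xtset code

* Pairwise correlations as a quick look at near-collinearity among predictors
correlate ti lsz qp ai rg

* Pooled OLS with standard errors clustered on the panel identifier
regress sladi ti lsz qp ai rg, vce(cluster code)

* Random effects with -robust- (that is, clustered on code) standard errors
xtreg sladi ti lsz qp ai rg, re vce(robust)
```

With clustering on -code- in both commands, the two sets of standard errors allow for the same within-panel dependence, which makes them easier to compare than the output in post #24.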

Kind regards,
Carlo
(Stata 19.0)