
  • #16
    Dear Carlo,

    The -depvar- is a dichotomous score (0 or 1) that has been converted into a continuous score (i.e. a ratio).

    With regard to -xtreg-, what sequence of Stata commands should I use?

    Do I need to run diagnostic checks (multicollinearity, heteroskedasticity and serial correlation) before choosing the best form of OLS, for example if the results indicate serial correlation?

    Thank you.

    Hadysyam

    Comment


    • #17
      Hadysyam:
      thanks for providing further details.
      The code for a linear panel data regression is as follows (for further details, please see the -xtreg- entry in the Stata .pdf manual):
      Code:
      xtset <panelid> <datawaveid>
      xtreg <depvar> <indepvars>, fe // if -hausman- test suggests fixed-effects specification
      xtreg <depvar> <indepvars>, re // if -hausman- test suggests random-effects specification
      If you suspect heteroskedasticity and/or serial correlation, use -robust- standard errors (please note that under -xt- commands the -robust- option works the same as the -cluster- option, whereas this is not true for -regress-, where you should use the -cluster- option if you want to run a pooled OLS).
      Finally, please note that:
      - the -hausman- test does not work with robustified or clustered standard errors;
      - Stata omits variables when they are (extremely) collinear.
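      The -hausman- step mentioned above can be sketched as follows. This is only an illustration: it assumes Stata's shipped example dataset -nlswork- and an arbitrary choice of regressors (ln_wage, age, tenure), none of which come from this thread. Both models are fitted with default standard errors because -hausman- does not accept robust or clustered ones.

      ```stata
      * Illustrative sketch on Stata's example panel data (not the poster's data)
      webuse nlswork, clear
      xtset idcode year

      * Fit both specifications with default (non-robust) standard errors
      xtreg ln_wage age tenure, fe
      estimates store fe
      xtreg ln_wage age tenure, re
      estimates store re

      * Hausman test: the consistent estimator (fe) is listed first
      hausman fe re
      ```

      A small p-value points towards the fixed-effects specification; otherwise random effects is usually retained, and the chosen model can then be refitted with -vce(robust)-.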
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #18
        Hi, from my point of view the model is overfitted. You can use
        Code:
        simulate
        to check whether it is overfitted.
        A Bayesian approach is probably more pertinent for this research.

        Comment


        • #19
          Dear Statalist members,

          - I have run the appropriate tests to choose among pooled OLS, random-effects (RE) and fixed-effects (FE) models, with the following results:

          1. Pooled OLS vs. RE: the Breusch and Pagan Lagrangian multiplier test supports RE
          Code:
          Prob > chibar2 =   0.0000
          2. RE vs. FE: the Hausman test opts for RE
          Code:
          Prob>chi2 =      0.7298
          - Also, I have performed diagnostic checks, with the following results:
          1. Multicollinearity
          Code:
          Mean VIF |      2.92
          2. Serial correlation
          Code:
          Prob > F =      0.0341
          - However, the two heteroskedasticity tests, Breusch-Pagan / Cook-Weisberg and White's, give different results:
          Code:
          Breusch-Pagan / Cook-Weisberg test for heteroskedasticity 
                   Ho: Constant variance
                   Variables: fitted values of lsladi
          
                   chi2(1)      =     3.02
                   Prob > chi2  =   0.0823
          Code:
          White's test for Ho: homoskedasticity
                   against Ha: unrestricted heteroskedasticity
          
                   chi2(14)     =     47.54
                   Prob > chi2  =    0.0000
          - I'm not sure whether I can rely on the result from White's test, which indicates heteroskedasticity.

          - Given that the earlier tests suggest RE, and assuming that the diagnostic checks reveal serial correlation and heteroskedasticity, what would be the appropriate command to run the OLS regression?

          Regards,

          Hadysyam

          Comment


          • #20
            Hadysyam:
            after a bit of debate, I still can't see why you do not want to use -xtreg- and stick with pooled OLS instead (which does not seem to be indicated in your case).
            That said, you can accommodate heteroskedasticity and/or serial correlation by imposing -robust- (or, equivalently, -cluster-) standard errors:
            Code:
            xtset <panelid> <datawaveid>
            xtreg <depvar> <indepvars>, re vce(robust)
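            The equivalence of -robust- and -cluster- under -xtreg- can be checked with a short sketch. It assumes Stata's shipped example dataset -nlswork- and arbitrary regressors, neither of which comes from this thread; the point is only that the two commands should report the same standard errors.

            ```stata
            * Illustrative sketch on Stata's example panel data (not the poster's data)
            webuse nlswork, clear
            xtset idcode year

            * Robust standard errors under -xtreg- are clustered on the panel variable
            xtreg ln_wage age tenure, re vce(robust)

            * Explicit clustering on the panel identifier, for comparison
            xtreg ln_wage age tenure, re vce(cluster idcode)
            ```

            In both outputs the header above the coefficient table should state that the standard errors are adjusted for clusters in idcode.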
            Kind regards,
            Carlo
            (Stata 19.0)

            Comment


            • #21
              Dear Carlo,

              Now I'm much clearer about the right method for my analysis. Perhaps I was a bit influenced by previous disclosure studies, which seem to use OLS linear regression in their analysis.

              Regarding heteroskedasticity, for which the Breusch-Pagan / Cook-Weisberg test and White's test produce different results, do I have to run another check to confirm it, such as plotting the residuals versus fitted (predicted) values?

              Thank you so much for your guidance.

              Hadysyam

              Comment


              • #22
                Although I suppose these things vary by field, most people simply use "robust" standard errors always. I haven't seen a heteroskedasticity test in an economic journal in ages.

                Comment


                • #23
                  Hadysyam:
                  just go -robust- with -xtreg- and make your life simpler!
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #24
                    Dear Statalist members,

                    As my data have heteroskedasticity and serial correlation problems, I have tried -robust- with the -xtreg- command for the random-effects model (chosen on the basis of the Hausman test). The result is as follows:

                    Code:
                    . xtreg sladi ti lsz qp ai rg, re vce(robust)
                    
                    Random-effects GLS regression                   Number of obs     =        104
                    Group variable: code                            Number of groups  =         26
                    
                    R-sq:                                           Obs per group:
                         within  = 0.0019                                         min =          4
                         between = 0.9975                                         avg =        4.0
                         overall = 0.9969                                         max =          4
                    
                                                                    Wald chi2(4)      =          .
                    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
                    
                                                      (Std. Err. adjusted for 26 clusters in code)
                    ------------------------------------------------------------------------------
                                 |               Robust
                           sladi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                              ti |   .0096984    .000045   215.40   0.000     .0096102    .0097867
                             lsz |  -.0013554   .0008318    -1.63   0.103    -.0029856    .0002748
                              qp |   .0176582   .0107297     1.65   0.100    -.0033717    .0386881
                              ai |  -.0002205   .0007222    -0.31   0.760    -.0016361     .001195
                              rg |   .5499816   .0099783    55.12   0.000     .5304245    .5695388
                           _cons |   .3376315   .0115353    29.27   0.000     .3150228    .3602402
                    -------------+----------------------------------------------------------------
                         sigma_u |  .00820891
                         sigma_e |  .00454183
                             rho |  .76562666   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    So far I haven't found any accounting disclosure literature using a random-effects (GLS) model. Because of this, I have also tried the Stata -regress- command with robust standard errors, and the result is below:
                    Code:
                    . regress sladi ti lsz qp ai rg, robust
                    
                    Linear regression                               Number of obs     =        104
                                                                    F(5, 98)          >   99999.00
                                                                    Prob > F          =     0.0000
                                                                    R-squared         =     0.9969
                                                                    Root MSE          =     .00868
                    
                    ------------------------------------------------------------------------------
                                 |               Robust
                           sladi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                              ti |   .0096911   .0006034    16.06   0.000     .0084937    .0108885
                             lsz |  -.0012211   .0006602    -1.85   0.067    -.0025312     .000089
                              qp |   .0173349   .0056821     3.05   0.003     .0060589    .0286109
                              ai |  -.0012297   .0022064    -0.56   0.579    -.0056081    .0031488
                              rg |   .5496299   .0055649    98.77   0.000     .5385865    .5606733
                           _cons |   .3359687   .0092213    36.43   0.000     .3176694    .3542681
                    ------------------------------------------------------------------------------
                    - May I get some opinions on, and an interpretation of, the above results?

                    Regards,

                    Hadysyam

                    Comment


                    • #25
                      Hadysyam:
                      - as Jimmy noted, your regression model suffers from overfitting (too many predictors for quite a small sample size). You should also note the sky-high R2 even though not all the coefficients reach statistical significance: I would suspect a quasi-multicollinearity issue with your data.
                      - you should have used clustered SEs in the pooled OLS, as your observations are not independent.
                      Please note that, unlike -xtreg-, the -robust- option for -regress- accommodates heteroskedasticity only.
                      To wrap up, you're asking too much of your data: you need a more parsimonious regression model, whether you go with pooled OLS or -xtreg-.
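                      As a rough sketch of the two points above, the pooled OLS could be refitted with clustered standard errors and the collinearity checked afterwards. The variable names and the group variable -code- are taken from the output in #24; the code assumes the poster's dataset is already in memory, and it does not suggest which predictors to drop.

                      ```stata
                      * Pooled OLS with standard errors clustered on the panel identifier
                      * (variable names and -code- as shown in the output in #24)
                      regress sladi ti lsz qp ai rg, vce(cluster code)

                      * Variance inflation factors, computed after a plain OLS fit
                      regress sladi ti lsz qp ai rg
                      estat vif
                      ```

                      Clustering on -code- allows for both heteroskedasticity and within-panel correlation, which plain -robust- under -regress- does not.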
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      Comment
