  • Heteroskedasticity and Multicollinearity

    When I run -estat hettest- after the panel regressions (with re and fe), then hausman, then hausman, sigmamore, I get error r(321). I don't know why this error occurs; I searched a lot online but did not find a solution.

    I also want to test for multicollinearity but don't know the command.

    Attached is the message I get from Stata when I run hettest.

    Thank you.
  • #2
    Mustapha:
    -estat hettest- is not allowed after -xt- suite commands.
    Go visual for a heteroskedasticity test.
    For the future, please report between CODE delimiters the error message thrown by Stata; do not attach screenshots. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Thank you Mr. Carlo,

      But I don't actually understand what going visual for a heteroskedasticity test means. Sorry, I am new to Stata.

      Regards.


      • #4
        Mustapha:
        my bad, probably.
        I meant plotting the epsilon residuals against the fitted values in a scatterplot.
        If you detect heteroskedasticity and/or autocorrelation, just invoke the -robust- or -cluster- option and go on with your panel data regression.
        Obviously, Carlo is enough!
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Thanks Carlo,

          I still have a problem: rvfplot is not working for me either. I get the following error message:

          error . . . . . . . . . . . . . . . . . . . . . . . . Return code 301
          last estimates not found;
          You typed an estimation command, such as regress, without
          arguments or attempted to perform a test or typed predict,
          but there were no previous estimation results.


          Is it a problem with my Stata installation, or do these commands not work with xtreg?

          Thanks.


          • #6
            Mustapha:
            -rvfplot- does not work after -xtreg-.
            You have to obtain both the fitted values and the epsilon residuals via -predict- and then create a -scatter- plot of these two variables to investigate potential heteroskedasticity.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Thanks Carlo.

              predict is not working either: I get a message that the variable is already defined when I choose a variable name.

              But I did produce the scatter plot with fitted values. I would be grateful if you could take a look at it and tell me whether it is enough in my case.

              Regards.


              • #8
                Mustapha,
                To clarify, you must create a new variable when using predict (syntax: predict newvariable, statistic, where statistic is the quantity you want to predict, e.g. xb or e), and the name of the new variable must not already exist in your dataset. If you use predict more than once, you must either choose a different variable name every time or first drop the variable previously created through predict.
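
                A minimal sketch of that, assuming the variable names fitted and epsilon that appear in this thread and the nlswork example dataset; -capture drop- removes an earlier copy if one exists and stays silent otherwise:
                Code:
                use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                xtreg ln_wage age tenure, fe
                * remove variables left over from an earlier -predict-, if any
                capture drop fitted
                capture drop epsilon
                * the names are free again, so -predict- can recreate them
                predict fitted, xb
                predict epsilon, e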


                • #9
                  Mustapha:
                  I do hope that the following toy example can be helpful:
                  Code:
                  . use "https://www.stata-press.com/data/r16/nlswork.dta"
                  (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                  
                  . xtreg ln_wage age tenure, fe
                  
                  Fixed-effects (within) regression               Number of obs     =     28,101
                  Group variable: idcode                          Number of groups  =      4,699
                  
                  R-sq:                                           Obs per group:
                       within  = 0.1296                                         min =          1
                       between = 0.1916                                         avg =        6.0
                       overall = 0.1456                                         max =         15
                  
                                                                  F(2,23400)        =    1742.76
                  corr(u_i, Xb)  = 0.1302                         Prob > F          =     0.0000
                  
                  ------------------------------------------------------------------------------
                       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                           age |   .0121949   .0004131    29.52   0.000     .0113852    .0130045
                        tenure |   .0211313   .0008015    26.37   0.000     .0195604    .0227023
                         _cons |   1.256467   .0109792   114.44   0.000     1.234947    1.277987
                  -------------+----------------------------------------------------------------
                       sigma_u |  .39034493
                       sigma_e |  .29808194
                           rho |  .63165531   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  F test that all u_i=0: F(4698, 23400) = 8.02                 Prob > F = 0.0000
                  
                  . predict fitted, xb
                  (433 missing values generated)
                  
                  . predict epsilon, e
                  (433 missing values generated)
                  
                  . twoway (scatter epsilon fitted)
                  
                  . xtreg ln_wage age tenure, fe vce(cluster idcode)
                  
                  Fixed-effects (within) regression               Number of obs     =     28,101
                  Group variable: idcode                          Number of groups  =      4,699
                  
                  R-sq:                                           Obs per group:
                       within  = 0.1296                                         min =          1
                       between = 0.1916                                         avg =        6.0
                       overall = 0.1456                                         max =         15
                  
                                                                  F(2,4698)         =     766.79
                  corr(u_i, Xb)  = 0.1302                         Prob > F          =     0.0000
                  
                                               (Std. Err. adjusted for 4,699 clusters in idcode)
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                           age |   .0121949   .0007414    16.45   0.000     .0107414    .0136483
                        tenure |   .0211313   .0012112    17.45   0.000     .0187568    .0235059
                         _cons |   1.256467   .0194187    64.70   0.000     1.218397    1.294537
                  -------------+----------------------------------------------------------------
                       sigma_u |  .39034493
                       sigma_e |  .29808194
                           rho |  .63165531   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  
                  .
                  The scatterplot shows heteroskedasticity, and the subsequent adoption of cluster-robust standard errors supports that evidence, since they are larger than the default ones.
                  Kind regards,
                  Carlo
                  (Stata 19.0)


                  • #10
                    Hello Carlo Lazzaro,

                    It seems that the analysis has heteroskedasticity as follows:


                    [Attached image: Graph.png]

                    Concerning multicollinearity, this is the correlation matrix of coefficients of the xtreg model:

                    Code:
                                 |                                                                             o. 1.OilE~g#
                            e(V) | Manufa~a  Foreig~i  Grosss~S  Inflat~l  Highte~a  Popula~l  School~y  OilExp~g  c.Manu~a     _cons
                    -------------+----------------------------------------------------------------------------------------------------
                    Manufactur~a |   1.0000                                                                                          
                    Foreigndir~i |  -0.1706    1.0000                                                                                
                    Grosssavin~S |  -0.1738    0.4278    1.0000                                                                      
                    InflationG~l |   0.1167   -0.1280   -0.2759    1.0000                                                            
                    Hightechno~a |   0.0703   -0.0183   -0.0214    0.3656    1.0000                                                  
                    Population~l |  -0.0762    0.0472   -0.0581    0.0007   -0.0359    1.0000                                        
                    Schoolenro~y |  -0.1090   -0.0086   -0.3732   -0.0842   -0.1214    0.0130    1.0000                              
                    o.OilExpor~g |        .         .         .         .         .         .         .         .                    
                    1.OilExpor~g#|                                                                                                    
                    c.Manufact~a |  -0.5660    0.1188    0.0931   -0.4755   -0.2640   -0.0600    0.0942         .    1.0000          
                           _cons |   0.1242   -0.1953    0.1263    0.0018   -0.1958   -0.2031   -0.8862         .    0.0062    1.0000
                    Does this mean that I have multicollinearity?

                    And if I have both heteroskedasticity and multicollinearity what should I do in this case?

                    Thank you for helping out.

                    Regards.
                    Last edited by Mustapha Younis; 12 May 2020, 10:21.


                    • #11
                      Mustapha:
                      as heteroskedasticity seems apparent, just go -xtreg- with -robust- or -cluster()- standard errors; this way you will also take autocorrelation (if any) into account.
                      In addition (as I missed the last part of your query, I'm editing the current reply): multicollinearity is, in general, not an issue, as is often pointed out on this forum. That said, a sign of nasty multicollinearity is "weird" standard errors.
                      Last edited by Carlo Lazzaro; 12 May 2020, 11:03.
                      Kind regards,
                      Carlo
                      (Stata 19.0)
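
                      For readers looking for a command: one common rough check, which nobody in this thread proposed, is the variance inflation factor. -estat vif- is a postestimation command for -regress-, so the sketch below refits the toy model by pooled OLS purely as a diagnostic and then shows the coefficient correlation matrix after the panel model. It assumes the nlswork example dataset; any cutoff applied to the VIFs is a rule of thumb, not a formal test.
                      Code:
                      use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                      * pooled OLS, used here only to obtain variance inflation factors
                      regress ln_wage age tenure
                      estat vif
                      * correlation matrix of the coefficient estimates after the panel model
                      xtreg ln_wage age tenure, fe
                      estat vce, corr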


                      • #12
                        Thank you Carlo,

                        I just don't know the exact command I should use, or at which step. Is it the final step?

                        Regards.


                        • #13
                          Mustapha:
                          elaborating on my previous reply:
                          Code:
                          . use "https://www.stata-press.com/data/r16/nlswork.dta"
                          (National Longitudinal Survey. Young Women 14-26 years of age in 1968)
                          
                          . xtreg ln_wage age tenure, fe vce(cluster idcode)
                          Last edited by Carlo Lazzaro; 13 May 2020, 02:41.
                          Kind regards,
                          Carlo
                          (Stata 19.0)


                          • #14
                            Dear Carlo,

                            I see that the command contains fe. I just want to lay out my model here; maybe you can help, because this is my first model ever.

                            Code:
                            xtreg GDPgrowthannual Manufacturingvalueaddedannua Foreigndirectinvestmentneti GrosssavingscurrentUS InflationGDPdeflatorannual Hightechnologyexportsofma Populationgrowthannual Schoolenrollmentsecondary OilExporting c.Manufacturingvalueaddedannua#OilExporting, re
                            estimates store random
                            xtreg GDPgrowthannual Manufacturingvalueaddedannua Foreigndirectinvestmentneti GrosssavingscurrentUS InflationGDPdeflatorannual Hightechnologyexportsofma Populationgrowthannual Schoolenrollmentsecondary OilExporting c.Manufacturingvalueaddedannua#OilExporting, fe
                            estimates store fixed
                            hausman fixed random
                            hausman fixed random, sigmamore
                            predict fitted, xb
                            predict epsilon, e
                            twoway (scatter epsilon fitted)
                            estat vce, corr
                            I ran a regression with random effects, then a regression with fixed effects, then the Hausman test to choose between re and fe, and the result is in favor of re.

                            In the previously mentioned command xtreg ln_wage age tenure, fe vce(cluster idcode) , should I put re instead of fe?


                            • #15
                              Mustapha,
                              yes, you can replace the -fe- with the -re- specification in your -xtreg- code.
                              That said, if you want to keep non-default standard errors (e.g., due to heteroskedasticity of the epsilon distribution), -hausman- won't do and you have to switch to the community-contributed command -xtoverid-, which, being a bit old-fashioned, does not support -fvvarlist- notation. The usual fix then is to prefix your -xtreg- code with -xi:-.
                              Finally, -xtoverid- needs only the -re- specification to point you towards -re- (the null hypothesis) or -fe-:
                              Code:
                              xi: xtreg GDPgrowthannual Manufacturingvalueaddedannua Foreigndirectinvestmentneti GrosssavingscurrentUS InflationGDPdeflatorannual Hightechnologyexportsofma Populationgrowthannual Schoolenrollmentsecondary OilExporting c.Manufacturingvalueaddedannua#OilExporting, re vce(cluster panelid)
                              xtoverid
                              Kind regards,
                              Carlo
                              (Stata 19.0)
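
                              A sketch of the same two steps on the toy dataset used earlier in this thread, assuming -xtoverid- is not yet installed and is still distributed through SSC. The -ivreg2- and -ranktest- installs are a precaution based on a recollection that -xtoverid- relies on them; check -help xtoverid- to confirm. No factor-variable notation is involved here, so the -xi:- prefix is not needed.
                              Code:
                              * community-contributed commands from SSC (-replace- avoids an error if already installed)
                              ssc install xtoverid, replace
                              ssc install ivreg2, replace
                              ssc install ranktest, replace
                              use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                              xtreg ln_wage age tenure, re vce(cluster idcode)
                              * robust test of fe vs re; the null hypothesis favours the -re- specification
                              xtoverid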
