  • Pooled OLS - Durbin-Watson?

    Hello,
    I am analysing a small dataset with T=16 and N=6, and I am trying to work out which test is appropriate for checking autocorrelation after pooled OLS. Durbin-Watson seems to be one of the tests for autocorrelation, but I am not sure it applies to pooled OLS. I saw some posts related to this topic, but it is still not clear enough to me. Any comment will be welcome. Thanks in advance.
    Last edited by Isabel Cour; 26 Apr 2018, 18:19.

  • #2
    Isabel:
    welcome to this forum.
    If you have a T>N panel dataset, why not consider -xtgls-?
    Kind regards,
    Carlo
    (Stata 19.0)
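
    A minimal sketch of the -xtgls- suggestion above, on simulated data with the same dimensions as in the question (N=6, T=16). The variable names id, year, y and x and the seed are assumptions for illustration only, not Isabel's variables; the options shown are one possible choice, not a recommendation for her model.

    ```stata
    * Simulated panel: 6 units observed over 16 periods (illustrative only)
    clear
    set seed 12345
    set obs 6
    generate id = _n
    expand 16
    bysort id: generate year = _n
    generate x = rnormal()
    generate y = 1 + 0.5*x + rnormal()

    * Declare the panel structure, then fit FGLS allowing
    * panel-specific error variances and a common AR(1) process
    xtset id year
    xtgls y x, panels(heteroskedastic) corr(ar1)
    ```

    Here panels(heteroskedastic) allows a different error variance per panel, and corr(ar1) assumes one AR(1) coefficient shared by all panels; corr(psar1) would allow panel-specific coefficients instead.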



    • #3
      Carlo,

      Thank you very much for the answer. I use a gravity framework to examine the effect of technology on trade flows. First, a pooled OLS model was estimated; it includes year and country fixed effects. Then I checked robustness via GMM. If I use pooled OLS I must check for autocorrelation; how can I address this issue in this context?



      • #4
        Isabel:
        you may want to consider something along the following lines:
        Code:
        . set obs 5
        number of observations (_N) was 0, now 5
        
        . g id=_n
        
        . expand 3
        (10 observations created)
        
        . bysort id: g time=_n
        
        . g y=runiform()*100
        
        . g x=runiform()*.5
        
        . reg y x, vce(cluster id)
        
        Linear regression                               Number of obs     =         15
                                                        F(1, 4)           =       0.82
                                                        Prob > F          =     0.4160
                                                        R-squared         =     0.0461
                                                        Root MSE          =     30.244
        
                                             (Std. Err. adjusted for 5 clusters in id)
        ------------------------------------------------------------------------------
                     |               Robust
                   y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                   x |  -43.13193      47.59    -0.91   0.416     -175.263    88.99911
               _cons |   75.74245   15.03339     5.04   0.007     34.00306    117.4818
        ------------------------------------------------------------------------------
        
        . predict uhat, residuals
        
        . forvalues j = 1/2 {
          2. quietly corr uhat L`j'.uhat
          3. di "Autocorrelation at lag `j' = " %6.3f r(rho)
          4.  }
        Autocorrelation at lag 1 = -0.058
        Autocorrelation at lag 2 = -0.731
        
        .
        
        *All credit for the code reported above goes to https://www.stata.com/bookstore/microeconometrics-stata/, page 252*
        Kind regards,
        Carlo
        (Stata 19.0)
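
        A hedged sketch of the same residual-autocorrelation check on a panel shaped like the one in the question (N=6, T=16). The data are simulated and the names id, year, y and x are assumptions. The point it illustrates is that the lag operator L. only works once the panel structure has been declared with -xtset-, so that lags are taken within each unit. As far as I know, -estat dwatson- is documented for a single time series, so it is not used here.

        ```stata
        * Simulated panel: 6 units, 16 periods (illustrative only)
        clear
        set seed 2018
        set obs 6
        generate id = _n
        expand 16
        bysort id: generate year = _n
        generate x = runiform()
        generate y = 1 + 2*x + rnormal()

        * Time-series operators such as L. require the data to be xtset
        xtset id year

        * Pooled OLS with cluster-robust standard errors, then residuals
        regress y x, vce(cluster id)
        predict uhat, residuals

        * Correlation between the residual and its own lags, within panels
        forvalues j = 1/3 {
            quietly correlate uhat L`j'.uhat
            display "Autocorrelation at lag `j' = " %6.3f r(rho)
        }
        ```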



        • #5
          Dear Carlo,

          Thank you for your answer. I will try that command. Regards,



          • #6
            Dear Carlo,

            Running the pooled OLS again, I found that it drops gdp_exporter and also distance. That happens because I add year fixed effects. I am using a gravity framework; I have one exporter and 6 importers, and the variables gdp_m, gdp_x and distance. I am estimating the effect of tech on trade flows. The results show a high R-squared; only the explanatory variable of interest is significant, and none of the control variables are. However, if I remove the year fixed effects, gdp_m becomes significant and R-squared remains almost the same.
            Any comment will be very welcome. Thank you in advance, Carlo.

            Kind regards,


            The equation is,
            Code:
            regress lnimpo lnintus lngdp_m lngdp_x lndist lnidist exporter_* importer_* year_*, cluster(id)
            Code:
            note: lngdp_exp omitted because of collinearity
            note: lndist omitted because of collinearity
            note: exporter_6 omitted because of collinearity
            note: importer_1 omitted because of collinearity
            note: year_1 omitted because of collinearity
            note: year_16 omitted because of collinearity
            
            Linear regression                               Number of obs     =         96
                                                            F(4, 5)           =          .
                                                            Prob > F          =          .
                                                            R-squared         =     0.9453
                                                            Root MSE          =     .48059
            
                                                 (Std. Err. adjusted for 6 clusters in id)
            ------------------------------------------------------------------------------
                         |               Robust
                  lnimpo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                 lnintus |   .6558501   .1554177     4.22   0.008     .2563361    1.055364
                 lngdp_m |    .725604   .5322272     1.36   0.231    -.6425297    2.093738
                 lngdp_e |          0  (omitted)
                  lndist |          0  (omitted)
                 lnidist |  -.6969423   1.001261    -0.70   0.517    -3.270767    1.876882
              exporter_1 |  -.0359593   1.108767    -0.03   0.975    -2.886135    2.814216
              exporter_2 |   .3925353   1.775355     0.22   0.834    -4.171161    4.956232
              exporter_3 |   1.985593   1.325898     1.50   0.195    -1.422737    5.393923
              exporter_4 |  -1.726357   .8812586    -1.96   0.107    -3.991705    .5389901
              exporter_5 |   .5690413   .6103641     0.93   0.394    -.9999496    2.138032
              exporter_6 |          0  (omitted)
              importer_1 |          0  (omitted)
                  year_1 |          0  (omitted)
                  year_2 |  -.1791182   .1716109    -1.04   0.344     -.620258    .2620216
                  year_3 |  -.1937245   .2774412    -0.70   0.516    -.9069098    .5194608
                  year_4 |  -.1649017   .2791474    -0.59   0.580    -.8824728    .5526694
                  year_5 |   .1806592    .098292     1.84   0.125    -.0720084    .4333268
                  year_6 |   .1437831   .1277797     1.13   0.312    -.1846852    .4722513
                  year_7 |   .0889293   .1295385     0.69   0.523    -.2440601    .4219186
                  year_8 |   .2812087   .1120507     2.51   0.054    -.0068267    .5692442
                  year_9 |   .1691415   .1544995     1.09   0.324    -.2280121    .5662951
                 year_10 |  -.0819679   .1508932    -0.54   0.610    -.4698511    .3059154
                 year_11 |   .1149215   .1189615     0.97   0.378    -.1908788    .4207219
                 year_12 |   .0961005   .1533784     0.63   0.558    -.2981711    .4903721
                 year_13 |   .0659792   .1188738     0.56   0.603    -.2395956    .3715539
                 year_14 |   .0015395   .1104748     0.01   0.989     -.282445     .285524
                 year_15 |   .1724051   .1519819     1.13   0.308    -.2182769     .563087
                 year_16 |          0  (omitted)
                   _cons |   1.272487   13.05932     0.10   0.926    -32.29758    34.84255
            ------------------------------------------------------------------------------



            • #7
              Isabel:
              a high R-sq with only a handful of statistically significant coefficients should lead you to suspect quasi-extreme multicollinearity in your data.
              I would investigate that issue via -estat vif- after -regress-.
              Two asides:
              - statistical significance in itself is not the goal a given regression model should be targeted to; give a fair and true view of the data generating process, instead;
              - as far as categorical variables and interactions are concerned, I would exploit the wonderful capabilities of -fvvarlist-.
              Kind regards,
              Carlo
              (Stata 19.0)
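
              A minimal sketch of the -estat vif- check mentioned above, using the auto dataset shipped with Stata rather than the gravity data, so the variables are stand-ins and the model is purely illustrative.

              ```stata
              * Toy example on a built-in dataset (not the gravity data)
              sysuse auto, clear
              regress price mpg weight length

              * Variance inflation factors for the regressors just fitted
              estat vif
              ```

              Large VIFs for a regressor mean most of its variation is explained by the other regressors, which inflates its standard error.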



              • #8
                Dear Carlo,

                I checked it and yes, it is as you said, and Stata removes those variables. However, the model specification is the one followed for this research question; perhaps it happens because I am studying few countries. I am not using categorical variables, but I do use an interaction term of tecno. Thank you very much Carlo!



                • #9
                  Isabel:
                  if you're using interactions among your predictors, -fvvarlist- is the notation to follow (see -help fvvarlist-).
                  Kind regards,
                  Carlo
                  (Stata 19.0)
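
                  A short sketch of factor-variable notation for an interaction between two continuous predictors, again on the built-in auto dataset. The variables mpg, weight and foreign are stand-ins and not the gravity variables; the -margins- call is an optional extra to show one benefit of the notation.

                  ```stata
                  * Toy example on a built-in dataset (not the gravity data)
                  sysuse auto, clear

                  * c. marks continuous variables; ## adds both main effects
                  * and their interaction; i. creates indicators on the fly
                  regress price c.mpg##c.weight i.foreign

                  * Because Stata knows about the interaction, marginal effects
                  * of mpg can be evaluated at chosen values of weight
                  margins, dydx(mpg) at(weight=(2000 3000 4000))
                  ```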



                  • #10
                    thank you very much Carlo!



                    • #11
                      Dear Carlo,

                      I am running a new equation via pooled OLS as a benchmark model, then PPML and -ivpoisson-. The framework is a gravity model. The OLS results are as follows:
                      Code:
                      . reg lnamchsac lnintus lngdp lngdpch lndist lnidist lntariff exporter_* importer_* year_*, cluster(dist)
                      note: lngdpch omitted because of collinearity
                      note: lndist omitted because of collinearity
                      note: exporter_2 omitted because of collinearity
                      note: exporter_7 omitted because of collinearity
                      note: importer_1 omitted because of collinearity
                      note: year_1 omitted because of collinearity
                      note: year_2 omitted because of collinearity
                      note: year_4 omitted because of collinearity
                      note: year_5 omitted because of collinearity
                      note: year_6 omitted because of collinearity
                      note: year_21 omitted because of collinearity
                      
                      Linear regression                               Number of obs     =         86
                                                                      F(4, 5)           =          .
                                                                      Prob > F          =          .
                                                                      R-squared         =     0.8661
                                                                      Root MSE          =     1.4843
                      
                                                         (Std. Err. adjusted for 6 clusters in dist)
                      ------------------------------------------------------------------------------
                                   |               Robust
                         lnamchsac |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                         lnintus   |   1.472002   .4711239     3.12   0.026     .2609392    2.683064
                             lngdp |  -.1357364   1.708809    -0.08   0.940    -4.528369    4.256896
                           lngdpch |          0  (omitted)
                            lndist |          0  (omitted)
                           lnidist |  -.2363665    1.53543    -0.15   0.884    -4.183315    3.710582
                          lntariff |    .452788   .4531946     1.00   0.364    -.7121858    1.617762
                        exporter_1 |   -3.34698   2.296368    -1.46   0.205    -9.249982    2.556022
                        exporter_2 |          0  (omitted)
                        exporter_3 |  -5.926601   3.785382    -1.57   0.178    -15.65723    3.804032
                        exporter_4 |   -6.61961    3.46639    -1.91   0.114    -15.53025    2.291028
                        exporter_5 |  -9.048581   5.533931    -1.64   0.163      -23.274    5.176842
                        exporter_6 |  -8.964859    7.62402    -1.18   0.293    -28.56303    10.63331
                        exporter_7 |          0  (omitted)
                        importer_1 |          0  (omitted)
                            year_1 |          0  (omitted)
                            year_2 |          0  (omitted)
                            year_3 |   1.404595   1.105926     1.27   0.260    -1.438277    4.247467
                            year_4 |          0  (omitted)
                            year_5 |          0  (omitted)
                            year_6 |          0  (omitted)
                            year_7 |    1.25089    2.09155     0.60   0.576    -4.125611    6.627392
                            year_8 |  -1.600892   2.077176    -0.77   0.476    -6.940442    3.738659
                            year_9 |  -.2773236   2.954371    -0.09   0.929    -7.871777    7.317129
                           year_10 |  -.9183395    1.78202    -0.52   0.628    -5.499167    3.662488
                           year_11 |   .3214608    1.24989     0.26   0.807    -2.891484    3.534406
                           year_12 |  -.5872897   1.037801    -0.57   0.596    -3.255043    2.080463
                           year_13 |   .4736747   .9093405     0.52   0.625    -1.863859    2.811209
                           year_14 |  -.0355714   1.587321    -0.02   0.983    -4.115909    4.044767
                           year_15 |  -1.041163   .8873181    -1.17   0.293    -3.322087     1.23976
                           year_16 |  -.3526849   .9942915    -0.35   0.737    -2.908593    2.203223
                           year_17 |   .1482775   .7291278     0.20   0.847    -1.726005     2.02256
                           year_18 |  -.4334973    .625288    -0.69   0.519    -2.040851    1.173857
                           year_19 |   .6455891   1.510222     0.43   0.687    -3.236561    4.527739
                           year_20 |    .423954   .6250359     0.68   0.528    -1.182752     2.03066
                           year_21 |          0  (omitted)
                             _cons |   17.66975   47.75706     0.37   0.727    -105.0937    140.4332
                      ------------------------------------------------------------------------------
                      I checked the specification of the model and, according to the test, it does not have omitted variables. However, when I run -collin- I find a high correlation. Because of that, the OLS results cannot be relied on by themselves, only compared with the other estimators in terms of efficiency. If I want to correct the high correlation, I need to remove the tech variable, which is correlated with growth. However, I use PPML and then -ivpoisson- to correct for endogeneity. My question is: should I correct the harmful correlation in the benchmark model? Any comment is welcome! Thank you in advance. Regards

                      Code:
                      collin lnintuser lngdp lngdpch lndist lnidist lntariff
                      (obs=97)
                      
                        Collinearity Diagnostics
                      
                                              SQRT                   R-
                        Variable      VIF     VIF    Tolerance    Squared
                      ----------------------------------------------------
                       lnintus       16.11    4.01    0.0621      0.9379
                           lngdp      1.14    1.07    0.8768      0.1232
                         lngdpch     15.11    3.89    0.0662      0.9338
                          lndist      1.18    1.09    0.8467      0.1533
                         lnidist      1.10    1.05    0.9056      0.0944
                        lntariff      1.86    1.36    0.5371      0.4629
                      ----------------------------------------------------
                        Mean VIF      6.08
                      
                                                 Cond
                              Eigenval          Index
                      ---------------------------------
                          1     6.1458          1.0000
                          2     0.5767          3.2646
                          3     0.2048          5.4781
                          4     0.0704          9.3449
                          5     0.0022         52.3621
                          6     0.0001        307.2682
                          7     0.0000        541.4443
                      ---------------------------------
                       Condition Number       541.4443 
                       Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
                       Det(correlation matrix)    0.0306



                      • #12
                        Isabel:
                        at face value, the VIF problem refers to -lnintus- only, as Stata omits -lngdpch-.
                        I would be curious to see what happens if you remove -lnintus-.
                        Some other points concerning your regression model:
                        - you have a scant number of observations and your coefficients reflect that feature;
                        - I would recommend that you use -fvvarlist- notation for creating both categorical variables and interactions.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          Dear Carlo,

                          Thank you very much for your answer. I am trying to use -fvvarlist-, but it gave this message:

                          Code:
                          regress lnamchsac lnintus lngdp lngdpch lndist lnidist i.exporter i.importer i.year
                          
                          exporter:  string variables may not be used as factor variables r(109);
                          The number of observations is scant because the sample is small and covers developing countries. I cannot remove lnintus because it is the explanatory variable of interest; I control via the other variables. Regards,
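
                          The r(109) message above arises because factor-variable notation needs numeric variables. A minimal sketch of one way around it with -encode-, on a tiny made-up dataset: the country codes, the values of y and the new variable name exporter_id are all hypothetical.

                          ```stata
                          * Tiny made-up dataset with a string country identifier
                          clear
                          input str2 exporter int year double y
                          "AA" 2001 1.0
                          "AA" 2002 1.4
                          "AA" 2003 1.1
                          "BB" 2001 2.3
                          "BB" 2002 2.9
                          "BB" 2003 2.2
                          "CC" 2001 0.7
                          "CC" 2002 1.6
                          "CC" 2003 0.9
                          end

                          * encode maps each distinct string to an integer code and
                          * keeps the original strings as value labels
                          encode exporter, generate(exporter_id)

                          * The numeric version can now be used with the i. prefix
                          regress y i.exporter_id i.year
                          ```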



                          • #14
                            Dear Carlo,

                            I encoded exporter and importer, changing them to numeric variables. Then when I run the equation again, it omits one exporter, one importer and two years, which is right, but when I check -vif-, it reports very high collinearity. Why are F( ) and Prob > F omitted here? I am working with one importer country and 6 exporter countries, estimating the effect of tech on imports.

                            Code:
                            regress lnamchsac lnintus lngdp lngdpch lndist i.exporter2 i.importer2 i.year, cluster(dist)
                            note: 6.exporter2 omitted because of collinearity
                            note: 1.importer2 omitted because of collinearity
                            note: 2014.year omitted because of collinearity
                            note: 2015.year omitted because of collinearity
                            
                            Linear regression                               Number of obs     =        100
                                                                            F(4, 5)           =          .
                                                                            Prob > F          =          .
                                                                            R-squared         =     0.8721
                                                                            Root MSE          =     1.4311
                            
                                                               (Std. Err. adjusted for 6 clusters in dist)
                            ------------------------------------------------------------------------------
                                         |               Robust
                               lnamchsac |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                               lnintus  |   10.33298   3.910021     2.64   0.046     .2819487    20.38401
                                   lngdp |   .0284274   1.245275     0.02   0.983    -3.172655     3.22951
                                 lngdpch |  -12.51951   5.407564    -2.32   0.068    -26.42009    1.381076
                                  lndist |   175.8917   136.6446     1.29   0.254    -175.3644    527.1478
                                         |
                               exporter2 |
                                     BA  |   20.87398   15.70306     1.33   0.241    -19.49201    61.23998
                                     CL  |  -.9865379   .5896953    -1.67   0.155    -2.502398    .5293221
                                     CL  |   36.31381   30.45178     1.19   0.287    -41.96497    114.5926
                                     EU  |   26.85489   23.44419     1.15   0.304    -33.41031     87.1201
                                     PR  |          0  (omitted)
                                         |
                               importer2 |
                                    C    |          0  (omitted)
                                         |
                                    year |
                                   1996  |   4.553916   1.332133     3.42   0.019     1.129558    7.978274
                                   1997  |    6.71773   2.371106     2.83   0.037     .6226067    12.81285
                                   1998  |   3.874927   2.034063     1.91   0.115    -1.353798    9.103653
                                   1999  |    .347602   1.372735     0.25   0.810    -3.181125    3.876329
                                   2000  |  -.3890414   1.409714    -0.28   0.794    -4.012827    3.234744
                                   2001  |  -.9317118   2.013305    -0.46   0.663    -6.107076    4.243653
                                   2002  |  -6.600452   3.533016    -1.87   0.121    -15.68236    2.481456
                                   2003  |  -5.756303   4.268288    -1.35   0.235    -16.72829     5.21568
                                   2004  |  -5.531035   3.078244    -1.80   0.132    -13.44391    2.381843
                                   2005  |  -3.803244   2.140584    -1.78   0.136     -9.30579    1.699302
                                   2006  |   -4.33426   1.976139    -2.19   0.080    -9.414086    .7455656
                                   2007  |  -3.684837   1.924448    -1.91   0.114    -8.631788    1.262114
                                   2008  |  -3.934879   2.043277    -1.93   0.112     -9.18729    1.317532
                                   2009  |  -5.696587   2.047856    -2.78   0.039    -10.96077   -.4324058
                                   2010  |   -4.40091    2.16478    -2.03   0.098    -9.965653    1.163834
                                   2011  |  -2.456916   1.169522    -2.10   0.090    -5.463268     .549435
                                   2012  |  -1.947741   .9714993    -2.00   0.101     -4.44506    .5495772
                                   2013  |  -.3844076   1.064903    -0.36   0.733    -3.121827    2.353012
                                   2014  |          0  (omitted)
                                   2015  |          0  (omitted)
                                         |
                                   _cons |  -1382.636   1225.708    -1.13   0.311    -4533.418    1768.145
                            ------------------------------------------------------------------------------



                            • #15
                              Isabel:
                              about the missing F and P, see -help j_robustsingular-.
                              Stata tells you exactly the reason for omitting some predictors: collinearity with other predictors.
                              High VIF can be read as a sign of model misspecification.
                              Finally, I would also check whether it makes sense to keep -i.year- in your regression model via -testparm i.year-.
                              Kind regards,
                              Carlo
                              (Stata 19.0)
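
                              A minimal sketch of the -testparm- check on a set of indicators, using the built-in auto dataset with rep78 standing in for year (an assumption made only so the example runs anywhere). It uses the default VCE: with only 6 clusters, a cluster-robust VCE may not have enough rank for a joint test of all the year indicators, which appears to be the situation -help j_robustsingular- describes.

                              ```stata
                              * Toy example on a built-in dataset (not the gravity data)
                              sysuse auto, clear
                              regress price mpg weight i.rep78

                              * Joint test that all rep78 indicators are zero
                              testparm i.rep78
                              ```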
