  • Squared variable omitted in estimation of Prais-Winsten and Cochrane-Orcutt

    Hi! I've estimated a model with OLS, where I found autocorrelation with the Durbin-Watson and Breusch-Godfrey tests. I want to use Prais-Winsten or Cochrane-Orcutt to remove the problem. When estimating the model, Stata removes a variable, which is the square of another variable in the model.
    I use the following code:
    gen yearsquared=year*year
    For OLS:
    reg ftheft tfr partic degrees year yearsquared
    For Cochrane-Orcutt:
    prais ftheft tfr partic degrees year yearsquared, corc
    For Prais-Winsten:
    prais ftheft tfr partic degrees year yearsquared
    In both cases, Stata removes yearsquared.
    Any advice, as yearsquared shouldn't be removed due to collinearity?
    Thanks!


  • #2
    Center the variable by subtracting some value (mean, median, minimum) and then square it; be sure you use the centered version as well as its square. Actually, instead of making your own squared variable, you should use factor-variable notation; see
    Code:
    help fvvarlist
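
    As a sketch of that advice (a hypothetical illustration assuming the poster's variables are in memory; centering on the sample mean is just one choice):
    Code:
    * center year on its sample mean, then let factor-variable
    * notation build the square
    summarize year, meanonly
    gen yearc = year - r(mean)
    reg ftheft tfr partic degrees c.yearc##c.yearc

    The ## operator enters both yearc and its square, so no hand-made yearsquared variable is needed.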



    • #3
      Thanks for your answer. I tried factor-variable notation, which works for OLS, but the squared variable is removed again:

      reg ftheft tfr partic degrees year c.year#c.year

      The problem still exists: collinearity.



      • #4
        Check out this example and then compare your own results:

        Code:
        . clear
        
        . set obs 40
        number of observations (_N) was 0, now 40
        
        . gen year = 1980 + _n
        
        . gen yearsquared = year^2
        
        . corr
        (obs=40)
        
                     |     year yearsq~d
        -------------+------------------
                year |   1.0000
         yearsquared |   1.0000   1.0000
        
        
        . di %23.18f r(rho)
           0.999996676678443497
        The scatter plot will also be instructive.

        To have a better chance of catching a quadratic relationship with time, work with, say, (year - constant) and its square.


        Code:
        . gen yearM2000 = year - 2000
        
        . gen yearM2000sq  = (year - 2000)^2
        
        . corr yearM*
        (obs=40)
        
                     | yea~2000 yearM2~q
        -------------+------------------
           yearM2000 |   1.0000
         yearM2000sq |   0.0965   1.0000
        Except that, as Rich advises, you should use factor-variable notation.





        • #5
          James: That is puzzling and should not happen in "regular" cases. This is a wild guess, but if Stata is trying to estimate rho = 1 then the intercept, year, and year^2 become collinear. If rho is tending toward unity then you probably need to think about differencing the entire equation.

          Can you show your Stata output?
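
          The differencing suggestion can be sketched as follows (a hypothetical illustration, assuming the data are tsset on year; note that differencing a quadratic in year leaves a term that is linear in year):
          Code:
          * if rho is tending toward 1, estimate in first differences
          tsset year
          reg D.ftheft D.tfr D.partic D.degrees year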



          • #6
            .
            Last edited by James Morrison; 18 Dec 2020, 13:04.



            • #7
              Code:
              . prais ftheft tfr partic degrees year c.year#c.year
              note: c.year#c.year omitted because of collinearity
              
              Iteration 0:  rho = 0.0000
              Iteration 1:  rho = 0.5056
              Iteration 2:  rho = 0.6346
              Iteration 3:  rho = 0.6941
              Iteration 4:  rho = 0.7273
              Iteration 5:  rho = 0.7475
              Iteration 6:  rho = 0.7602
              Iteration 7:  rho = 0.7685
              Iteration 8:  rho = 0.7739
              Iteration 9:  rho = 0.7775
              Iteration 10:  rho = 0.7798
              Iteration 11:  rho = 0.7814
              Iteration 12:  rho = 0.7824
              Iteration 13:  rho = 0.7831
              Iteration 14:  rho = 0.7836
              Iteration 15:  rho = 0.7839
              Iteration 16:  rho = 0.7841
              Iteration 17:  rho = 0.7843
              Iteration 18:  rho = 0.7844
              Iteration 19:  rho = 0.7844
              Iteration 20:  rho = 0.7845
              Iteration 21:  rho = 0.7845
              Iteration 22:  rho = 0.7845
              Iteration 23:  rho = 0.7845
              Iteration 24:  rho = 0.7845
              Iteration 25:  rho = 0.7845
              Iteration 26:  rho = 0.7845
              Iteration 27:  rho = 0.7845
              Iteration 28:  rho = 0.7846
              Iteration 29:  rho = 0.7846
              Iteration 30:  rho = 0.7846
              
              Prais-Winsten AR(1) regression -- iterated estimates
              
                    Source |       SS           df       MS      Number of obs   =        34
              -------------+----------------------------------   F(4, 29)        =     24.69
                     Model |  867.486278         4   216.87157   Prob > F        =    0.0000
                  Residual |  254.706178        29  8.78297164   R-squared       =    0.7730
              -------------+----------------------------------   Adj R-squared   =    0.7417
                     Total |  1122.19246        33   34.005832   Root MSE        =    2.9636
              
              -------------------------------------------------------------------------------
                     ftheft |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                        tfr |  -.0197856   .0045694    -4.33   0.000    -.0291311   -.0104401
                     partic |    .026475   .0321516     0.82   0.417    -.0392823    .0922323
                    degrees |   .0158855   .1151943     0.14   0.891    -.2197132    .2514842
                       year |   1.353956   .3336136     4.06   0.000     .6716392    2.036272
                            |
              c.year#c.year |          0  (omitted)
                            |
                      _cons |  -2554.929   634.6958    -4.03   0.000    -3853.027    -1256.83
              --------------+----------------------------------------------------------------
                        rho |   .7845522
              -------------------------------------------------------------------------------
              Durbin-Watson statistic (original)    0.972504
              Durbin-Watson statistic (transformed) 1.704736



              • #8
                I still see nothing about "centering", which was suggested in both #2 and #4.

                I note also that I know nothing about the other variables being used, as you did not supply a -dataex- data example (please see the FAQ).



                • #9
                  Rich: Excuse me for not providing the data example; I'm new to the forum.

                  -----------------------
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input long(year tfr partic) double(degrees ftheft mtheft) float t
                  1935 2755 238 13.2 20.4 247.1  1
                  1936 2696 240 13.2 22.1 254.9  2
                  1937 2646 241 12.2 22.4 272.4  3
                  1938 2701 242 12.6 21.8 285.8  4
                  1939 2654 244 12.3 21.1 292.2  5
                  1940 2766 245   12 21.4   256  6
                  1941 2832 246 11.7 25.3 205.8  7
                  1942 2964 268 11.2 27.1   188  8
                  1943 3041 333 11.5   29 205.8  9
                  1944 3010 335 11.1 24.2 207.9 10
                  1945 3018 331 12.5 24.7 197.8 11
                  1946 3374 253   15 20.5 195.9 12
                  1947 3545 243 17.6 20.7 198.9 13
                  1948 3441 241 21.2 22.8 198.6 14
                  1949 3456 242 22.7 18.5 216.1 15
                  1950 3455 237 21.4 19.3   212 16
                  1951 3503 242 20.7   20   208 17
                  1952 3641 240 19.5 19.4 199.3 18
                  1953 3721 233 36.6 16.1 176.4 19
                  1954 3828 232 18.1 16.5 166.8 20
                  1955 3831 236 17.7 15.2 179.7 21
                  1956 3858 245 18.7 15.3 198.8 22
                  1957 3925 256 20.8 17.1 262.5 23
                  1958 3880 261 22.7 21.2 263.4 24
                  1959 3935 265 24.6 22.3 253.6 25
                  1960 3895 278 28.5 28.5 280.9 26
                  1961 3840 291 31.3 34.4 290.9 27
                  1962 3756 290   38   36 274.7 28
                  1963 3669 290   42 43.8 296.9 29
                  1964 3502 307 48.5 49.7 281.3 30
                  1965 3145 313 59.2 58.6 264.2 31
                  1966 2812 325 69.7 71.4 286.6 32
                  1967 2586 339 80.4 70.6   272 33
                  1968 2441 338 90.4   73 274.7 34
                  end
                  ------------------



                  • #10
                    So, this is what I did:
                    Code:
                    . gen yearc = year - 1935
                    . tsset yearc
                    . prais ftheft tfr partic degrees year c.yearc#c.yearc
                    
                    Iteration 0:  rho = 0.0000
                    Iteration 1:  rho = 0.4779
                    Iteration 2:  rho = 0.6310
                    Iteration 3:  rho = 0.6964
                    Iteration 4:  rho = 0.7259
                    Iteration 5:  rho = 0.7392
                    Iteration 6:  rho = 0.7452
                    Iteration 7:  rho = 0.7479
                    Iteration 8:  rho = 0.7491
                    Iteration 9:  rho = 0.7496
                    Iteration 10:  rho = 0.7498
                    Iteration 11:  rho = 0.7499
                    Iteration 12:  rho = 0.7500
                    Iteration 13:  rho = 0.7500
                    Iteration 14:  rho = 0.7500
                    Iteration 15:  rho = 0.7500
                    Iteration 16:  rho = 0.7500
                    Iteration 17:  rho = 0.7500
                    
                    Prais-Winsten AR(1) regression -- iterated estimates
                    
                          Source |       SS           df       MS      Number of obs   =        34
                    -------------+----------------------------------   F(5, 28)        =     26.80
                           Model |  1061.34611         5  212.269222   Prob > F        =    0.0000
                        Residual |  221.787888        28    7.920996   R-squared       =    0.8272
                    -------------+----------------------------------   Adj R-squared   =    0.7963
                           Total |    1283.134        33  38.8828484   Root MSE        =    2.8144
                    
                    ------------------------------------------------------------------------------
                          ftheft |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             tfr |  -.0154238     .00489    -3.15   0.004    -.0254405   -.0054072
                          partic |   .0338571   .0307243     1.10   0.280    -.0290787     .096793
                         degrees |  -.0392367    .114543    -0.34   0.734    -.2738673     .195394
                            year |  -.1476686   .7891959    -0.19   0.853    -1.764263    1.468926
                                 |
                         c.yearc#|
                         c.yearc |   .0471802   .0228427     2.07   0.048     .0003891    .0939713
                                 |
                           _cons |   342.5433   1518.686     0.23   0.823    -2768.344    3453.431
                    -------------+----------------------------------------------------------------
                             rho |   .7500145
                    ------------------------------------------------------------------------------
                    Durbin-Watson statistic (original)    1.036600
                    Durbin-Watson statistic (transformed) 1.581007
                    Nothing is omitted here.



                    • #11
                      With the centering approach, nothing is omitted, yes.

                      But am I allowed to change a variable like this? The variable "year" and the constant will then have different coefficients. I'm supposed to create a model with the variables year and yearsq.

                      Really appreciate the help, thank you.



                      • #12
                        I have no idea why you are "supposed" to do something. Yes, centering changes the constant; without centering, the constant is meaningless. (Assume year were your only predictor: without centering, the constant is the predicted mean when year = 0. Does anyone really care?)
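
                        A sketch of the point, using the thread's variables (the fitted-value names are hypothetical; the two parameterisations should agree up to numerical precision):
                        Code:
                        * same quadratic, two parameterisations of year
                        gen yearc = year - 1935
                        reg ftheft c.year##c.year
                        predict fit_raw
                        reg ftheft c.yearc##c.yearc
                        predict fit_centered
                        * only the intercept and linear term change;
                        * the fitted values are the same
                        summarize fit_raw fit_centered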



                        • #13
                          Cross-posted at https://stats.stackexchange.com/ques...ochrane-orcutt

                          Our policy on cross-posting is explicit: you are asked to tell us about it. https://www.statalist.org/forums/help#crossposting



                          • #14
                            To me it seems more puzzling why -regress- does not drop year^2.

                            Over this range of years the correlation of year and year^2 is 1.0000 to displayed precision.

                            Code:
                            . pwcorr year year2, sig obs
                            
                                         |     year    year2
                            -------------+------------------
                                    year |   1.0000 
                                         |
                                         |       34
                                         |
                                   year2 |   1.0000   1.0000 
                                         |   0.0000
                                         |       34       34



                            • #15
                              A quadratic is a quadratic; you are just parameterising it differently, with a side-effect on the intercept, which lacks inherent interest anyway, so far as I can imagine.

                              Otherwise there is a spectrum of possibilities, ranging from your teachers not being aware of this, or having forgotten that it would happen, through to them expecting this to be one of the things you would need to work out.
