  • Squared variable omitted in estimation of Prais-Winsten and Cochrane-Orcutt

    Hi! I've estimated a model with OLS, where I found autocorrelation with the Durbin-Watson and Breusch-Godfrey tests. I want to use Prais-Winsten or Cochrane-Orcutt to remove the problem. When estimating the model, Stata removes a variable, which is the square of another variable in the model.
    I use the following code:
    gen yearsquared=year*year
    For OLS:
    reg ftheft tfr partic degrees year yearsquared
    For Cochrane-Orcutt:
    prais ftheft tfr partic degrees year yearsquared, corc
    For Prais-Winsten:
    prais ftheft tfr partic degrees year yearsquared
    In both cases, Stata removes yearsquared.
    Any advice, as yearsquared shouldn't be removed due to collinearity?
    Thanks!


  • #2
    Center the variable by subtracting some value (mean, median, minimum) and then square it; be sure you use the centered version as well as its square. Actually, instead of making your own squared variable, you should use factor-variable notation; see
    Code:
    help fvvarlist
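
    As a sketch of that advice (a hypothetical illustration assuming the poster's variables are in memory; centering on the sample mean is just one choice):
    Code:
    * center year on its sample mean, then let factor-variable
    * notation build the square
    summarize year, meanonly
    gen yearc = year - r(mean)
    reg ftheft tfr partic degrees c.yearc##c.yearc

    The ## operator enters both yearc and its square, so no hand-made yearsquared variable is needed.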



    • #3
      Thanks for your answer. I tried factor-variable notation, which works for OLS, but the squared variable is removed again:

      reg ftheft tfr partic degrees year c.year#c.year

      The problem still exists: collinearity.



      • #4
        Check out this example and then compare your own results:

        Code:
        . clear
        
        . set obs 40
        number of observations (_N) was 0, now 40
        
        . gen year = 1980 + _n
        
        . gen yearsquared = year^2
        
        . corr
        (obs=40)
        
                     |     year yearsq~d
        -------------+------------------
                year |   1.0000
         yearsquared |   1.0000   1.0000
        
        
        . di %23.18f r(rho)
           0.999996676678443497
        The scatter plot will also be instructive.

        To have a better chance of catching a quadratic relationship with time, work with, say, (year - constant) and its square.


        Code:
        . gen yearM2000 = year - 2000
        
        . gen yearM2000sq  = (year - 2000)^2
        
        . corr yearM*
        (obs=40)
        
                     | yea~2000 yearM2~q
        -------------+------------------
           yearM2000 |   1.0000
         yearM2000sq |   0.0965   1.0000
        Except that, as Rich advises, you should use factor-variable notation.





        • #5
          James: That is puzzling and should not happen in "regular" cases. This is a wild guess, but if Stata is trying to estimate rho = 1 then the intercept, year, and year^2 become collinear. If rho is tending toward unity then you probably need to think about differencing the entire equation.

          Can you show your Stata output?
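
          The differencing suggestion can be sketched as follows (a hypothetical illustration, assuming the data are tsset on year; note that differencing a quadratic in year leaves a term that is linear in year):
          Code:
          * if rho is tending toward 1, estimate in first differences
          tsset year
          reg D.ftheft D.tfr D.partic D.degrees year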



          • #6
            .
            Last edited by James Morrison; 18 Dec 2020, 13:04.



            • #7
              Code:
              . prais ftheft tfr partic degrees year c.year#c.year
              note: c.year#c.year omitted because of collinearity
              
              Iteration 0:  rho = 0.0000
              Iteration 1:  rho = 0.5056
              Iteration 2:  rho = 0.6346
              Iteration 3:  rho = 0.6941
              Iteration 4:  rho = 0.7273
              Iteration 5:  rho = 0.7475
              Iteration 6:  rho = 0.7602
              Iteration 7:  rho = 0.7685
              Iteration 8:  rho = 0.7739
              Iteration 9:  rho = 0.7775
              Iteration 10:  rho = 0.7798
              Iteration 11:  rho = 0.7814
              Iteration 12:  rho = 0.7824
              Iteration 13:  rho = 0.7831
              Iteration 14:  rho = 0.7836
              Iteration 15:  rho = 0.7839
              Iteration 16:  rho = 0.7841
              Iteration 17:  rho = 0.7843
              Iteration 18:  rho = 0.7844
              Iteration 19:  rho = 0.7844
              Iteration 20:  rho = 0.7845
              Iteration 21:  rho = 0.7845
              Iteration 22:  rho = 0.7845
              Iteration 23:  rho = 0.7845
              Iteration 24:  rho = 0.7845
              Iteration 25:  rho = 0.7845
              Iteration 26:  rho = 0.7845
              Iteration 27:  rho = 0.7845
              Iteration 28:  rho = 0.7846
              Iteration 29:  rho = 0.7846
              Iteration 30:  rho = 0.7846
              
              Prais-Winsten AR(1) regression -- iterated estimates
              
                    Source |       SS           df       MS      Number of obs   =        34
              -------------+----------------------------------   F(4, 29)        =     24.69
                     Model |  867.486278         4   216.87157   Prob > F        =    0.0000
                  Residual |  254.706178        29  8.78297164   R-squared       =    0.7730
              -------------+----------------------------------   Adj R-squared   =    0.7417
                     Total |  1122.19246        33   34.005832   Root MSE        =    2.9636
              
              -------------------------------------------------------------------------------
                     ftheft |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                        tfr |  -.0197856   .0045694    -4.33   0.000    -.0291311   -.0104401
                     partic |    .026475   .0321516     0.82   0.417    -.0392823    .0922323
                    degrees |   .0158855   .1151943     0.14   0.891    -.2197132    .2514842
                       year |   1.353956   .3336136     4.06   0.000     .6716392    2.036272
                            |
              c.year#c.year |          0  (omitted)
                            |
                      _cons |  -2554.929   634.6958    -4.03   0.000    -3853.027    -1256.83
              --------------+----------------------------------------------------------------
                        rho |   .7845522
              -------------------------------------------------------------------------------
              Durbin-Watson statistic (original)    0.972504
              Durbin-Watson statistic (transformed) 1.704736



              • #8
                I still see nothing about "centering", which was suggested in both #2 and #4.

                I note also that I know nothing about the other variables being used, as you did not supply a -dataex- data example (please see the FAQ).



                • #9
                  Rich: Excuse me for not providing the data example; I'm new to the forum.

                  -----------------------
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input long(year tfr partic) double(degrees ftheft mtheft) float t
                  1935 2755 238 13.2 20.4 247.1  1
                  1936 2696 240 13.2 22.1 254.9  2
                  1937 2646 241 12.2 22.4 272.4  3
                  1938 2701 242 12.6 21.8 285.8  4
                  1939 2654 244 12.3 21.1 292.2  5
                  1940 2766 245   12 21.4   256  6
                  1941 2832 246 11.7 25.3 205.8  7
                  1942 2964 268 11.2 27.1   188  8
                  1943 3041 333 11.5   29 205.8  9
                  1944 3010 335 11.1 24.2 207.9 10
                  1945 3018 331 12.5 24.7 197.8 11
                  1946 3374 253   15 20.5 195.9 12
                  1947 3545 243 17.6 20.7 198.9 13
                  1948 3441 241 21.2 22.8 198.6 14
                  1949 3456 242 22.7 18.5 216.1 15
                  1950 3455 237 21.4 19.3   212 16
                  1951 3503 242 20.7   20   208 17
                  1952 3641 240 19.5 19.4 199.3 18
                  1953 3721 233 36.6 16.1 176.4 19
                  1954 3828 232 18.1 16.5 166.8 20
                  1955 3831 236 17.7 15.2 179.7 21
                  1956 3858 245 18.7 15.3 198.8 22
                  1957 3925 256 20.8 17.1 262.5 23
                  1958 3880 261 22.7 21.2 263.4 24
                  1959 3935 265 24.6 22.3 253.6 25
                  1960 3895 278 28.5 28.5 280.9 26
                  1961 3840 291 31.3 34.4 290.9 27
                  1962 3756 290   38   36 274.7 28
                  1963 3669 290   42 43.8 296.9 29
                  1964 3502 307 48.5 49.7 281.3 30
                  1965 3145 313 59.2 58.6 264.2 31
                  1966 2812 325 69.7 71.4 286.6 32
                  1967 2586 339 80.4 70.6   272 33
                  1968 2441 338 90.4   73 274.7 34
                  end
                  ------------------



                  • #10
                    So, this is what I did:
                    Code:
                    . gen yearc = year - 1935
                    . tsset yearc
                    . prais ftheft tfr partic degrees year c.yearc#c.yearc
                    
                    Iteration 0:  rho = 0.0000
                    Iteration 1:  rho = 0.4779
                    Iteration 2:  rho = 0.6310
                    Iteration 3:  rho = 0.6964
                    Iteration 4:  rho = 0.7259
                    Iteration 5:  rho = 0.7392
                    Iteration 6:  rho = 0.7452
                    Iteration 7:  rho = 0.7479
                    Iteration 8:  rho = 0.7491
                    Iteration 9:  rho = 0.7496
                    Iteration 10:  rho = 0.7498
                    Iteration 11:  rho = 0.7499
                    Iteration 12:  rho = 0.7500
                    Iteration 13:  rho = 0.7500
                    Iteration 14:  rho = 0.7500
                    Iteration 15:  rho = 0.7500
                    Iteration 16:  rho = 0.7500
                    Iteration 17:  rho = 0.7500
                    
                    Prais-Winsten AR(1) regression -- iterated estimates
                    
                          Source |       SS           df       MS      Number of obs   =        34
                    -------------+----------------------------------   F(5, 28)        =     26.80
                           Model |  1061.34611         5  212.269222   Prob > F        =    0.0000
                        Residual |  221.787888        28    7.920996   R-squared       =    0.8272
                    -------------+----------------------------------   Adj R-squared   =    0.7963
                           Total |    1283.134        33  38.8828484   Root MSE        =    2.8144
                    
                    ------------------------------------------------------------------------------
                          ftheft |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             tfr |  -.0154238     .00489    -3.15   0.004    -.0254405   -.0054072
                          partic |   .0338571   .0307243     1.10   0.280    -.0290787     .096793
                         degrees |  -.0392367    .114543    -0.34   0.734    -.2738673     .195394
                            year |  -.1476686   .7891959    -0.19   0.853    -1.764263    1.468926
                                 |
                         c.yearc#|
                         c.yearc |   .0471802   .0228427     2.07   0.048     .0003891    .0939713
                                 |
                           _cons |   342.5433   1518.686     0.23   0.823    -2768.344    3453.431
                    -------------+----------------------------------------------------------------
                             rho |   .7500145
                    ------------------------------------------------------------------------------
                    Durbin-Watson statistic (original)    1.036600
                    Durbin-Watson statistic (transformed) 1.581007
                    Nothing is omitted here.



                    • #11
                      With the centering approach, nothing is omitted, yes.

                      But am I allowed to change a variable like this? The variable "year" and the constant will then have different coefficients. I'm supposed to create a model with the variables year and yearsq.

                      Really appreciate the help, thank you.



                      • #12
                        I have no idea why you are "supposed" to do something. Yes, centering changes the constant; without centering, the constant is meaningless. (Assume year were your only predictor: without centering, the constant is the predicted mean when year = 0. Does anyone really care?)
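
                        A sketch of the point, using the thread's variables (the fitted-value names are hypothetical; the two parameterisations should agree up to numerical precision):
                        Code:
                        * same quadratic, two parameterisations of year
                        gen yearc = year - 1935
                        reg ftheft c.year##c.year
                        predict fit_raw
                        reg ftheft c.yearc##c.yearc
                        predict fit_centered
                        * only the intercept and linear term change;
                        * the fitted values are the same
                        summarize fit_raw fit_centered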



                        • #13
                          Cross-posted at https://stats.stackexchange.com/ques...ochrane-orcutt

                          Our policy on cross-posting is explicit: you are asked to tell us about it. https://www.statalist.org/forums/help#crossposting



                          • #14
                            To me it seems more puzzling why -regress- does not drop year^2.

                            Over this range of years the correlation of year and year^2 is 1.0000 to displayed precision.

                            Code:
                            . pwcorr year year2, sig obs
                            
                                         |     year    year2
                            -------------+------------------
                                    year |   1.0000 
                                         |
                                         |       34
                                         |
                                   year2 |   1.0000   1.0000 
                                         |   0.0000
                                         |       34       34



                            • #15
                              A quadratic is a quadratic; you are just parameterising it differently, with a side-effect on the intercept, which lacks inherent interest anyway, so far as I can imagine.

                              Otherwise there is a spectrum of possibilities, ranging from your teachers not being aware of this, or having forgotten that it would happen, through to them expecting this to be one of the things you would need to work out.
