  • Sequential partialling out of variables in linear regression: Is there such an algorithm, and what is it called?

    Good afternoon,

    Suppose that my task is to obtain the residual vector in a multiple regression of Y on X and Z. Let's call this residual vector R. X and Z are two groups of variables.

    From experimenting with the case where X and Z each contain only one variable, it seems to me that the following algorithm achieves the task:

    1. Regress Y on X, predict the residual vector R1, say.

    2. Regress R1 on Z, predict the residual vector R2.

    3. Regress R2 on X, predict the residual vector R3.

    4. Regress R3 on Z, predict the residual vector R4.

    ... and repeat the procedure until the residual sum of squares no longer changes from one iteration to the next.
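
    The steps above can be sketched as a loop with an explicit stopping rule. This is only a sketch for a do-file, under assumptions that are not in the post: the tolerance of 1e-8, the cap of 1000 passes, and the working variable names r and e are all made up for illustration. It overwrites one residual variable instead of storing R1, R2, ... separately, and reads the residual sum of squares from e(rss) after the last regression of each pass.

    ```stata
    * Sketch only: alternate the two partial regressions until the RSS stabilises.
    * Assumed, not from the post: tolerance 1e-8, cap of 1000 passes, names r and e.
    sysuse auto, clear
    generate double r = price              // working residual, starts as Y
    local rss_old = .
    local done = 0
    local pass = 0
    while !`done' & `pass' < 1000 {
        foreach v in mpg headroom {        // X = mpg, then Z = headroom
            quietly regress r `v'
            quietly predict double e, residuals
            quietly replace r = e          // the residual becomes the next outcome
            drop e
        }
        local rss = e(rss)                 // RSS left after the last regression of this pass
        if `pass' > 0 & reldif(`rss', `rss_old') < 1e-8 {
            local done = 1
        }
        local rss_old = `rss'
        local ++pass
    }
    display "passes: `pass'   RSS: " %12.4f `rss'
    ```

    The final r can then be compared with the residual from regress price mpg headroom.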

    I did this manually with the auto data where Y is price, X is mpg and Z is headroom.

    With each successive iteration, the residual got closer and closer to the residual from the joint regression of price on mpg and headroom.

    My questions are:

    a) Is it just a coincidence that I obtained convergence, or does such an algorithm exist?

    b) What is the name of this algorithm?

  • #2
    Here is an example of the algorithm converging:

    Code:
    sysuse auto, clear

    * Target: residual from the joint regression
    qui reg price mpg headroom
    predict double R, resid

    * Step 1: residual from regressing price on mpg alone
    qui reg price mpg
    predict double R1, resid

    * Alternate between headroom and mpg, storing each new residual
    local l = 1
    while `l'<30 {
        qui reg R`l' headroom
        local ++l
        predict R`l', resid
        qui reg R`l' mpg
        local ++l
        predict R`l', resid
    }

    summ R*
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
               R |         74   -1.35e-13     2592.89  -3041.126   9665.969
              R1 |         74   -2.58e-13    2605.621  -3184.174   9669.721
              R2 |         74   -.0000112    2595.075   -3043.89   9671.591
              R3 |         74    .0000281    2593.265  -3065.621   9666.611
              R4 |         74    1.78e-06    2592.955    -3041.6   9666.932
    -------------+---------------------------------------------------------
              R5 |         74   -2.96e-06    2592.901  -3045.321   9666.079
              R6 |         74    .0000198    2592.892  -3041.207   9666.134
              R7 |         74   -8.82e-06    2592.891  -3041.844   9665.987
              R8 |         74   -2.62e-06     2592.89   -3041.14   9665.997
              R9 |         74   -7.55e-06     2592.89  -3041.249   9665.972
    -------------+---------------------------------------------------------
             R10 |         74    .0000137     2592.89  -3041.129   9665.974
             R11 |         74   -.0000135     2592.89  -3041.147    9665.97
             R12 |         74    -.000011     2592.89  -3041.127    9665.97
             R13 |         74   -6.38e-07     2592.89   -3041.13   9665.969
             R14 |         74   -.0000184     2592.89  -3041.126   9665.969
    -------------+---------------------------------------------------------
             R15 |         74    3.81e-06     2592.89  -3041.127   9665.969
             R16 |         74   -.0000128     2592.89  -3041.126   9665.969
             R17 |         74   -2.12e-06     2592.89  -3041.126   9665.969
             R18 |         74   -.0000127     2592.89  -3041.126   9665.969
             R19 |         74    7.95e-06     2592.89  -3041.126   9665.969
    -------------+---------------------------------------------------------
             R20 |         74    5.92e-06     2592.89  -3041.126   9665.969
             R21 |         74    .0000121     2592.89  -3041.126   9665.969
             R22 |         74    9.97e-06     2592.89  -3041.126   9665.969
             R23 |         74    .0000113     2592.89  -3041.126   9665.969
             R24 |         74    8.45e-06     2592.89  -3041.126   9665.969
    -------------+---------------------------------------------------------
             R25 |         74    9.75e-06     2592.89  -3041.126   9665.969
             R26 |         74    7.13e-06     2592.89  -3041.126   9665.969
             R27 |         74    8.45e-06     2592.89  -3041.126   9665.969
             R28 |         74    6.11e-06     2592.89  -3041.126   9665.969
             R29 |         74    7.43e-06     2592.89  -3041.126   9665.969
    -------------+---------------------------------------------------------
             R30 |         74    5.15e-06     2592.89  -3041.126   9665.969
             R31 |         74    5.34e-06     2592.89  -3041.126   9665.969
    The RMSE (the Std. Dev. column) of Rk became indistinguishable from that of R at the 8th iteration, and all the summary statistics of Rk and R became indistinguishable at the 14th iteration.
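
    Summary statistics can agree while individual observations still differ, so a more direct check is the largest absolute gap between Rk and R. The sketch below assumes that R and R1 to R31 from the run above are still in memory; the variable name gap and the choice of iterations to inspect are illustrative only.

    ```stata
    * Sketch only: largest observation-level gap between selected Rk and R.
    * Assumes R and R1-R31 exist; gap is a scratch variable made up here.
    foreach k of numlist 1 2 4 8 14 31 {
        quietly generate double gap = abs(R`k' - R)
        quietly summarize gap
        display "R`k': max |R`k' - R| = " %10.4f r(max)
        drop gap
    }
    ```

    The means of R2 to R31 are around 1e-05 rather than 1e-13, probably because predict stores new variables as float unless double is requested; writing predict double inside the loop would likely remove most of that rounding noise.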

    • #3
      And here is an example where X (= mpg trunk weight) and Z (= headroom length turn displacement gear_ratio) are groups of variables:

      Code:
      * With more than 1 variable per group
      sysuse auto, clear

      * Target: residual from the joint regression on both groups
      qui reg price mpg trunk weight headroom length turn displacement gear_ratio
      predict double R, resid

      * Step 1: residual from regressing price on the X group alone
      qui reg price mpg trunk weight
      predict double R1, resid

      * Alternate between the Z group and the X group
      local l = 1
      while `l'<200 {
          qui reg R`l' headroom length turn displacement gear_ratio
          local ++l
          predict R`l', resid
          qui reg R`l' mpg trunk weight
          local ++l
          predict R`l', resid
      }

      summ R*
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
                 R |         74    1.66e-12    2128.312  -4306.459   6502.711
                R1 |         74    6.15e-13    2470.774  -3259.869   7190.119
                R2 |         74    1.24e-06    2262.016  -3738.815   7118.627
                R3 |         74    1.08e-06    2246.192  -3487.423   7398.882
                R4 |         74    4.97e-06    2237.652  -3487.533   7443.148
      --- OUTPUT OMITTED---
      
      
      -------------+---------------------------------------------------------
              R175 |         74   -5.14e-06    2128.316  -4291.977    6507.56
              R176 |         74    1.08e-06    2128.316  -4292.696   6508.607
              R177 |         74   -2.75e-06    2128.315   -4292.82   6507.277
              R178 |         74   -5.73e-06    2128.315  -4293.498   6508.263
              R179 |         74    9.15e-06    2128.315  -4293.615   6507.011
      -------------+---------------------------------------------------------
              R180 |         74    4.78e-06    2128.315  -4294.252    6507.94
              R181 |         74    4.68e-06    2128.315  -4294.363   6506.761
              R182 |         74   -.0000164    2128.315  -4294.963   6507.635
              R183 |         74    3.88e-06    2128.314  -4295.067   6506.525
              R184 |         74    8.60e-06    2128.314  -4295.633   6507.349
      -------------+---------------------------------------------------------
              R185 |         74   -1.06e-06    2128.314  -4295.731   6506.303
              R186 |         74    7.09e-07    2128.314  -4296.264   6507.078
              R187 |         74    2.59e-06    2128.314  -4296.356   6506.093
              R188 |         74   -3.65e-06    2128.314  -4296.857   6506.824
              R189 |         74   -9.12e-06    2128.314  -4296.944   6505.896
      -------------+---------------------------------------------------------
              R190 |         74    8.13e-06    2128.314  -4297.417   6506.584
              R191 |         74   -5.85e-06    2128.313  -4297.499   6505.711
              R192 |         74   -7.76e-06    2128.313  -4297.944   6506.359
              R193 |         74    6.80e-06    2128.313  -4298.021   6505.536
              R194 |         74   -3.87e-07    2128.313   -4298.44   6506.146
      -------------+---------------------------------------------------------
              R195 |         74   -9.09e-07    2128.313  -4298.513   6505.372
              R196 |         74   -3.76e-06    2128.313  -4298.907   6505.946
              R197 |         74   -4.27e-06    2128.313  -4298.976   6505.216
              R198 |         74    9.27e-06    2128.313  -4299.347   6505.757
              R199 |         74   -4.00e-06    2128.313  -4299.412    6505.07
      -------------+---------------------------------------------------------
              R200 |         74    2.64e-06    2128.313  -4299.762    6505.58
              R201 |         74    2.06e-07    2128.313  -4299.822   6504.933
      
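      Here I needed many more iterations, but again the algorithm appears to converge: after 200 steps the standard deviation agrees with that of R to about 0.001, although the minimum and maximum are still drifting toward their joint-regression values.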
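
      One possible way to see why both examples behave this way, offered as a sketch rather than as an answer given in this thread: each "regress and take the residual" step is an orthogonal projection. The symbols M_X, M_Z and M_{XZ} are notation introduced here, with X and Z each understood to include the constant, as in the regressions above.

      ```latex
      \begin{aligned}
      M_X &= I - X(X'X)^{-1}X', \qquad M_Z = I - Z(Z'Z)^{-1}Z', \\
      R_1 &= M_X Y, \quad R_2 = M_Z R_1, \quad R_3 = M_X R_2, \ \dots \\
      R_{2n} &= (M_Z M_X)^{n}\, Y \;\longrightarrow\; M_{XZ}\, Y = R \qquad (n \to \infty).
      \end{aligned}
      ```

      M_X projects onto the vectors orthogonal to the columns of X, M_Z onto those orthogonal to the columns of Z, and the vectors orthogonal to both are exactly those orthogonal to the combined matrix [X Z]. Alternating two orthogonal projections converges to the projection onto the intersection of their ranges (von Neumann's alternating projection theorem), which here is M_{XZ} Y, the joint-regression residual. Under this reading the scheme is an instance of the method of alternating projections and is closely related to backfitting. The speed depends on how strongly the two groups are correlated with each other, which would be consistent with the much slower convergence in the second example.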
