  • Stata tip 118 - Orthogonalizing powered and product terms using residual centering

    For models with linear and squared terms, Stata tip 118 recommends regressing the squared term on the linear term, and then regressing the actual outcome Y on the linear term and the residuals from that first-stage regression.


    I'm having trouble seeing how this does anything except fix the reported collinearity diagnostics. Here is what I'm running:

    clear
    set obs 2000
    g x = 5 + runiform()      // x lives in [5,6], so x and x^2 are nearly collinear
    g xx = x*x
    g y = x + xx + rnormal()  // true coefficient of 1 on both x and x^2
    reg xx x                  // first stage: regress the square on the linear term
    predict xxrc, resid       // residual-centered squared term

    reg y x xxrc              // residual-centered specification
    reg y x xx                // raw specification


    I do get a different parameter on the main effect (which I should, since I have changed the zero point), but it is generally further from the coefficient used to simulate the data. It is in fact identical to the parameter on x from running regress y x. I get identical parameter estimates and standard errors on the squared term.
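
    A quick check (the scalar name b_rc is just for illustration) that the two x coefficients agree:

    quietly reg y x xxrc
    scalar b_rc = _b[x]
    quietly reg y x
    display b_rc - _b[x]      // prints zero, up to machine precision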

    What am I missing?

    Phil


  • #2
    I'm having trouble seeing how this does anything except fix the reported collinearity diagnostics.
    [..].
    What am I missing?
    I do not believe it is supposed to do anything else. Like mean-centering, such approaches seem pretty useless to me, as the underlying problem of collinearity is a lack of information. Since neither approach adds information to the data, I wonder what the rationale behind this might be. For how well Stata actually handles collinear data, see Bill Gould's explanation.
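
    As a minimal illustration, using Phil's simulation from above: the fit is the same, only the reported diagnostics move.

    quietly regress y x xx
    estat vif                 // enormous VIFs: x and xx are nearly collinear
    quietly regress y x xxrc
    estat vif                 // VIFs of essentially 1 after residual centering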

    Best
    Daniel



    • #3
      All of these models are equivalent, in the sense that they give you the same predicted values and the same overall fit. One way to think of the differences is that they are just recentering the data to a more convenient (statistically? computationally?) point, moving the zero in both the data space and the parameter space.

      Interesting to note that mean-centering and "residualizing" move you into two different coordinate bases, but leave you with the same statistical judgements.

      reg y x xxrc
      predict y1                  // fitted values, residual-centered model
      reg y x xx
      predict y2                  // fitted values, raw quadratic model

      scatter y1 y2               // identical, so they fall on the 45-degree line

      summarize x
      generate xc = x - r(mean)   // mean-centered x

      reg y c.xc##c.xc            // mean-centered quadratic, factor-variable notation
      predict y3

      graph matrix y1 y2 y3, half // all three sets of fitted values coincide
      Doug Hemken
      SSCC, Univ. of Wisc.-Madison



      • #4
        One downside of residual centering is that it makes it more difficult to interpret your model. Consider
        regress price c.weight##c.weight

        In the original units, the constant term tells you about the price where weight, and therefore weight^2, are zero. Because the parameter estimate for the first-order term depends on where zero is located, the collinearity diagnostics depend on the location of zero as well.

        A nice thing about the above specification is that it lets post-estimation commands see the connection between weight and weight^2, for example margins.

        margins, at(weight=(2000(100)5000))
        marginsplot


        If we center:
        summarize weight
        generate wc = weight - r(mean)
        regress price c.wc##c.wc


        our model is now expressed in deviation units of weight. The constant is now the price where wc and wc^2 are both zero, that is, at the mean of weight. Our collinearity diagnostics now pertain to that point on the weight scale. We can still use margins gracefully. Notice that the plot is identical to the previous plot over the same range (rescaled to deviation units):

        margins, at(wc=(-1000(100)2000))
        marginsplot
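
        A quick check (the phat_* variable names are just for illustration) that the centered and original parameterizations return the same fitted values, observation by observation:

        predict phat_c                           // fitted values, centered model
        quietly regress price c.weight##c.weight
        predict phat_o                           // fitted values, original model
        assert reldif(phat_c, phat_o) < 1e-5     // identical up to float precision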


        Now try the same thing with residual centering:
        generate w2 = weight^2
        regress w2 weight          // first stage: square on linear
        predict w2dev, resid       // residual-centered squared term
        regress price weight w2dev


        Now we have broken the (easy) connection between the first-order term and the second-order term. One is scaled in original units, the other in a new kind of deviation unit: both are in dollars, but zero is in a different place on each scale. There is no point in the data space at which both weight=0 and w2dev=0. So what can the constant mean?

        Now in order to use margins, we have to do a convoluted translation between one scale and the other: in general, w2dev = weight^2 - (-8866487 + 6153.254*weight). Weight and its linear-regression deviation units are still analytically connected, just not in an easily interpretable way. In fact, the w2dev term brings together a constant, a weight term, and a weight^2 term, all by virtue of being able to factor out a common coefficient. The constant from the (second) regression is whatever is left over to balance all of that. It perhaps has some nice mathematical meaning, but I'm not aware of what it is.
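
        Here is a sketch of what that translation looks like in practice (the chosen weight value and the macro names are just for illustration):

        * translate a chosen weight into its matching w2dev value by hand
        quietly regress w2 weight
        local a = _b[_cons]
        local b = _b[weight]
        local w 3000
        local w2d = `w'^2 - (`a' + `b'*`w')
        quietly regress price weight w2dev
        margins, at(weight=`w' w2dev=`w2d')    // predicted price at weight = 3000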

        This is the same awkwardness we run into when we try to standardize the coefficients in a model term-wise instead of variable-wise. We are making use of a legitimate linear transformation of the parameter space, but one that makes it hard to interpret the parameter space and the data space in the same terms. We have broken the tensored relation between the product term and its components.

        To me it seems like residual centering is a technical expedient that might sometimes solve computational difficulties (the final model is equivalent, after all). For interpretation it seems like a hack, giving you pieces of information from what would typically be a couple of different interpretive models (again, equivalent). For use with latent interaction models, it could be a different story.
        Doug Hemken
        SSCC, Univ. of Wisc.-Madison

