
  • regress postestimation with weights

    I am running something like

    sysuse auto, clear
    reg mpg price foreign [aw=length]
    predict hat, hat
    The manual for regression postestimation says:
    After analytically weighted estimation, predict is willing to calculate only the prediction (no
    options), residual (residual option), standard error of the prediction (stdp option), and diagonal
    elements of the projection matrix (hat option). Moreover, the results produced by hat need to
    be adjusted, as will be described.
    and goes on to adjust the hat-values by the normalized weight. Does anybody know why this is necessary? That is,
    1. What is it exactly that predict, hat produces here?
    2. What is the correction that is being offered?
    I can imagine that one can think of the diagonal elements of either X(X'X)^{-1}X' or W^{1/2}X(X'WX)^{-1}X'W^{1/2}. Judging by the need to additionally adjust the results, Stata may be producing the entries of an unwieldy hybrid, X(X'WX)^{-1}X', making it necessary to further multiply by the scaled weight.

    Has anybody looked at the issue? Can Stata Corp. please comment?
    -- Stas Kolenikov ||
    -- Principal Survey Scientist, Abt SRBI
    -- Opinions stated in this post are mine only

  • #2
    Let me address the two questions separately.

    1. What is it exactly that predict, hat produces here?
    When weights are specified with regress, the hat prediction for observation j is computed as

    h_j = x_j (X'DX)^{-1} x_j'

    where D is a diagonal matrix with the weights on the diagonal. We admit that this is not clear from the Methods and formulas in [R] regress postestimation, and we plan to modify the documentation so that the weighted version of this prediction is clearly defined.

    This is the same formula used when fweights, aweights, and iweights are specified. predict, hat is not allowed after estimation with pweights.
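
    As a sketch of how this h_j relates to the weighted projection matrix mentioned in the original question, the algebra below assumes that X'DX is invertible, that w_j denotes the j-th diagonal entry of D, and that k denotes the number of columns of X (w_j and k are notation introduced here, not taken from the manual):

    ```latex
    \begin{aligned}
    h_j &= x_j (X'DX)^{-1} x_j', \qquad D = \operatorname{diag}(w_1, \dots, w_N), \\
    \bigl[ D^{1/2} X (X'DX)^{-1} X' D^{1/2} \bigr]_{jj} &= w_j \, h_j, \\
    \sum_{j=1}^{N} w_j h_j &= \operatorname{tr}\bigl[ (X'DX)^{-1} X'DX \bigr] = k .
    \end{aligned}
    ```

    Under these assumptions, it is the products w_j h_j, rather than the h_j themselves, that form the diagonal of a symmetric idempotent matrix and sum to the number of coefficients.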

    2. What is the correction that is being offered?
    After revisiting this technical note, we believe that the importance of the adjustment is overstated. Whether you want to multiply the predicted values by the normalized weights will depend on how you are planning to use the predictions. You may indeed prefer the values that are produced by predict, hat directly.

    Predictions, in general, are computed using the values of variables in a given observation. This does not change when weights were used in estimation. The prediction given by predict, hat, without adjustment, is the leverage, the diagonal element of the projection matrix, for the corresponding observation. Its value does not change based on weights. We could, however, think of the weight as measuring the contribution of this prediction. In fact, if we were graphing the leverage values, we would graph the values produced by predict, hat but would adjust the size of the plotted points based on the weights to emphasize the contribution.

    To be honest, we do not have any recommendations for how you may want to use this hat prediction after estimation with aweights. As mentioned in this same technical note, many types of predictions are not available because they are not well defined in the case of aweights. These include the standard error of the forecast, standard error of the residuals, standardized and studentized residuals, and Cook's D, all of which are functions of h_j. Although we don't have suggested uses for this prediction, the hat prediction is available because the computation requires an N*N matrix and we can compute this internally in a more efficient manner than a user would be able to do manually.

    Having said that, if you wanted to use these predictions in a subsequent calculation and draw conclusions in the context of the population, we could certainly see why you might want to weight them before using them further. The adjustment discussed in the technical note shows how to compute the normalized weights in this situation.
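
    A minimal sketch of that adjustment, using the auto data from the original post. It assumes that the normalized weight is the raw weight divided by its mean over the estimation sample, so that the normalized weights sum to the number of observations used. The variable names h, wnorm, and h_adj are made up for illustration.

    ```stata
    sysuse auto, clear
    regress mpg price foreign [aw=length]

    * unadjusted values, exactly as returned by predict
    predict double h if e(sample), hat

    * normalized weights: raw weights rescaled to sum to the number of observations
    summarize length if e(sample), meanonly
    generate double wnorm = length / r(mean) if e(sample)

    * adjusted values: hat values multiplied by the normalized weights
    generate double h_adj = h * wnorm

    * compare both sums with the number of estimated coefficients
    summarize h, meanonly
    display "sum of h     = " r(sum)
    summarize h_adj, meanonly
    display "sum of h_adj = " r(sum)
    display "coefficients = " e(rank)
    ```

    If the D in the formula for h_j holds the normalized weights, the sum of h_adj should match the number of coefficients. That is worth checking on your own data rather than taking for granted.
    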


    • #3
      Thanks, Kristin. Richard Valliant had two or three papers in the past five or so years about regression diagnostics with weighted data, which you might want to check out.
      -- Stas Kolenikov ||
      -- Principal Survey Scientist, Abt SRBI
      -- Opinions stated in this post are mine only