  • How to deal with multicollinearity in a Translog Production Function

    What are the different methods for dealing with multicollinearity in a translog production function?

    I have seen several methods such as:

    1. Checking the variance inflation factor (VIF) and ensuring that it is less than 10; if VIF > 10, eliminating variables step-wise.

    2. Keeping either the squares or the cross products, depending on which fits the data best. However, this might not be useful, since most of the time the full model is a better fit.

    3. Standardizing the variables by their means and estimating again. If any VIF is still above 10, eliminating variables step-wise by VIF.

    How do I deal with the issue of multicollinearity in my dataset?
    I know that the translog fits my data better than the Cobb-Douglas, but I am faced with the multicollinearity challenge. What would be a way forward?

  • #2
    Multicollinearity is not a problem; it is just a description of the state of your data. If the correlations are high, then it will be hard to disentangle the affected variables, and the standard errors are (and should be) high. Normally that is all there is to say.

    There are some special cases:
    Polynomials: you can reduce multicollinearity by centering your variable before computing the square, cube, etc.
    Perfect multicollinearity: this usually indicates a mistake in thinking about your model. For example, if I were to include age, period, and cohort, I would be forgetting that if you know two, the third adds no information whatsoever, as I could just compute the third directly from the first two.
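The centering point for polynomials can be illustrated with a minimal sketch. It is written in Python purely as a language-agnostic illustration (the thread itself concerns Stata), and the values in `ln_l` are made-up logged input values, not data from this thread. It compares the correlation between a variable and its square before and after subtracting the mean.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical logged input values (an assumption for illustration only).
ln_l = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

# Correlation between the variable and its square, uncentered.
raw_corr = correlation(ln_l, [x ** 2 for x in ln_l])

# Center first, then square.
m = mean(ln_l)
centered = [x - m for x in ln_l]
centered_corr = correlation(centered, [c ** 2 for c in centered])

print(f"uncentered: corr(x, x^2) = {raw_corr:.4f}")
print(f"centered:   corr(x, x^2) = {centered_corr:.4f}")
```

The made-up values are symmetric around their mean, which is why the centered correlation vanishes here; with real data, centering typically reduces the correlation without removing it entirely. Centering changes what the lower-order coefficients mean, not the fitted model.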

    There are sampling designs that let you deal with multicollinearity, but if the data have already been collected, then you are too late to use that method.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      To add on to what Maarten is saying: the high correlations among the terms in a translog result in an inability to estimate the coefficients precisely. But that's because, unless centering has been done (see Maarten's comment), the coefficients themselves are pretty meaningless. For example, if b is the coefficient on log(L), then b is the elasticity of Q with respect to L when L = 1 and K = 1 (in the two-input case). This is almost never interesting, nor should one expect to estimate it at all well. Instead, use the margins command to obtain, at a minimum, average elasticities. Or use margins evaluated at meaningful values of L and K (and other inputs). My guess is you will find fairly precise estimates. And, if you use averages, these may not differ much from the Cobb-Douglas estimates.
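To spell out why b is the elasticity only at L = 1 and K = 1, here is a short derivation for the two-input case. The post names only b; the symbols b_0, b_K, b_LL, b_KK, and b_LK, and the one-half scaling on the squared terms, are assumptions used for illustration.

```latex
\begin{aligned}
\ln Q &= b_0 + b\,\ln L + b_K \ln K
        + \tfrac{1}{2} b_{LL} (\ln L)^2
        + \tfrac{1}{2} b_{KK} (\ln K)^2
        + b_{LK} \ln L \,\ln K, \\[4pt]
\frac{\partial \ln Q}{\partial \ln L}
      &= b + b_{LL} \ln L + b_{LK} \ln K, \\[4pt]
\left.\frac{\partial \ln Q}{\partial \ln L}\right|_{L = 1,\; K = 1}
      &= b \qquad \text{since } \ln 1 = 0 .
\end{aligned}
```

If the logged inputs are centered at their sample means before the squares and cross product are formed, the same algebra makes b the elasticity at the sample means of ln L and ln K, which is usually a far more relevant point than L = K = 1.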

      Using the VIFs to guide you is a big mistake. Define what parameters you're interested in estimating and then obtain standard errors. You will see that collinearity is not much of a problem when you do that.
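The point about defining the parameter of interest first can be sketched numerically. The elasticity at a given point is a linear combination of coefficients, so its variance is g'Vg, where g holds the weights and V is the covariance matrix of the estimates. Every number below is hypothetical and chosen only to illustrate the mechanism: they are not estimates from any data set, and the code is Python for illustration rather than Stata's margins.

```python
from math import sqrt

# All numbers are hypothetical, chosen only to illustrate the mechanism.
# Coefficients on ln(L), 0.5*ln(L)^2 and ln(L)*ln(K), in that order.
b = [-0.5, 0.3, 0.1]

# Hypothetical covariance matrix of those three estimates. The strong
# negative covariance between the first two mimics what collinearity does.
V = [
    [0.3601, -0.1200, 0.0000],
    [-0.1200, 0.0401, 0.0000],
    [0.0000, 0.0000, 0.0001],
]

# Point at which the elasticity is evaluated (hypothetical values).
ln_l, ln_k = 3.0, 2.0

# Elasticity of Q with respect to L: b_L + b_LL*ln(L) + b_LK*ln(K).
g = [1.0, ln_l, ln_k]
elasticity = sum(gi * bi for gi, bi in zip(g, b))

# Variance of a linear combination of estimates: g' V g.
variance = sum(g[i] * V[i][j] * g[j] for i in range(3) for j in range(3))
se_elasticity = sqrt(variance)

# Standard error of the coefficient on ln(L) taken alone.
se_b_l = sqrt(V[0][0])

print(f"elasticity at (ln L, ln K) = ({ln_l}, {ln_k}): {elasticity:.3f}")
print(f"s.e. of the elasticity: {se_elasticity:.4f}")
print(f"s.e. of the ln(L) coefficient alone: {se_b_l:.4f}")
```

In this made-up example the individual coefficient is very imprecise while the elasticity is not, because the negative covariances largely cancel in the combination. Whether the same happens in a particular data set has to be checked from the actual estimates.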



      • #4
        Thank you Maarten and Jeff for the clear explanation. Glad you have highlighted the issue with VIF as I felt there was something amiss with that method. Since I have squares and cross products, I will try centering and see what output I get.



        • #5
          I had not realized this before, but when centering on means I do get negative values, as expected. The problem is that when I run a stochastic frontier model, it does not estimate properly. Are there alternatives, or how does one deal with this?
