  • Can I ignore the multicollinearity?

    I want to ask about multicollinearity. I have a panel data regression model estimated with fixed effects. The dependent variable is y, and x, x^2, v, w, and z are the independent variables.
    x has a high correlation with v, more than 0.9. When I check the VIFs, x and x^2 have high VIFs, more than 10, but w, v, and z have low ones, less than 10. Can I ignore the multicollinearity in this case?
    And does the no-multicollinearity assumption need to be satisfied in panel data? Thanks.
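
    In Stata terms, the setup would be roughly along these lines (id, year, and the exact commands are only a sketch to make the setup concrete, not my actual code):

    Code:
        * declare the panel structure (id and year stand in for my real identifiers)
        xtset id year

        * fixed-effects regression with the squared term
        gen x2 = x^2
        xtreg y x x2 v w z, fe

        * VIFs from a pooled OLS fit of the same equation
        * (estat vif is a regress postestimation command, so the fixed effects are left out here)
        regress y x x2 v w z
        estat vif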

  • #2
    First of all, a high correlation between x and x^2 is normal and expected if the x variable does not substantially straddle zero. So it is no cause for concern, and it does not interfere with interpreting the relationship between y and x if you do it correctly.
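
    As a quick illustration of that first point, here is a small simulation with made-up data (Stata; the names and numbers are arbitrary): when x is strictly positive, x and x^2 are almost perfectly correlated, but centering x first removes most of that correlation without changing the fitted curvature.

    Code:
        * made-up data: x strictly positive, so x and x^2 are nearly collinear
        clear
        set obs 1000
        set seed 12345
        gen x  = rnormal(5, 1)
        gen x2 = x^2
        correlate x x2            // correlation close to 1

        * centering x first removes most of that correlation
        summarize x, meanonly
        gen xc  = x - r(mean)
        gen xc2 = xc^2
        correlate xc xc2          // correlation close to 0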

    As for the correlation between x and v, it depends on a few things. First, which variables are you actually interested in studying the effect of? If neither x nor v is really of interest, and they are just in the equation to adjust for their possible confounding effects on other variables' effects, then you can ignore the correlation between them altogether. It has no consequences.

    If x or v is a variable of actual interest, then you may have a problem. The way to tell if you have a problem is not by looking at the VIFs but by looking directly at the standard errors of x, x^2, and v. If those standard errors are so wide that you do not have adequate precision in your estimates of their coefficients, then there is a problem. But if the standard errors are narrow enough that you have sufficient precision for the variables of interest, then multicollinearity is not a problem and can be ignored.
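
    To see why the standard errors rather than the VIFs are the bottom line, it may help to recall the textbook decomposition of a coefficient's sampling variance in ordinary least squares (the within/fixed-effects estimator behaves analogously in terms of the within variation):

        Var(\hat\beta_j) = \frac{\hat\sigma^2}{(n-1)\,Var(x_j)} \times \frac{1}{1 - R_j^2},  where  \frac{1}{1 - R_j^2} = VIF_j

    The VIF is only one of the factors; a large VIF can be offset by a large sample or by ample variation in x_j, which is exactly why the resulting standard error is what you should judge.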

    If x and v are both variables of interest and have excessively large standard errors, then you have a problem, one for which there is no entirely satisfactory solution. One option would be to remove one of the variables from the equation, at the risk of introducing omitted variable bias. A second option is to go back and get a much larger data set to analyze, so that despite the correlation the standard errors are acceptably small--this is typically expensive, and sometimes altogether infeasible. Finally, you can scrap your entire data set and redesign your study using matching or stratification or some other special sampling approach that breaks the correlation between x and v. I think the drawbacks of this last approach are obvious.
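
    A rough rule of thumb for the second option, since standard errors shrink only with the square root of the sample size:

        SE(\hat\beta_j) \propto 1/\sqrt{n}

    so cutting a standard error in half requires roughly four times as much data, with or without the correlation.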



    • #3
      Thanks, Mr. Clyde. Can you tell me where I can find references (textbooks, journal articles, etc.) related to your answers?



      • #4
        Actually, the variables of interest are all of the independent variables I mentioned before.
        Can you explain more about the "large standard errors" consideration? How can I tell whether a standard error is large/wide or small/narrow?
        Standard error of each variable:
        const : 1.043212
        x : 0.174555
        x^2 : 0.015857
        v : 0.048220
        w : 1.22E-06
        z : 0.057865
        How's that?



        • #5
          Can anyone help me, please?

          As additional information: the results show significant effects at the 1% level for all independent variables, and they are consistent with theory. R-squared = 99.8%.
          The model is a fixed-effects model (FEM) with 'cross-section SUR' weights.

          But I am stuck on the multicollinearity assumption.
          What do you think?



          • #6
            Thanks, Mr. Clyde. Can you tell me where I can find references (textbooks, journal articles, etc.) related to your answers?



            • #7
              Our own Richard Williams has a nice summary of multicollinearity at http://www3.nd.edu/~rwilliam/stats2/l11.pdf.



              • #8
                OP (please note the strong preference on this forum for real full names; read the FAQ on how to re-register accordingly. Thanks):
                you may also want to take a look at Paul Allison's primer on multiple regression (https://uk.sagepub.com/en-gb/eur/mul...ssion/book8989).
                As often recommended on this forum by Joao, A.S. Goldberger's textbook "A Course in Econometrics" (https://www.amazon.it/Course-Econome...=1&*entries*=0) devotes chapter 23 to multicollinearity and related issues.
                Kind regards,
                Carlo
                (Stata 18.0 SE)



                • #9
                  Originally posted by Clyde Schechter View Post
                  Our own Richard Williams has a nice summary of multicollinearity at http://www3.nd.edu/~rwilliam/stats2/l11.pdf.
                  Hi Mr. Clyde,
                  Can you explain more about the "large standard errors" consideration? How can I tell whether a standard error is large/wide or small/narrow?
                  Standard error of each variable:
                  const : 1.043212
                  x : 0.174555
                  x^2 : 0.015857
                  v : 0.048220
                  w : 1.22E-06
                  z : 0.057865



                  • #10
                    Standard errors on their own are neither large nor small. You need to compare them with the coefficients in order to decide whether they are large or small.
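
                    For instance (made-up numbers, purely to illustrate the comparison), the same standard error can mean very different things depending on the size of the coefficient it accompanies:

                    Code:
                        * identical standard error of 0.05, very different implications
                        display "coef 0.5 : approx. 95% CI " 0.5 - 1.96*0.05 " to " 0.5 + 1.96*0.05
                        display "coef 0.01: approx. 95% CI " 0.01 - 1.96*0.05 " to " 0.01 + 1.96*0.05
                        * the first coefficient is estimated precisely relative to its size;
                        * the second is so imprecise that even its sign is in doubt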

                    I suggest you look at the literature mentioned by Carlo to find out more.
                    ---------------------------------
                    Maarten L. Buis
                    University of Konstanz
                    Department of history and sociology
                    box 40
                    78457 Konstanz
                    Germany
                    http://www.maartenbuis.nl
                    ---------------------------------



                    • #11
                      I would go a little farther than Maarten. Standard errors are large or small relative to the coefficients, and also relative to the goals of your research. Let's say, for the sake of an example, that the coefficient of z is 1.5. Then relative to that coefficient, the standard error of 0.057865 is pretty small, and for most purposes you can probably rest easy. But if the nature of your research goal is such that you need an estimate of the coefficient of z that is precise to within 0.001, then that standard error is an order of magnitude too large. So it really boils down to a practical question: how precise does your estimate of the coefficient need to be?
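
                      To make that concrete with the hypothetical numbers above (a coefficient of 1.5 for z and the standard error of 0.057865 reported earlier):

                      Code:
                          display "approx. 95% CI for z: " 1.5 - 1.96*0.057865 " to " 1.5 + 1.96*0.057865
                          * roughly 1.39 to 1.61, a margin of about +/-0.11 around 1.5

                      For most purposes that is quite precise; it only becomes inadequate if the research question demands precision on the order of 0.001.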
