  • NEED URGENT HELP. Multicollinearity despite significant results

    Hello, I am currently writing my Bachelor thesis and am a complete Stata/statistics beginner.

    My task was to replicate a straightforward multivariate regression using the reghdfe command. I get satisfactory results that are significant despite the presence of a lot of control variables. However, I just stumbled upon the subject of multicollinearity and checked for it using uncentered VIFs. Some of my control variables have VIFs above 30, and my main variable of interest has a VIF of 15. Is that an issue? From what I understood, the problem with multicollinearity is that it makes variables insignificant, but that isn't the case for my main predictor. How do I deal with this? I am not allowed to change the regression model, because I am replicating/confirming another paper, and the authors of the original paper do not address multicollinearity at all. The aim of their paper is to prove a causal relationship between a variable and stock market reactions. Please help; my thesis is due in a couple of days.
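
    A rough sketch of what such a workflow might look like (the variable and fixed-effect names below are placeholders, not the actual model; reghdfe is a community-contributed command, and estat vif is documented for use after regress, so the uncentered VIFs are presumably taken from a plain OLS fit of the same variables):

    Code:
    * install the community-contributed reghdfe command if needed
    ssc install reghdfe
    * main specification: outcome regressed on the variable of interest plus controls,
    * with fixed effects absorbed (all names here are placeholders)
    reghdfe outcome treatment control1 control2, absorb(firm_id year)
    * uncentered VIFs from a plain OLS fit of the same regressors
    regress outcome treatment control1 control2
    estat vif, uncentered
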
    Last edited by Nils Nehmer; 17 Apr 2024, 08:40.

  • #2
    There are already very good discussions about panel data and multicollinearity problems on this board. These should help you understand the nature of the issue and how to move forward. For example, see here and here.



    • #3
      Hey Erik, thank you for your post. I read some of the threads and part of the book one of them recommends. What I understood is that people worry too much about multicollinearity, since it's just another word for too small a sample size? I personally do not care about how precise my estimate is, as long as it's significant and positive; does that mean I can ignore the issue of multicollinearity?



      • #4
        In a lot of cases, the worry about multicollinearity is overblown. The key is to look at the standard errors of the coefficient estimates. If those start to get very large (many multiples of the parameter estimate), then you likely have a problem with two or more predictors being highly correlated. In that case, remove one of those variables to see if that fixes things.
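
        As a rough illustration of this advice (a toy sketch using Stata's shipped auto dataset, not the thread's data; weight and length are strongly correlated):

        Code:
        sysuse auto, clear
        * two highly correlated predictors: note the coefficient standard errors and VIFs
        regress price weight length
        estat vif
        * dropping one of the collinear pair typically shrinks the standard error of the other
        regress price weight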



        • #5
          I personally do not care about how precise my estimate is, as long as it's significant and positive
          This sentence contradicts itself.

          If you had the same estimate but derived it from data that gave it less precision, it very well might not be "significant." A non-"significant" result would be another way of saying that the estimate is too imprecise to rule out the data also being compatible with a zero effect. So, you ought to care a great deal about how precise your estimate is.



          • #6
            Cross-posted to Reddit. Please note that you are requested (some might say required) to let us know if you have cross-posted this elsewhere, to avoid potential duplication of effort. Likewise, it is a request on r/stata that you do the same.



            • #7
              Hey, sorry, I didn't know that cross-posting is not okay.



              • #8
                You can cross-post, but the FAQ lays out that you should tell us (and provide a link).



                • #9
                  @Clyde The goal of my regression is to prove a positive relationship, not necessarily to determine whether an increase in the variable changes the dependent variable by 2% or 3%. But I have to admit that I don't really understand what you are trying to say.



                  • #10
                    @Leonardo, thank you for informing me about that; I will do so going forward.



                    • #11
                      Clyde's point is that if you want to determine whether the association is positive, negative, or whatever, then you should care about precision. In a regression model, the standard error of the coefficient gives you a sense of the precision of the coefficient estimate. The smaller the standard error, the more precise the coefficient estimate is. All else equal, many folks want their standard error to be small enough that they can use a test statistic (for example, a z- or t-value) to reject the null hypothesis that the coefficient is zero ("statistical significance"). With imprecise coefficient estimates (read: large standard errors), one is unlikely to be able to reject the null hypothesis that the coefficient is 0.
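
                      A small numerical sketch of that relationship (the numbers here are made up for illustration, not taken from this thread):

                      Code:
                      * t-statistic and approximate 95% CI from a coefficient b and its standard error se
                      local b  = 2.0
                      local se = 0.8
                      display "t = " `b'/`se'
                      display "approx. 95% CI: " `b' - 1.96*`se' " to " `b' + 1.96*`se'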



                      • #12
                        Oh okay, thank you, I get it now. I have an estimate of 1.69% with a standard error of 0.5%; the p-value is 0.002 and t = 3.07. Is multicollinearity an issue for me then? The only thing I understand is that high multicollinearity is bad because it inflates your standard error, but it seems like that isn't the case here, or is it? I have high multicollinearity (VIF = 15 for the estimate I care about).



                        • #13
                          Just to comment obliquely that, for your next thread, your urgency, although no doubt quite real, is not our problem. This too is flagged in the FAQ Advice you didn't read before posting.



                          • #14
                            Oh okay, thank you, I get it now. I have an estimate of 1.69% with a standard error of 0.5%; the p-value is 0.002 and t = 3.07. Is multicollinearity an issue for me then? The only thing I understand is that high multicollinearity is bad because it inflates your standard error, but it seems like that isn't the case here, or is it? I have high multicollinearity (VIF = 15 for the estimate I care about).
                            Yes, the adverse effect of multicollinearity is inflation of the standard error. How much that matters depends on your goals. You have stated your goal, indirectly but quite definitively, in #3: you regard any result suggesting that the data are not compatible with a zero or negative value as a "success." Your 95% CI will be 1.69 +- 1.96*0.5, which is 0.71 to 2.67. That interval excludes 0, and does so fairly comfortably. So your results give a satisfactory answer to the question you have raised, and thus there is no multicollinearity problem here.

                            Had your standard error, with the same estimate, been 1.0% instead of 0.5%, however, your confidence interval would have been -0.27 to 3.65. That would mean the data are consistent with values between -0.27 and 0, but also with positive values up to 3.65. In that case your study would be inconclusive with respect to your specific question, and that would, indeed, be a multicollinearity problem.
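
                            The arithmetic in the two scenarios above can be checked directly, e.g. with display:

                            Code:
                            * 95% CI with the reported estimate (1.69) and standard error (0.5)
                            display 1.69 - 1.96*0.5 "  to  " 1.69 + 1.96*0.5
                            * the hypothetical case with a standard error of 1.0
                            display 1.69 - 1.96*1.0 "  to  " 1.69 + 1.96*1.0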

                            I would say more generally that the VIF statistics are of limited utility for deciding whether multicollinearity is a problem. Where they come in handy is the situation where you get a result with insufficient precision (e.g., a confidence interval that extends into territory on both sides of whatever number delimits your question's implied threshold of meaningful difference: 0 in your case, but it could be anything in other situations) and you want to know whether multicollinearity might be a cause of that. In that case, seeing that the key variable has a high VIF does implicate multicollinearity, whereas a low VIF would say that the indeterminacy of the result has other causes.

                            Of course, the other thing to remember is that when you do have a multicollinearity problem, your study is basically a lost cause, because the only solutions involve either gathering a much larger data set (multicollinearity is really just a synonym for hyponumerosity, after all) or a radically different design, such as matched pairs, that can break the multicollinearity.
