  • estat vce, corr and vif

    I apologise in advance if this has been posted before, though I couldn't find a discussion when I searched the forum.
    I am using estat vce, corr and it appears that my two main independent variables of interest have a correlation of 0.68.
    I realise that when using vif a value over 10 is considered problematic, though I was wondering what the cut-off is when using estat vce, corr.
    Also, in relation to this, models with year dummies, or with age and age^2 as controls for example, would have high correlations, as one would imagine.
    How does one deal with that sort of collinearity?
    Thanks

  • #2
    first, I don't agree that a vif over 10 is problematic (nor do I know where you got that cutoff); actually, I don't think that vif is useful at all. If you have enough collinearity to cause problems, then Stata will drop (at least) one variable

    second, for polynomials, you can always "center" and then square (etc.) the centered variable. The value you center at is generally not important: it could be anything from the minimum value (which means no negative values, which matters for some readers) to the mean or median of the distribution



    • #3
      Thanks Rich, though I somewhat disagree that Stata 'always' drops problem variables. Sometimes coefficients flip signs due to collinearity issues, yet Stata still includes them in the analysis. In other words, collinearity would still be a cause for concern in these cases.



      • #4
        Janet:
        Stata drops variables when there is an extreme multicollinearity issue.
        When a quasi-extreme multicollinearity issue creeps in, Stata reports "weird" standard errors and confidence intervals.
        See also Chapter 23 of Goldberger's textbook on econometrics https://www.hup.harvard.edu/catalog....40&content=toc
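        The distinction between extreme and quasi-extreme collinearity can be shown with a toy calculation (Python, entirely hypothetical data): with exact collinearity the cross-product matrix X'X is singular, so one variable must be dropped; with near collinearity X'X is still invertible but barely, which is what blows up the standard errors.

```python
# Illustration (hypothetical data): exact vs near collinearity between
# two regressors, seen through the determinant of the centered 2x2
# cross-product matrix. Zero determinant -> singular -> a variable is dropped.
import random

random.seed(1)
x1 = [float(i) for i in range(30)]
x2_exact = [2 * v for v in x1]                         # exact collinearity
x2_near = [2 * v + random.gauss(0, 0.01) for v in x1]  # near collinearity

def gram_det(a, b):
    """Determinant of the 2x2 centered cross-product matrix of a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    saa = sum(v * v for v in da)
    sbb = sum(v * v for v in db)
    sab = sum(u * v for u, v in zip(da, db))
    return saa * sbb - sab * sab

print(gram_det(x1, x2_exact))  # 0.0: singular, estimation must drop one
print(gram_det(x1, x2_near))   # small but positive: huge coefficient variances
```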
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          let me clarify: first, Goldberger's test goes too far if N is <1000 (and way too far for N<100) - my personal opinion; second, the results from the user-written coldiag2 are much more useful (at least to me) than the results of -vif-; third, please post some example data, using -dataex-, and results (in CODE blocks) where the sign changes, and please include your explanation of why you think this is due to collinearity (rather than something else)



          • #6
            Thanks for the clarification Rich Goldstein and Carlo Lazzaro, I appreciate it a lot! I have centred my age dummies and that seemed to do the trick - no more oddly large standard errors. On that note, I would like to present a table of correlations between 2 of my variables, something like the matrix that follows Table A2 (page 33 of the text) in this paper: http://ftp.iza.org/dp9156.pdf

            Any ideas on how to do this? I am confused because the table includes standard errors for each model (these usually don't show up in the commands I normally use - estat vce, corr, or coldiag2). It also includes the correlations with the dependent variable. Thanks



            • #7
              Dear Mr. Carlo Lazzaro
              I am regressing CO2 emissions per capita on log real GDP per capita (ln_GDPc) for a sample of 26 countries over 13 years. Some panel unit root tests show that ln_GDPc is stationary in levels, whereas others show that it is stationary in first differences.
              When I include ln_GDPc in levels, the estimated coefficient is 0.034. However, when I include ln_GDPc in first differences, the estimated coefficient becomes 0.890.
              My question is: why do I get a relatively very high coefficient when I take the first difference of ln_GDPc?
              Would you please clarify that to me?
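              One reason the two coefficients need not agree is that a levels regression and a first-difference regression estimate different quantities (roughly, a long-run relation versus a short-run response). A toy Python sketch with a made-up series (not the poster's panel) shows the same pattern:

```python
# Illustration (made-up series): the OLS slope in levels and the OLS slope
# in first differences answer different questions, so they can differ a lot.
def slope(y, x):
    """Simple OLS slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# x trends upward; y responds weakly to the level of x but strongly to
# short-run changes in x.
x = [t + (5 if t % 4 == 0 else 0) for t in range(40)]
y = [0.03 * x[t] + 0.9 * (x[t] - x[t - 1] if t else 0) for t in range(40)]

dx = [x[t] - x[t - 1] for t in range(1, 40)]
dy = [y[t] - y[t - 1] for t in range(1, 40)]

print(slope(y, x))    # small: dominated by the weak level effect
print(slope(dy, dx))  # large: dominated by the strong short-run response
```

In the sketch the levels slope stays near the level effect of 0.03 while the difference slope is an order of magnitude larger, even though both come from the same data, much like the 0.034 versus 0.890 reported above.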
