  • High R-squared, Low Variable Significance

    Hi everyone!
    I am currently working on a project that uses secondary data, and I am struggling a bit with the interpretation of the results.
    [Image: Picture 1.png]

    How I see it: the model is jointly significant at the .05 level (F = 15.77, p < .0001). The R-squared of .8184 says that this model accounts for 81.84% of the total variance of the dependent variable.
    However, none of the included variables is significant in explaining the dependent variable except egov_hci. How does this square with the overall fit of the model?
    Does this mean that the variables are jointly suitable for explaining the variation but individually insignificant, at least for the timeframe of the current group?
    Also, the two-way time fixed effect is quite hard to interpret but relevant for the present study. Any pointers on what is important to report?

    Thank you in advance for any insights you may be able to offer me.

  • #2
    The only interesting thing about the R-squared is that it is not interesting at all. (Unless you are doing a forecasting exercise.)

    Your R-squared is telling you the joint significance of all regressors, including the time dummies. Time dummies are highly significant as you can see.

    R-squared has its standard meaning only in OLS regression.

    Overall I would not worry about the R-squared in your case.





    • #3
      If two variables are highly correlated, and either one will explain the variation in the dependent variable, then the R-squared will be high even though neither achieves significance when the other is also present in the regression. When both are present, a change in the value of one coefficient can be compensated for by a change in the coefficient of the other without much sacrifice of fit. Therefore the standard error on each of those variables is large.
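To make the mechanism described above concrete, here is a minimal simulation sketch. It is not based on the poster's data: the data-generating process, the sample size, the seed, and the helper names `ols_two_predictors` and `simulate` are all assumptions made for illustration. It uses plain OLS with two predictors rather than a fixed-effects model, and a normal-approximation cutoff of 1.96 for significance.

```python
import math
import random


def ols_two_predictors(y, x1, x2):
    """OLS of y on an intercept, x1 and x2 (hypothetical helper).

    Returns (R-squared, t-stat of x1, t-stat of x2, std. error of x1).
    """
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Demeaning all variables absorbs the intercept.
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, yc))
    s2y = sum(u * v for u, v in zip(b, yc))
    det = s11 * s22 - s12 ** 2  # shrinks toward 0 as x1 and x2 become collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((yi - b1 * ai - b2 * bi) ** 2 for yi, ai, bi in zip(yc, a, b))
    tss = sum(v * v for v in yc)
    sigma2 = rss / (n - 3)  # three estimated parameters
    se1 = math.sqrt(sigma2 * s22 / det)
    se2 = math.sqrt(sigma2 * s11 / det)
    return 1 - rss / tss, b1 / se1, b2 / se2, se1


def simulate(noise_sd, n=200, reps=200, seed=12345):
    """Average R-squared, average SE of x1, and the share of samples in which
    neither predictor is individually significant (|t| < 1.96).

    Assumed setup: x2 = x1 + noise, so a smaller noise_sd means stronger collinearity.
    """
    rng = random.Random(seed)
    r2_sum = se_sum = 0.0
    neither = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [v + noise_sd * rng.gauss(0, 1) for v in x1]
        y = [u + v + rng.gauss(0, 1) for u, v in zip(x1, x2)]
        r2, t1, t2, se1 = ols_two_predictors(y, x1, x2)
        r2_sum += r2
        se_sum += se1
        if abs(t1) < 1.96 and abs(t2) < 1.96:
            neither += 1
    return r2_sum / reps, se_sum / reps, neither / reps


collinear = simulate(noise_sd=0.01)  # x1 and x2 almost identical
distinct = simulate(noise_sd=1.0)    # x1 and x2 only moderately correlated
print("near-collinear: mean R2, mean SE(b1), share neither significant:", collinear)
print("distinct:       mean R2, mean SE(b1), share neither significant:", distinct)
```

In both settings the predictors jointly explain most of the variation, but only in the near-collinear setting do the individual standard errors blow up. A joint test of the two coefficients would still reject in that case, which is the pattern described in this reply.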



      • #4
        Teodora:
        as an aside to the previous excellent replies, under -fe- the only R-sq to look at is the within one.
        In your case it is apparently high, whereas the significance of your time-unrelated predictors is not.
        The most likely diagnosis is an underlying quasi-extreme multicollinearity: whether or not this is a problem depends on the predictors you're most interested in investigating.
        Chapter 23 of https://www.hup.harvard.edu/catalog....=9780674175440 covers the multicollinearity issue in a skillful and humorous way.
        Finally, I would check your model for possible misspecification.
        Kind regards,
        Carlo
        (Stata 19.0)
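One common way to gauge the kind of multicollinearity suspected above is a variance inflation factor (VIF). This diagnostic is not mentioned in the reply itself; it is offered here only as a rough sketch. The helper name `pairwise_vif` and the toy data are hypothetical, and the function covers only the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the correlation between the two predictors. For a fixed-effects model, the predictors would arguably need to be demeaned within panels first.

```python
def pairwise_vif(x1, x2):
    """VIF for a model with exactly two predictors (hypothetical helper).

    With two predictors, the R-squared from regressing one on the other
    equals their squared correlation, so VIF = 1 / (1 - r^2).
    """
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


# Toy data, made up purely for illustration.
vif_correlated = pairwise_vif([1, 2, 3, 4, 5], [2, 1, 4, 3, 6])
vif_orthogonal = pairwise_vif([-1, 0, 1], [1, -2, 1])
print(vif_correlated, vif_orthogonal)
```

A VIF of 1 means the predictor's standard error is not inflated by the other predictor at all; larger values mean the sampling variance of its coefficient is inflated by that factor.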

