  • Difference between R2 measures with "xtreg"- and with "reg"-command

    Dear Community,

    for my master's thesis I am conducting a difference-in-differences analysis. More specifically, I am investigating whether a policy change at the beginning of 2017 had a treatment effect on my treatment group.

    I am using panel data with a total of 204 firms over a period of 5 years (2015 - 2019). I constructed a balanced sample with a control and treatment group of equal size (each group has 102 firms in it). The years 2015 and 2016 constitute my Pre-Period, while the years 2017-2019 belong to the Post-Period.

    In order to arrive at my DiD estimator I estimated the following regression using a two-way fixed effects model:

    Code:
     xtreg y TreatPost x1 x2 ... xk Yr2-Yr5, fe i(ID)
    • Yr2-Yr5 are dummy variables that capture the year fixed effects
    • The options fe i(ID) request firm fixed effects, where ID is the identifier for my firms
    With these settings I arrive at the following R2 values. I read through other posts on interpreting the three R2 measures. As far as I understand, the within R2 shows how much of the variation in my dependent variable within each unit (each firm, in my case) is explained by my explanatory variables, whereas the between R2 shows how much of the variation between individual firms is explained.
    [Image: Screenshot 2021-07-23 001138.png]

    I was a bit worried about the relatively low R2, which is why I wanted to cross-check the results with a "normal" regression using the following command:

    Code:
     reg y TreatPost x1 x2 ... xk i.year i.ID
    As far as I know, this command does exactly the same thing as the xtreg command above, but does it manually. The i.year term generates four dummy variables controlling for the year fixed effects, while the i.ID term generates a dummy for each firm in my sample, thus controlling for the firm fixed effects.
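
    A minimal sketch of this comparison, assuming Stata's grunfeld example panel (10 firms, fetched with webuse) in place of the thesis data, which are not shown here. The variable names invest, mvalue, kstock and company belong to that example dataset, and the scalar names are only illustrative:

```stata
* Illustration on Stata's example panel, not the poster's data
webuse grunfeld, clear
xtset company year

* Within (fixed-effects) estimator with year dummies
xtreg invest mvalue kstock i.year, fe
scalar b_fe      = _b[mvalue]
scalar r2_within = e(r2_w)

* Same model by hand: one dummy per firm (LSDV)
regress invest mvalue kstock i.year i.company
scalar b_lsdv  = _b[mvalue]
scalar r2_lsdv = e(r2)

* The slope should agree; the R2 values should not
display "b[mvalue]: xtreg = " scalar(b_fe) "   LSDV = " scalar(b_lsdv)
display "R2: within = " scalar(r2_within) "   LSDV = " scalar(r2_lsdv)
```

    The LSDV R2 cannot be smaller than the within R2: both regressions leave the same residual sum of squares, but the LSDV figure divides it by the total variation in y instead of by the within-firm variation only.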

    As expected, both commands give me exactly the same coefficients for my explanatory variables, as well as the same standard errors and significance levels. So far, so good. However, with the second command ("normal" reg) I arrive at a much higher R2.
    [Image: Screenshot 2021-07-23 002429.png]

    Therefore, I am a bit confused about why the R2 in the second regression is so much higher than in the first one. My guess is that the xtreg command only takes into account the explanatory power of the "real" regressors (without the fixed effects), while the reg command also takes into account the explanatory power of the fixed effects. However, I am quite unsure about this interpretation.


    It would be great if someone could enlighten me and tell me the differences between the R2 measures and which one should be reported in my thesis.

    Thank you very much in advance!

  • #2
    Look at the PDF manual entry of xtreg under Assessing goodness of fit for descriptions and calculations of the statistics.

    Code:
    help xtreg
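
    For orientation, xtreg, fe stores the three statistics as e(r2_w), e(r2_b) and e(r2_o). A minimal sketch, again assuming the grunfeld example panel rather than the poster's data: it prints all three and then recomputes the overall R2 as the squared correlation between the outcome and the linear prediction, which is how I read the manual's description. The variable name xb_hat is only illustrative.

```stata
webuse grunfeld, clear
xtset company year
xtreg invest mvalue kstock i.year, fe

* The three goodness-of-fit statistics stored by xtreg, fe
display "within  R2 = " e(r2_w)
display "between R2 = " e(r2_b)
display "overall R2 = " e(r2_o)

* Overall R2 recomputed as corr(y, xb)^2; the firm effects are not part of xb
predict double xb_hat, xb
quietly correlate invest xb_hat
display "corr(invest, xb)^2 = " r(rho)^2
```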


    • #3
      Also, bear in mind that the R2 for the OLS regression includes variance explained by i.ID, whereas none of the R2 statistics coming from -xtreg- do. This is not even remotely an apples to apples comparison.
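
      A minimal sketch of why the two numbers differ, assuming the grunfeld example panel and leaving out the year dummies for brevity. Demeaning every variable by firm and running plain OLS should reproduce the within R2, while areg (or reg with i.company) reports an R2 that also credits the firm effects. The dm_ and m_ prefixes and the scalar names are only illustrative.

```stata
webuse grunfeld, clear
xtset company year

* Subtract firm means from the outcome and the regressors
foreach v of varlist invest mvalue kstock {
    bysort company: egen double m_`v' = mean(`v')
    generate double dm_`v' = `v' - m_`v'
}

* OLS on the demeaned data: this R2 ignores the firm effects
regress dm_invest dm_mvalue dm_kstock
scalar r2_demeaned = e(r2)

xtreg invest mvalue kstock, fe
scalar r2_within = e(r2_w)

* areg absorbs the firm dummies but still counts them in its R2
areg invest mvalue kstock, absorb(company)
scalar r2_areg = e(r2)

display "demeaned OLS R2 = " scalar(r2_demeaned)
display "xtreg within R2 = " scalar(r2_within)
display "areg R2         = " scalar(r2_areg)
```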


      • #4
        Apart from the useful suggestions by Andrew and Clyde:

        1. A great econometrician (probably Professor Jeff Wooldridge, or if not, the late Professor Goldberger) wrote that "The only interesting thing about the R-squared is that there is nothing interesting about the R-squared." (I am citing from memory and have not been able to locate the original source. Therefore I copied Professor Wooldridge in, so that he can correct me if I am misrepresenting his views on the matter.)

        2. The R-squared has its usual meaning as a decomposition of variance only in standard OLS regression, so the R-squareds that we are fabricating in IV and fixed-effects regressions do not have the same meaning.

        3. If you do not want to go too deep into the formulas, a simple way to interpret your R-squared correctly is not to try to give it an absolute meaning, but simply to compare it with the R-squared reported in a published study that uses the same design as yours.


        • #5
          Thank you very much for the quick and useful advice.

          I will have to think again about the exact meaning of each of the R2 measures (and about which one to report in my thesis), but I already have a feeling for where the differences lie.

          Thanks a lot again!
