  • Different results when using reg/reghdfe compared to xtreg

    Dear Statalisters,
    I'm running a model with fixed effects using reg, reghdfe and xtreg. reg and reghdfe give me almost identical results, except that the results are less significant in reghdfe. But when I use xtreg, the results are dramatically different. One of the coefficients even has a different sign, which looks problematic. The code I use to produce the results is presented below.
    For reg I have:
    Code:
    reg hours_worked $control i.year i.industry1 i.level1, vce(robust)
    Similarly, for reghdfe I have:
    Code:
    reghdfe hours_worked $control , vce(cluster statefips) abs(year industry1 level1)
    For xtreg I have:
    Code:
    xtreg hours_worked $control i.year i.industry1 i.level1, fe vce(robust)
    For reg the results are (the coefficients given by reghdfe are identical):
                             Coef.     Std. Err.
    Local wage               0.354     0.008
    Wage difference ratio    0.547     0.003
    Unemployment             0.029     0.002
    CPI                      0.589     0.026
    For xtreg the results are:
                             Coef.     Std. Err.
    Local wage              -0.268     0.065
    Wage difference ratio    0.046     0.015
    Unemployment            -0.072     0.005
    CPI                      3.99      0.208
    Could someone give me a hint as to why the two different methods give completely different results?
    Thank you all so much

  • #2
    xtreg requires you to declare a panel identifier, and estimation after specifying the -fe- option includes those panel fixed effects.



    • #3
      Originally posted by Andrew Musau
      xtreg requires you to declare a panel identifier, and estimation after specifying the -fe- option includes those panel fixed effects.
      Hi Andrew, I didn't include that part but the panel identification is identical throughout the different regression approaches that I've used.



      • #4
        You declare data to be panel using xtset


        [XT] xtset -- Declare data to be panel data


        Syntax

        Declare data to be panel

        xtset panelvar

        xtset panelvar timevar [, tsoptions]

        Therefore, what I mean in #2 is

        Code:
        webuse grunfeld
        xtset company
        xtreg invest kstock, fe

        already includes the company fixed effects as I have declared "company" as the panel identifier. With regress (including indicators), I would estimate the model as

        Code:
        regress invest kstock i.company
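
        As a quick check of the equivalence described in this post, the two estimations can be run back to back and the slope on kstock compared. This is only a sketch on the same grunfeld example data; the scalar names b_xtreg and b_reg and the tolerance are illustrative choices, not part of the original post.

        ```stata
        * Sketch: compare the within (fe) estimator with the indicator-variable regression
        webuse grunfeld, clear
        xtset company

        * Fixed-effects estimator: company effects are removed by the within transformation
        xtreg invest kstock, fe
        scalar b_xtreg = _b[kstock]

        * Same model with explicit company indicators
        regress invest kstock i.company
        scalar b_reg = _b[kstock]

        * The two slopes should agree up to numerical precision
        display "xtreg, fe: " b_xtreg "    regress i.company: " b_reg
        assert reldif(b_xtreg, b_reg) < 1e-6
        ```

        If the assert passes silently, the two commands have produced the same coefficient, which is the point being made: -fe- adds the panel fixed effects even though they never appear in the variable list.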

        Last edited by Andrew Musau; 17 Aug 2021, 09:18.



        • #5
          @Andrew Musau
          Thanks for the comment. As I'm working with individual data, User ID is my panel identifier and does not appear in any of my fe specifications. Do you think this could be the problem? (not including User ID in the specifications) My study is about min wage and how it affects the number of hours that individuals work.



          • #6
            My study is about min wage and how it affects the number of hours that individuals work.
            Yes, as your unit of analysis is the individual and you have panel data, you should take into account individual fixed effects.


            User ID is my panel identifier and does not appear in any of my fe specifications.
            It does in your xtreg estimation as you have xtset your data. The -vce(robust)- in xtreg is equivalent to -cluster(User_ID)-, so the estimation in reghdfe (SSC) should be


            Code:
            reghdfe hours_worked $control , vce(cluster User_ID) abs(year industry1 level1 User_ID)

            Some of the fixed effects may be redundant, e.g., if individuals are nested within industries - but it does not matter that you include both variables within -absorb()-.
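
            The same correspondence can be tried on a public dataset. The following is a sketch only, using the grunfeld example data from #4 rather than the poster's data; adding year effects is an assumption made to mirror the poster's specification, and reghdfe must be installed from SSC first.

            ```stata
            * Sketch: xtreg, fe and reghdfe with the panel variable absorbed
            * Assumes reghdfe is installed: ssc install reghdfe (it also needs ftools)
            webuse grunfeld, clear
            xtset company year

            * Panel (company) effects come from -fe-; year effects enter as indicators
            xtreg invest kstock i.year, fe vce(robust)
            scalar b_xtreg = _b[kstock]

            * Both sets of effects absorbed; clustering on the panel variable
            reghdfe invest kstock, absorb(company year) vce(cluster company)
            scalar b_hdfe = _b[kstock]

            display "xtreg, fe: " b_xtreg "    reghdfe: " b_hdfe
            ```

            The coefficients on kstock should coincide. The standard errors should be close, although small-sample adjustments may differ slightly between the two commands.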



            • #7
              @Andrew Musau
              Again thank you so much for all the comments. They have been extremely helpful.

              Now that I've updated my code for reghdfe, it gives me results identical to those from xtreg.
              Like you said, in my previous reg regression I didn't include individual fixed effects. I've tried to include i.user_id (code is given below), but now Stata is saying that maxvar is too small. I've tried setting maxvar to 120,000, but the same problem persists.
              Code:
              reg hours_worked $control i.year i.industry1 i.user_id, vce(robust)
              Given the nature of my study, do you think it's safe to disregard reg and just focus on xtreg and reghdfe? Is there a way to bypass the maxvar issue?



              • #8
                With all those indicators, you are asking regress to invert a very large matrix and it is telling you that it has no scope to do that. However, regress does allow areg syntax, so you can do

                Code:
                reg hours_worked $control i.year i.industry1 , absorb(user_id) cluster(user_id)
                Two points:

                1. The command does not really matter: once the individual fixed effects are included, all of them estimate the same coefficients. You have panel data, so you should prefer xtreg and reghdfe over regress.
                2. If using regress or reghdfe, the option -robust- computes Huber-White standard errors. These are not appropriate with panel data, so you need to specifically cluster on the panel variable. xtreg gets the clustering correct with the option -robust-.
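
                To see the second point in practice, the documented areg command can be run with both variance options and compared with xtreg. This is a sketch on the grunfeld example data from #4, not the poster's data; the year indicators are an assumption added to resemble the specification in this thread.

                ```stata
                * Sketch: robust versus cluster-robust standard errors with absorbed panel effects
                webuse grunfeld, clear
                xtset company year

                * Huber-White standard errors: ignore within-company correlation
                areg invest kstock i.year, absorb(company) vce(robust)

                * Cluster-robust standard errors on the panel variable
                areg invest kstock i.year, absorb(company) vce(cluster company)

                * xtreg, fe: vce(robust) already clusters on the panel variable
                xtreg invest kstock i.year, fe vce(robust)
                ```

                The coefficient on kstock is the same in all three runs; only the standard errors change. areg and xtreg apply different degrees-of-freedom adjustments with clustered errors, so their clustered standard errors need not match exactly.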



                • #9
                  @Andrew Musau Thank you for the comments. I think all my problems are solved.
