  • reghdfe : how can I read an omitted variable's estimated coefficient from estimated constant?

    I'm considering the following linear model:
    yijt = a * shareijt + b * (1-shareijt) + FE_Ii + FE_Jj + eijt



    which is rewritten as:

    yijt = b + (a-b) * shareijt + FE_Ii + FE_Jj + eijt



    I'm interested in estimating the coefficients a and b.

    To control for the two fixed effects, I use the reghdfe package (version 5.7.3 13nov2019), but I'm having trouble recovering b from the constant estimated by the following code:

    Code:
    reghdfe y share, absorb(FE_I FE_J) noconstant
    If I knew which i and j serve as the baseline, I think adding their fixed effects to the constant would give me b.

    Unfortunately, I couldn't tell from the estimation results which i and j the package picks.

    I would really appreciate any thoughts on solving this issue.



    - example data :


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(rep78 FE_I) byte FE_J double(share one_minus_share)
    3 0 0 14.512820512820513 -13.512820512820513
    2 0 0 17.475728155339805 -16.475728155339805
    3 0 0 19.054726368159205 -18.054726368159205
    3 0 0   20.7725321888412   -19.7725321888412
    3 0 0  17.02020202020202  -16.02020202020202
    4 0 0 19.032258064516128 -18.032258064516128
    5 1 0 13.006134969325153 -12.006134969325153
    4 0 0 18.454545454545453 -17.454545454545453
    3 1 0  12.94478527607362  -11.94478527607362
    3 0 0 17.547169811320753 -16.547169811320753
    3 0 0 16.581632653061224 -15.581632653061224
    3 0 0 13.333333333333334 -12.333333333333334
    3 0 0  20.52173913043478  -19.52173913043478
    3 0 0 19.563106796116504 -18.563106796116504
    2 0 0 15.363128491620111 -14.363128491620111
    3 0 0 21.029411764705884 -20.029411764705884
    4 0 0 15.266272189349113 -14.266272189349113
    3 0 0 15.970149253731343 -14.970149253731343
    3 1 0 19.176470588235293 -18.176470588235293
    2 0 0               16.1 -15.100000000000001
    3 0 0 16.834862385321102 -15.834862385321102
    3 0 0  15.75268817204301  -14.75268817204301
    2 0 0  19.11764705882353  -18.11764705882353
    3 0 0 19.592760180995477 -18.592760180995477
    3 0 1  11.89655172413793  -10.89655172413793
    4 0 1 13.411764705882353 -12.411764705882353
    4 0 1  12.55813953488372  -11.55813953488372
    4 0 1 12.756410256410257 -11.756410256410257
    5 1 1 13.161290322580646 -12.161290322580646
    5 0 1 14.973544973544973 -13.973544973544973
    5 0 1 15.257142857142858 -14.257142857142858
    5 0 1 13.023255813953488 -12.023255813953488
    5 1 1               12.5               -11.5
    4 1 1 11.812080536912752 -10.812080536912752
    4 0 1 13.941176470588236 -12.941176470588236
    end
    label values FE_J origin
    label def origin 0 "Domestic", modify
    label def origin 1 "Foreign", modify

    - result 1 :
    Code:
    reghdfe rep78 share, absorb(FE_I FE_J) noconstant
    Code:
    HDFE Linear regression                            Number of obs   =         35
    Absorbing 2 HDFE groups                           F(   1,     31) =       0.13
                                                      Prob > F        =     0.7257
                                                      R-squared       =     0.5177
                                                      Adj R-squared   =     0.4710
                                                      Within R-sq.    =     0.0040
                                                      Root MSE        =     0.6681
    
    ------------------------------------------------------------------------------
           rep78 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           share |   .0205182   .0579482     0.35   0.726     -.097668    .1387044
           _cons |   3.129253   .9328981     3.35   0.002     1.226595    5.031911
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
            FE_I |         2           0           2     |
            FE_J |         2           1           1     |
    -----------------------------------------------------+




    - result 2 :

    Code:
    reghdfe rep78 share one_minus_share, absorb(FE_I FE_J) noconstant
    Code:
    HDFE Linear regression                            Number of obs   =         35
    Absorbing 2 HDFE groups                           F(   1,     31) =       0.13
                                                      Prob > F        =     0.7257
                                                      R-squared       =     0.5177
                                                      Adj R-squared   =     0.4710
                                                      Within R-sq.    =     0.0040
                                                      Root MSE        =     0.6681
    
    ---------------------------------------------------------------------------------
              rep78 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ----------------+----------------------------------------------------------------
              share |   .0205182   .0579482     0.35   0.726     -.097668    .1387044
    one_minus_share |          0  (omitted)
    ---------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
            FE_I |         2           0           2     |
            FE_J |         2           1           1     |
    -----------------------------------------------------+

  • #2
    As you have seen, you cannot include both share and a variable equal to 1-share in the regression because they are collinear, so one of them is omitted.

    You can reconstruct a and b from your -reghdfe- results easily. b is just the constant term, and you can read it directly from the reghdfe output as 3.129253. The share coefficient equals a-b, so a = share coefficient + b. You can have Stata do the arithmetic for you with:
    Code:
    lincom _cons + share
    This will show you the value of a, along with its standard error, confidence interval, and test statistics.


    • #3
      Thank you very much for the response.

      I think my question was a bit vague, so here is a simpler example.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int y byte category double(share one_minus_share)
      3 0 14.512820512820513 -13.512820512820513
      2 0 17.475728155339805 -16.475728155339805
      3 0 19.054726368159205 -18.054726368159205
      3 0   20.7725321888412   -19.7725321888412
      3 0  17.02020202020202  -16.02020202020202
      4 0 19.032258064516128 -18.032258064516128
      5 0 13.006134969325153 -12.006134969325153
      4 0 18.454545454545453 -17.454545454545453
      3 0  12.94478527607362  -11.94478527607362
      3 0 17.547169811320753 -16.547169811320753
      3 0 16.581632653061224 -15.581632653061224
      3 0 13.333333333333334 -12.333333333333334
      3 0  20.52173913043478  -19.52173913043478
      3 0 19.563106796116504 -18.563106796116504
      2 0 15.363128491620111 -14.363128491620111
      3 0 21.029411764705884 -20.029411764705884
      4 0 15.266272189349113 -14.266272189349113
      3 0 15.970149253731343 -14.970149253731343
      3 0 19.176470588235293 -18.176470588235293
      2 0               16.1 -15.100000000000001
      3 0 16.834862385321102 -15.834862385321102
      3 0  15.75268817204301  -14.75268817204301
      2 0  19.11764705882353  -18.11764705882353
      3 0 19.592760180995477 -18.592760180995477
      3 1  11.89655172413793  -10.89655172413793
      4 1 13.411764705882353 -12.411764705882353
      4 1  12.55813953488372  -11.55813953488372
      4 1 12.756410256410257 -11.756410256410257
      5 1 13.161290322580646 -12.161290322580646
      5 1 14.973544973544973 -13.973544973544973
      5 1 15.257142857142858 -14.257142857142858
      5 1 13.023255813953488 -12.023255813953488
      5 1               12.5               -11.5
      4 1 11.812080536912752 -10.812080536912752
      4 1 13.941176470588236 -12.941176470588236
      end
      label values category origin
      label def origin 0 "Domestic", modify
      label def origin 1 "Foreign", modify
      Here is the first regression, which I believe is correct:

      Code:
      reg y share one_minus_share i.category, noconstant
      
            Source |       SS           df       MS      Number of obs   =        35
      -------------+----------------------------------   F(3, 32)        =    297.61
             Model |  431.533504         3  143.844501   Prob > F        =    0.0000
          Residual |  15.4664958        32  .483327994   R-squared       =    0.9654
      -------------+----------------------------------   Adj R-squared   =    0.9622
             Total |         447        35  12.7714286   Root MSE        =    .69522
      
      ---------------------------------------------------------------------------------
                    y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                share |    3.29882   .9365876     3.52   0.001     1.391054    5.206587
      one_minus_share |   3.314644   .9929345     3.34   0.002     1.292102    5.337185
                      |
             category |
             Foreign  |   1.257999   .3422225     3.68   0.001     .5609143    1.955083
      ---------------------------------------------------------------------------------

      The following is what you recommended, which is also correct :

      Code:
       reg y share i.category 
      
            Source |       SS           df       MS      Number of obs   =        35
      -------------+----------------------------------   F(2, 32)        =     13.68
             Model |  13.2192185         2  6.60960924   Prob > F        =    0.0001
          Residual |  15.4664958        32  .483327994   R-squared       =    0.4608
      -------------+----------------------------------   Adj R-squared   =    0.4271
             Total |  28.6857143        34  .843697479   Root MSE        =    .69522
      
      ------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             share |  -.0158238   .0569672    -0.28   0.783    -.1318622    .1002145
                   |
          category |
          Foreign  |   1.257999   .3422225     3.68   0.001     .5609143    1.955083
             _cons |   3.314644   .9929345     3.34   0.002     1.292102    5.337185
      ------------------------------------------------------------------------------
      
      . lincom _cons+share
      
       ( 1)  share + _cons = 0
      
      ------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               (1) |    3.29882   .9365876     3.52   0.001     1.391054    5.206587
      ------------------------------------------------------------------------------

      However, if I do the same exercise with reghdfe :

      Code:
      
      . reghdfe y share, absorb(category, savefe) 
      (MWFE estimator converged in 1 iterations)
      
      HDFE Linear regression                            Number of obs   =         35
      Absorbing 1 HDFE group                            F(   1,     32) =       0.08
                                                        Prob > F        =     0.7830
                                                        R-squared       =     0.4608
                                                        Adj R-squared   =     0.4271
                                                        Within R-sq.    =     0.0024
                                                        Root MSE        =     0.6952
      
      ------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             share |  -.0158238   .0569672    -0.28   0.783    -.1318622    .1002145
             _cons |   3.710015   .9179141     4.04   0.000     1.840285    5.579745
      ------------------------------------------------------------------------------
      
      Absorbed degrees of freedom:
      -----------------------------------------------------+
       Absorbed FE | Categories  - Redundant  = Num. Coefs |
      -------------+---------------------------------------|
          category |         2           0           2     |
      -----------------------------------------------------+
      
      . tab __hdfe1__
      
             [FE] |
       1.category |      Freq.     Percent        Cum.
      ------------+-----------------------------------
         -.395371 |         24       68.57       68.57
         .8626276 |         11       31.43      100.00
      ------------+-----------------------------------
            Total |         35      100.00
      
      
      . lincom _cons + share
      
       ( 1)  share + _cons = 0
      
      ------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               (1) |   3.694191   .8614465     4.29   0.000     1.939482      5.4489
      ------------------------------------------------------------------------------

      I think _cons captures the mean of the fixed effects in addition to the b mentioned above.

      If I add the fixed effect of the first category (-.395371) to the constant (3.710015), I correctly recover the coefficient on one_minus_share (3.314644).

      If I then add the estimated coefficient on share (-.0158238), I correctly recover the share coefficient (3.2988202, matching the 3.29882 reported in the first regression).

      This all makes sense, but I had to use one of the estimated fixed effects, and that remains true with the noconstant option.
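
      As a minimal sketch of the arithmetic just described, assuming the example data above are in memory and that Domestic (category == 0) is the cell chosen as the reference, Stata can do the sums itself. The choice of reference cell is an assumption made here, not something reghdfe reports:

      Code:
      * Drop any fixed effects saved by an earlier run, then re-estimate
      capture drop __hdfe*
      reghdfe y share, absorb(category, savefe)
      
      * Estimated fixed effect of the (assumed) reference cell, Domestic
      summarize __hdfe1__ if category == 0, meanonly
      local fe_ref = r(mean)
      
      * b = _cons + reference fixed effect;  a = b + coefficient on share
      display "b = " _b[_cons] + `fe_ref'
      display "a = " _b[_cons] + `fe_ref' + _b[share]
      If the results above are reproduced, these lines should display the 3.314644 and 3.2988202 computed by hand, although without standard errors.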


      My original exercise has multiple fixed effects, each with numerous categories, and I do not know which fixed effect(s) to add to the estimated coefficients.




      • #4
        Ah, yes, I see your problem. I don't know the solution here. When you do a fixed effects regression, whether with -reghdfe-, -xtreg, fe-, or -areg-, the constant term and the fixed effects cannot be separately identified. If I add an arbitrary number to the constant term and then subtract that same number from each of the "fixed effects," I have exactly the same model.
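
        A small sketch of this point, assuming the example data from #3 are in memory: switching the omitted base category with the ib#. factor-variable prefix changes the constant but leaves the coefficient on share and the fit unchanged.

        Code:
        * Domestic (0) as the base category
        regress y share ib0.category
        display "_cons with Domestic as base = " _b[_cons]
        
        * Foreign (1) as the base category: same model, different constant
        regress y share ib1.category
        display "_cons with Foreign as base  = " _b[_cons]
        The two constants differ by exactly the Foreign coefficient from the first regression, which is the kind of arbitrary shift described above.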

        Each software program uses some particular constraint to separately identify the constant and the fixed effects. So one would have to understand exactly how that is done to figure out what to do with a situation like yours. In fact, even knowing that, I'm not sure that it is possible to propose a solution. It seems to me that b, _cons, and the fixed effects are all unidentified in this model and that some arbitrary constraint would be needed to identify them.
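
        One arbitrary constraint of this kind is to fix a reference cell. The following is only a sketch under assumptions: the reference cell FE_I == 0 and FE_J == 0 is chosen here purely for illustration, and the variable names are those of the example data in #1, where the outcome is stored as rep78.

        Code:
        * Drop any fixed effects saved by an earlier run, then re-estimate
        capture drop __hdfe*
        reghdfe rep78 share, absorb(FE_I FE_J, savefe)
        
        * Estimated fixed effects of the (assumed) reference levels
        summarize __hdfe1__ if FE_I == 0, meanonly
        local fe_i = r(mean)
        summarize __hdfe2__ if FE_J == 0, meanonly
        local fe_j = r(mean)
        
        * b and a relative to the chosen reference cell
        display "b = " _b[_cons] + `fe_i' + `fe_j'
        display "a = " _b[_cons] + `fe_i' + `fe_j' + _b[share]
        A different reference cell gives a different b, so the numbers are meaningful only relative to that choice.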

        If somebody else sees a way around this, I hope they will chime in here.



        • #5
          Thank you very much Clyde.
