  • equality of coefficients test with reghdfe: subsample analysis or interaction terms?

    I am trying to test the impact that margin trading status has on stock liquidity. I regress a liquidity measure on a dummy variable (mt_csmar), which indicates whether a stock is margin eligible or not, and on some other control variables. I control for firm fixed effects and time effects by running the regression with reghdfe. This works fine.

    But I would also like to see whether the coefficient on the mt_csmar dummy differs in bull or bear markets compared with normal periods. I could run the same specification on subsamples, such as bull periods, bear periods, and normal periods. However, it seems that, with reghdfe, I cannot test whether the coefficients on mt_csmar are statistically different across subsamples. Hence I decided to use interaction terms, interacting bull and bear dummies with mt_csmar.

    I report the results of the subsample analysis and of the specification with interaction terms below. In the subsample analysis, the coefficient on mt_csmar is positive in bear markets and negative in both bull and normal periods, and it is larger in absolute terms during normal periods. With interaction terms, however, the results are very different: the coefficients on mt_csmar are more negative during bull and bear markets than during normal periods.

    My questions are:
    1) Which set of results should I trust? Why does the interaction-term analysis produce different conclusions from the subsample analysis?
    2) If I should trust the subsample results, how do I test the equality of coefficients after running reghdfe?
    3) The bull and bear dummies are essentially collinear with the time effects, so they were dropped in the reghdfe regressions that include time fixed effects. Is this a problem? Should I give up controlling for time fixed effects and use xtreg instead?

    Thank you very much for any comments you have. I have been going back and forth on these points for a while.





    The following is the bull market subsample:

    [Image: 1.png, screenshot of Stata output]

    The following is the bear market subsample:

    [Image: 1.png, screenshot of Stata output]

    The following is the normal periods subsample:

    [Image: 1.png, screenshot of Stata output]

    The following is the specification with interaction terms:

    [Image: 1.png, screenshot of Stata output]

  • #2
    Please read the Forum FAQ for advice about how to show Stata output in ways that are helpful. Your screenshots here are barely readable, as is often the case, which is why people are specifically asked not to use them.

    That said, your interaction model is done wrong, so you should not attempt to interpret its results or compare them to anything else. You should not be using separate variables for bull and bear markets. You should have a single variable that takes on three values, for example, 0 = normal times, 1 = bear market, 2 = bull market. Let's call this variable condition. Then your regression command should be
    Code:
    reghdfe outcome_variable i.condition##c.mt_csmar other_variables, absorb(stkcode date_n) cluster(stkcode date_n)
    And then after that you should do:

    Code:
    margins condition, dydx(mt_csmar)
    to get the marginal effects of mt_csmar under each of the three conditions.

    As for contrasting those marginal effects, you can do that by testing the interaction coefficients from the -reghdfe- equation:
    Code:
    test 1.condition#c.mt_csmar 2.condition#c.mt_csmar
    for an omnibus test of the null hypothesis that the effect of mt_csmar is the same in all three conditions. If you want to specifically contrast, say bull vs bear, then that would be
    Code:
    test 1.condition#c.mt_csmar = 2.condition#c.mt_csmar
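
    For anyone unsure how to build such a variable, here is a minimal sketch. It assumes two existing 0/1 indicators, hypothetically named bull and bear, that are never both 1 for the same observation. The toy data and the label name condition_lbl are made up for illustration and are not taken from the original poster's dataset.

```stata
* Toy data standing in for the real panel (assumption: bull and bear are
* 0/1 indicators and are never both 1 on the same observation)
clear
input bull bear
0 0
1 0
0 1
0 0
end

* Sanity check: the two dummies must be mutually exclusive
assert !(bull == 1 & bear == 1)

* One three-level factor: 0 = normal times, 1 = bear market, 2 = bull market
generate byte condition = 0
replace condition = 1 if bear == 1
replace condition = 2 if bull == 1

* Value labels make the regression and -margins- output easier to read
label define condition_lbl 0 "normal" 1 "bear" 2 "bull"
label values condition condition_lbl

tabulate condition
```

    With the real data, only the generate/replace/label lines would be needed; the coding 0/1/2 follows the example given in the post above.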


    • #3
      Thank you Clyde for your suggestions. To post the Stata output, I copied the tables as pictures and saved them as PNG files, since PNG files seem to be encouraged. I then used the image icon to upload these pictures to the server. I guess I tried to put everything in one big picture, which makes it harder to read. I will try to post the code lines and results separately this time.

      I have tried what you suggested: I created another variable, conditions, which equals 0, 1, and 2 for normal, bear, and bull markets, and ran reghdfe with the interaction of conditions and my dummy variable mt_csmar.

      First, conditions is still automatically dropped due to collinearity with the time fixed effects (the date_n variable). Second, when I try to get the marginal effect of mt_csmar, nothing can be estimated. I have posted the results below.

      In addition, I do not understand why I cannot create two separate dummy variables and use the coefficients on the interaction terms between these two dummies and mt_csmar to judge whether mt_csmar's impact on the outcome variable is stronger or weaker than in normal periods. Could you please explain further? Thank you.

      Code:

      reghdfe ln_efsnew i.conditions##c.mt_csmar control-variables , absorb(stkcd date_n) cluster(stkcd date_n)

      Result:
      [Image: 1.png, screenshot of Stata output]

      Then I do: margins conditions, dydx(mt_csmar)
      Here is the result:

      [Image: 1.png, screenshot of Stata output]

      I am not sure what went wrong.



      • #4
        I noticed that using conditions produces the same coefficients as using two dummies (bull and bear), so I guess both methods are fine. This brings me back to my earlier question: why does the subsample analysis lead to different conclusions from the analysis using interaction terms? Does anyone have any ideas?


        • #5
          The best way to show Stata output is to copy it from your Results window or log file and paste it directly into the Forum editor, surrounded by code delimiters. If you are not familiar with code delimiters, read Forum FAQ #12. If you do that, there will be no readability issues. Yes, .png files are better than other images, but they are still not as good: your results are, again, barely readable, though the margins output is easily read. But with copy/paste and code delimiters there will never be a problem.

          In this particular model, there is no problem from using separate bull and bear indicators instead of a three-level one. But in more complicated models, it could make a difference. My focus on this is to have you code the model in a way that supports the use of -margins-. With only the complexity inherent in your model, -margins- will get it right with bull and bear separate. But in more complicated models, -margins- will get it wrong. Rather than having to think about every model you run to figure out whether -margins- will work properly with it, it is better to just get into the habit of doing it the right way every time, even if that isn't necessary in some particular case.

          Your original model, however, would still be incorrect as originally coded because you used c.bull and c.bear, whereas you must use i.bull and i.bear. -margins- uses a different way of calculating the marginal effects for continuous and discrete variables, so it is important to get this right.

          I'm sorry about the non-estimability. I forgot to add that you need the -noestimcheck- option in this command. In fixed effects models, many of the parameters one might try to estimate with -margins- are in fact not identifiable, and -margins- tells you so with the (not estimable) result. But -margins- overdoes this. The marginal effects are, in fact, identifiable. The -noestimcheck- option allows you to override -margins-' difficulties here. (But do not do this with abandon: here you really need to know which parameters are identifiable and which are not. Anything that is a function of the fixed effects themselves is non-identifiable.)

          The omission of 1.conditions and 2.conditions is due to their collinearity with the date_n fixed effects and is expected; it is not a problem. In fact, if this did not happen, it would be an indication that there is something wrong in the data!
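
          To see this behavior in isolation, here is a small sketch that uses Stata's shipped auto dataset and -areg- rather than the poster's data and -reghdfe-. The variable highrep is hypothetical: it is constant within each level of the absorbed variable rep78, just as a market-condition indicator is constant within each date.

```stata
sysuse auto, clear

* highrep depends on rep78 only, so it cannot vary within rep78 groups
generate byte highrep = rep78 >= 4 if !missing(rep78)

* Absorbing rep78 wipes out the main effect of highrep (reported as
* omitted), but the interaction with mpg is still estimated because
* mpg varies within rep78 groups
areg price i.highrep##c.mpg weight, absorb(rep78)

* Slope of mpg in each highrep group; noestimcheck tells -margins- to
* skip its estimability check
margins highrep, dydx(mpg) noestimcheck

* Test whether the mpg slope differs between the two groups
test 1.highrep#c.mpg
```

          The omitted main effect does not prevent the interaction coefficient, or the group-specific slopes built from it, from being estimated.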

          As for the difference between what you are getting from the interaction approach and the separate samples, this arises from the other variables in the model. When you do the interaction model, you are constraining the coefficients of all the other variables to be independent of bull/bear/normal. When you do the separate samples, you get separate estimates of those other variables' coefficients in each sample. Since those other variables are, themselves, correlated with the outcome or with bull/bear/normal, this results in changes in the bull/bear/normal effects. I haven't carefully reviewed all of the coefficients in the three separate samples output you show (and, in fact, I can't read some of them at all), but even just a casual review shows that there are some very substantial differences in these other coefficients across those three models.

          This suggests that the implicit constraint of equality of coefficients imposed by the interaction model is not suitable for this data. You can relax this constraint by adding to your interaction model more terms that provide for interaction between bull/bear/normal and the other variables:

          Code:
          reghdfe ln_efsnew i.conditions##(c.mt_csmar control-variables) , absorb(stkcd date_n) cluster(stkcd date_n)
          will do that and will give you results that are the same as you got from your separate samples (with perhaps some small differences due to numerical issues). And then you can do
          Code:
          margins conditions, dydx(mt_csmar) noestimcheck // MARGINAL EFFECT OF mt_csmar
          test 1.conditions#c.mt_csmar 2.conditions#c.mt_csmar // OMNIBUS TEST OF INTERACTION
          test 1.conditions#c.mt_csmar = 2.conditions#c.mt_csmar // TEST OF bull = bear
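
          The claim that a fully interacted model reproduces the separate-sample coefficients can be checked on a toy example. The sketch below uses Stata's shipped auto dataset with -regress- and -areg-, not the poster's data or -reghdfe-. Here foreign stands in for conditions, and the scalar names and grp_rep are made up for illustration. Part (3) rests on an additional point worth stating: when fixed effects are absorbed, the match is exact only if those fixed effects are also allowed to differ by group.

```stata
sysuse auto, clear

* (1) Separate samples: slope of mpg within each group
regress price mpg weight if foreign == 0
scalar b_dom = _b[mpg]
regress price mpg weight if foreign == 1
scalar b_for = _b[mpg]

* (2) Fully interacted model: every coefficient may differ by group
regress price i.foreign##(c.mpg c.weight)
scalar b_dom_int = _b[mpg]
scalar b_for_int = _b[mpg] + _b[1.foreign#c.mpg]

* Each pair should agree up to numerical precision
display b_dom, b_dom_int
display b_for, b_for_int

* Group-specific slopes and a test of their equality
margins foreign, dydx(mpg)
test 1.foreign#c.mpg

* (3) With an absorbed fixed effect (rep78 here), exact agreement also
*     requires the fixed effect to vary by group, so absorb the
*     foreign-by-rep78 cells rather than rep78 alone
egen long grp_rep = group(foreign rep78)
areg price i.foreign##(c.mpg c.weight), absorb(grp_rep)
scalar b_for_fe_int = _b[mpg] + _b[1.foreign#c.mpg]
areg price mpg weight if foreign == 1, absorb(rep78)
scalar b_for_fe_sub = _b[mpg]
display b_for_fe_int, b_for_fe_sub
```

          With -reghdfe- the analogous step would presumably be to absorb the fixed effects interacted with the grouping variable, for example absorb(stkcd#conditions date_n#conditions). This has not been run on the data in this thread, so treat it as something to try rather than a verified fix.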


          • #6
            This is very helpful. I have tried the code above, and indeed the interaction model now produces conclusions similar to those from the separate samples. I am really grateful for the help!


            • #7
              Hi, Statalist

              I have run into a similar problem again while trying a different specification. This time, even allowing the coefficients on the other control variables to differ across groups of stocks still produces results in the interaction model that are inconsistent with those from the subsamples.
              The following are my code and results for the interaction model:

              Code:
              . reghdfe ln_efsnew i.IO_3group##(c.finance_turnover c.short_turnover c.lag_ln_efsnew c.ln_firmsize c.ln_volatility_20 c.ln_volume c.ln_price c.
              > return c.ln_ownerratio_csmar i.HS300 i.PL ), absorb(stkcd date_n) cluster(stkcd date_n)
              (MWFE estimator converged in 8 iterations)
              Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
              note: 2.IO_3group#1.HS300 omitted because of collinearity
              
              HDFE Linear regression                            Number of obs   =    734,090
              Absorbing 2 HDFE groups                           F(  34,    960) =     262.23
              Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                R-squared       =     0.7072
                                                                Adj R-squared   =     0.7062
              Number of clusters (stkcd)   =        961         Within R-sq.    =     0.2879
              Number of clusters (date_n)  =      1,634         Root MSE        =     0.2543
              
                                                          (Std. Err. adjusted for 961 clusters in stkcd date_n)
              -------------------------------------------------------------------------------------------------
                                              |               Robust
                                    ln_efsnew |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------------------------+----------------------------------------------------------------
                                    IO_3group |
                                           1  |   3.905128   .9999033     3.91   0.000      1.94288    5.867376
                                           2  |   3.743145   .9979922     3.75   0.000     1.784647    5.701643
                                              |
                             finance_turnover |   .2115961   .0978413     2.16   0.031     .0195886    .4036035
                               short_turnover |   1.132953    1.19845     0.95   0.345    -1.218931    3.484836
                                lag_ln_efsnew |   .1486854    .038023     3.91   0.000     .0740676    .2233032
                                  ln_firmsize |   .0479427   .0561804     0.85   0.394    -.0623079    .1581934
                             ln_volatility_20 |   .0127298   .0169635     0.75   0.453      -.02056    .0460197
                                    ln_volume |   .0055563   .0155432     0.36   0.721    -.0249463     .036059
                                     ln_price |  -.5521989   .0558957    -9.88   0.000    -.6618907    -.442507
                                       return |   .0499057   .1776742     0.28   0.779    -.2987688    .3985803
                          ln_ownerratio_csmar |   .0008377     .01502     0.06   0.956    -.0286381    .0303135
                                      1.HS300 |  -.0527273   .0082782    -6.37   0.000    -.0689729   -.0364818
                                         1.PL |    .173827   .0490673     3.54   0.000     .0775355    .2701185
                                              |
                 IO_3group#c.finance_turnover |
                                           1  |   -.166646    .098774    -1.69   0.092    -.3604839    .0271918
                                           2  |  -.5111654    .101006    -5.06   0.000    -.7093835   -.3129473
                                              |
                   IO_3group#c.short_turnover |
                                           1  |  -.9870743   1.208885    -0.82   0.414    -3.359437    1.385288
                                           2  |  -.7433824   1.213513    -0.61   0.540    -3.124827    1.638062
                                              |
                    IO_3group#c.lag_ln_efsnew |
                                           1  |   .1423081   .0390265     3.65   0.000      .065721    .2188952
                                           2  |   .1851955   .0388168     4.77   0.000       .10902     .261371
                                              |
                      IO_3group#c.ln_firmsize |
                                           1  |   -.135036   .0563768    -2.40   0.017     -.245672   -.0244001
                                           2  |  -.1079761   .0563595    -1.92   0.056    -.2185782    .0026261
                                              |
                 IO_3group#c.ln_volatility_20 |
                                           1  |   .0168273   .0173054     0.97   0.331    -.0171336    .0507881
                                           2  |   .0564783   .0175161     3.22   0.001     .0221041    .0908525
                                              |
                        IO_3group#c.ln_volume |
                                           1  |  -.0157246   .0157157    -1.00   0.317    -.0465656    .0151165
                                           2  |   -.027011   .0158036    -1.71   0.088    -.0580246    .0040026
                                              |
                         IO_3group#c.ln_price |
                                           1  |   .2099068   .0550157     3.82   0.000     .1019418    .3178718
                                           2  |   .3246516   .0552185     5.88   0.000     .2162887    .4330145
                                              |
                           IO_3group#c.return |
                                           1  |   .2230658   .1808798     1.23   0.218    -.1318995    .5780312
                                           2  |  -.1364572   .1809457    -0.75   0.451     -.491552    .2186375
                                              |
              IO_3group#c.ln_ownerratio_csmar |
                                           1  |    .045964   .0148954     3.09   0.002     .0167328    .0751953
                                           2  |   .0365486   .0160123     2.28   0.023     .0051255    .0679718
                                              |
                              IO_3group#HS300 |
                                         0 1  |          0  (empty)
                                         1 1  |   .0295765   .0095904     3.08   0.002     .0107559    .0483972
                                         2 1  |          0   1.54e-13     0.00   1.000    -3.02e-13    3.02e-13
                                              |
                                 IO_3group#PL |
                                         1 1  |  -.0485219   .0498779    -0.97   0.331    -.1464041    .0493604
                                         2 1  |  -.0535655   .0505084    -1.06   0.289    -.1526852    .0455541
                                              |
                                        _cons |  -5.377585   .9948678    -5.41   0.000    -7.329952   -3.425218
              -------------------------------------------------------------------------------------------------
              
              Absorbed degrees of freedom:
              -----------------------------------------------------+
               Absorbed FE | Categories  - Redundant  = Num. Coefs |
              -------------+---------------------------------------|
                     stkcd |       961         961           0    *|
                    date_n |      1634        1634           0    *|
              -----------------------------------------------------+
              * = FE nested within cluster; treated as redundant for DoF computation
              
              . margins IO_3group , dydx(short_turnover) noestimcheck
              
              Average marginal effects                        Number of obs     =    734,090
              Model VCE    : Robust
              
              Expression   : Linear prediction, predict()
              dy/dx w.r.t. : short_turnover
              
              --------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
              ---------------+----------------------------------------------------------------
              short_turnover |
                   IO_3group |
                          0  |   1.132953    1.19845     0.95   0.344    -1.215966    3.481871
                          1  |   .1458785   .1863425     0.78   0.434    -.2193461    .5111031
                          2  |   .3895703   .2076505     1.88   0.061    -.0174171    .7965578
              --------------------------------------------------------------------------------
              And these are my results for the subsamples:
              Code:
               reghdfe ln_efsnew finance_turnover short_turnover lag_ln_efsnew ln_firmsize  ln_volatility_20 ln_volume   ln_price return   ln_ownerratio_csma
              > r  HS300  PL if  IO_3group==0 , absorb(stkcd date_n) cluster(stkcd date_n)
              (dropped 26 singleton observations)
              (MWFE estimator converged in 10 iterations)
              Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
              note: HS300 omitted because of collinearity
              
              HDFE Linear regression                            Number of obs   =      3,260
              Absorbing 2 HDFE groups                           F(  10,     21) =      25.70
              Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                R-squared       =     0.6965
                                                                Adj R-squared   =     0.6424
              Number of clusters (stkcd)   =         22         Within R-sq.    =     0.1148
              Number of clusters (date_n)  =        462         Root MSE        =     0.2125
              
                                               (Std. Err. adjusted for 22 clusters in stkcd date_n)
              -------------------------------------------------------------------------------------
                                  |               Robust
                        ln_efsnew |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------------+----------------------------------------------------------------
                 finance_turnover |   .2395584   .1211596     1.98   0.061    -.0124067    .4915236
                   short_turnover |   2.010175   1.259615     1.60   0.125    -.6093366    4.629687
                    lag_ln_efsnew |   .1043233   .0293104     3.56   0.002     .0433691    .1652775
                      ln_firmsize |  -.5677539   .2445591    -2.32   0.030    -1.076342   -.0591655
                 ln_volatility_20 |  -.0249109   .0208915    -1.19   0.246    -.0683572    .0185354
                        ln_volume |   .0326493   .0176223     1.85   0.078    -.0039982    .0692968
                         ln_price |    .118167   .3625202     0.33   0.748     -.635735    .8720691
                           return |  -.2815352   .2154586    -1.31   0.205    -.7296059    .1665356
              ln_ownerratio_csmar |   .3814521    .142721     2.67   0.014     .0846476    .6782566
                            HS300 |          0   1.91e-18     0.00   1.000    -3.97e-18    3.97e-18
                               PL |   .1828604   .0504898     3.62   0.002     .0778612    .2878597
                            _cons |   6.364814   4.886876     1.30   0.207       -3.798    16.52763
              -------------------------------------------------------------------------------------
              
              Absorbed degrees of freedom:
              -----------------------------------------------------+
               Absorbed FE | Categories  - Redundant  = Num. Coefs |
              -------------+---------------------------------------|
                     stkcd |        22          22           0    *|
                    date_n |       462         462           0    *|
              -----------------------------------------------------+
              * = FE nested within cluster; treated as redundant for DoF computation
              
              . reghdfe ln_efsnew finance_turnover short_turnover lag_ln_efsnew ln_firmsize  ln_volatility_20 ln_volume   ln_price return   ln_ownerratio_csma
              > r  HS300  PL if  IO_3group==1 , absorb(stkcd date_n) cluster(stkcd date_n)
              (dropped 1 singleton observations)
              (MWFE estimator converged in 9 iterations)
              Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
              
              HDFE Linear regression                            Number of obs   =    366,006
              Absorbing 2 HDFE groups                           F(  11,    704) =     475.43
              Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                R-squared       =     0.7655
                                                                Adj R-squared   =     0.7639
              Number of clusters (stkcd)   =        705         Within R-sq.    =     0.3350
              Number of clusters (date_n)  =      1,634         Root MSE        =     0.2230
              
                                              (Std. Err. adjusted for 705 clusters in stkcd date_n)
              -------------------------------------------------------------------------------------
                                  |               Robust
                        ln_efsnew |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------------+----------------------------------------------------------------
                 finance_turnover |   .0273482   .0213712     1.28   0.201    -.0146108    .0693071
                   short_turnover |   .4839065   .1740851     2.78   0.006     .1421183    .8256947
                    lag_ln_efsnew |   .2818307   .0091497    30.80   0.000     .2638667    .2997947
                      ln_firmsize |  -.0780929   .0144979    -5.39   0.000    -.1065572   -.0496286
                 ln_volatility_20 |    .019398   .0045142     4.30   0.000     .0105351    .0282609
                        ln_volume |   .0020262   .0033809     0.60   0.549    -.0046116     .008664
                         ln_price |  -.3919089   .0161147   -24.32   0.000    -.4235475   -.3602703
                           return |   .2091176   .0523144     4.00   0.000     .1064066    .3118286
              ln_ownerratio_csmar |   .0372926   .0060865     6.13   0.000     .0253426    .0492425
                            HS300 |  -.0148609   .0081905    -1.81   0.070    -.0309417    .0012198
                               PL |   .1278662   .0109048    11.73   0.000     .1064565     .149276
                            _cons |   -1.90087   .2927042    -6.49   0.000    -2.475548   -1.326192
              -------------------------------------------------------------------------------------
              
              Absorbed degrees of freedom:
              -----------------------------------------------------+
               Absorbed FE | Categories  - Redundant  = Num. Coefs |
              -------------+---------------------------------------|
                     stkcd |       705         705           0    *|
                    date_n |      1634        1634           0    *|
              -----------------------------------------------------+
              * = FE nested within cluster; treated as redundant for DoF computation
              
              . reghdfe ln_efsnew finance_turnover short_turnover lag_ln_efsnew ln_firmsize  ln_volatility_20 ln_volume   ln_price return   ln_ownerratio_csma
              > r  HS300  PL if  IO_3group==2 , absorb(stkcd date_n) cluster(stkcd date_n)
              (MWFE estimator converged in 9 iterations)
              Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
              
              HDFE Linear regression                            Number of obs   =    364,797
              Absorbing 2 HDFE groups                           F(  11,    738) =     277.00
              Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                R-squared       =     0.6281
                                                                Adj R-squared   =     0.6257
              Number of clusters (stkcd)   =        739         Within R-sq.    =     0.1863
              Number of clusters (date_n)  =      1,634         Root MSE        =     0.2789
              
                                              (Std. Err. adjusted for 739 clusters in stkcd date_n)
              -------------------------------------------------------------------------------------
                                  |               Robust
                        ln_efsnew |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------------+----------------------------------------------------------------
                 finance_turnover |  -.2956001   .0262775   -11.25   0.000    -.3471876   -.2440126
                   short_turnover |    .142516   .2380393     0.60   0.550    -.3247988    .6098308
                    lag_ln_efsnew |   .3078628   .0084778    36.31   0.000     .2912193    .3245063
                      ln_firmsize |  -.0813076   .0141996    -5.73   0.000    -.1091841   -.0534311
                 ln_volatility_20 |   .0705702   .0056198    12.56   0.000     .0595375    .0816028
                        ln_volume |  -.0239334   .0033377    -7.17   0.000    -.0304859    -.017381
                         ln_price |  -.1848707   .0187478    -9.86   0.000    -.2216761   -.1480652
                           return |  -.0717832   .0565686    -1.27   0.205    -.1828378    .0392714
              ln_ownerratio_csmar |   .0474514   .0086297     5.50   0.000     .0305097    .0643931
                            HS300 |  -.0457838   .0098787    -4.63   0.000    -.0651775   -.0263901
                               PL |   .1160117   .0117391     9.88   0.000     .0929657    .1390576
                            _cons |  -1.373251    .302937    -4.53   0.000    -1.967972   -.7785304
              -------------------------------------------------------------------------------------
              
              Absorbed degrees of freedom:
              -----------------------------------------------------+
               Absorbed FE | Categories  - Redundant  = Num. Coefs |
              -------------+---------------------------------------|
                     stkcd |       739         739           0    *|
                    date_n |      1634        1634           0    *|
              -----------------------------------------------------+
              * = FE nested within cluster; treated as redundant for DoF computation
              You can see that, for example, the impact of short_turnover is not significant in the subsample IO_3group==2, but it is significantly positive in the interaction model. Conversely, the impact of short_turnover is significantly positive in the subsample IO_3group==1, but not in the interaction model.

              I suspected the results might differ if I added the vce(unconditional) option to margins, since this allows for the clustered standard errors.

              However, Stata reported that vce(unconditional) cannot be computed.
              Code:
              . margins IO_3group , dydx(short_turnover) noestimcheck vce(unconditional)
              cannot compute vce(unconditional);
              predict after could not compute scores
              So I am stuck again. I am not sure why, this time, allowing the coefficients of the control variables to vary across groups does not work. In theory, it should produce results very similar to those from the subsamples. Does anyone have any ideas? Thanks.


              • #8
                Dear Qing
                I think the problem here is the assumption that estimating your fixed-effects models on subsamples should give the same results as estimating the same model with interactions.
                The premise is correct in principle: if you interact every single explanatory variable, the results should be the same as those from separate models. However, based on the results you provide, you are not interacting the fixed effects. This means you are comparing two different sets of models.
                The changes in the magnitudes or statistical significance of the marginal effects across the two strategies could be explained by exactly this difference in approach.
                HTH
                Fernando
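
To make the point about the fixed effects concrete, the two specifications can be written side by side. This is only a sketch in notation that is not from the thread: $g$ indexes the groups (for example the levels of IO_3group), $G_{it}$ is the group of observation $(i,t)$, $\alpha$ and $\lambda$ are the stock and date fixed effects, and $x_{it}$ collects the regressors.

```latex
% Separate regression on the subsample of group g:
% slopes AND both sets of fixed effects are specific to g
y_{it} = \alpha_i^{(g)} + \lambda_t^{(g)} + x_{it}'\beta^{(g)} + \varepsilon_{it},
\qquad G_{it} = g

% Pooled interaction model with absorb(stkcd date_n):
% slopes vary by group, but the fixed effects are common to all groups
y_{it} = \alpha_i + \lambda_t
       + \sum_{g} \mathbf{1}[G_{it} = g]\, x_{it}'\beta^{(g)} + \varepsilon_{it}
```

The point estimates of $\beta^{(g)}$ from the two approaches should agree only when the pooled model also lets $\alpha_i$ and $\lambda_t$ differ by group; otherwise the pooled model imposes $\alpha_i^{(g)} = \alpha_i$ and $\lambda_t^{(g)} = \lambda_t$ for every $g$.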


                • #9
                  Thanks, Fernando, for your comments; they make a lot of sense. Do you have any idea why vce(unconditional) cannot be calculated? I was still hoping it might resolve the inconsistency, or at least reduce the difference. The interaction approach is my preferred one because it allows me to compare the coefficients of short_turnover across groups. I don't know whether there is any other way to achieve this after the reghdfe estimation.


                  • #10
                    Actually, is there a way to interact all the explanatory variables, including even the fixed effects? And does it make sense to do so?


                    • #11
                      My best guess as to why that isn't working is that reghdfe has not been programmed to do so yet. Perhaps the author has more details on his GitHub page, where he keeps the most up-to-date version of reghdfe.
                      In any case, the difference in the point estimates will remain the same; only the standard errors will change.
                      Now, I'm not sure about the validity of the following exercise, in particular because of the clusters, but it may show you that the results are comparable. Create additional variables, say stkcd2 and date_n2, that combine the original stkcd and date_n with your group variable (perhaps something like egen stkcd2=group(stkcd IO_3group)), and use these new variables as the fixed effects when you estimate the models.
                      One more point. It may not be too important, but I think your baseline group is too small. You have 3,260 obs for about 500 explanatory variables. I would be careful making inferences from that first subsample model.
                      Fernando
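
The exercise described above might look roughly like the following. It is a sketch, not a tested solution: it assumes the thread's variables are in memory, that IO_3group takes the values 1, 2 and 3 with 1 as the base level, and that every slope is interacted with the group. The `controls' macro is only a convenience introduced here.

```stata
* Sketch only: assumes the thread's data set is loaded and IO_3group is coded 1, 2, 3.
local controls lag_ln_efsnew ln_firmsize ln_volatility_20 ln_volume ///
    ln_price return ln_ownerratio_csmar HS300 PL

* Group-specific fixed-effect identifiers (names as suggested in the post above)
egen long stkcd2  = group(stkcd IO_3group)
egen long date_n2 = group(date_n IO_3group)

* All slopes interacted with the group, plus group-specific fixed effects
reghdfe ln_efsnew i.IO_3group##c.(finance_turnover short_turnover `controls'), ///
    absorb(stkcd2 date_n2) cluster(stkcd date_n)

* Group-specific effects of short_turnover and a joint test of equality across groups
margins IO_3group, dydx(short_turnover) noestimcheck
test 2.IO_3group#c.short_turnover 3.IO_3group#c.short_turnover
```

The main effect of IO_3group should be reported as omitted, since it is collinear with the new fixed effects. Clustering is left on the original stkcd and date_n here; whether that is the right choice is the open question raised above.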


                      • #12
                        The discrepancy between the interaction results and subset-specific results you are seeing in #7 is because the estimation samples are not the same. If you look at your outputs carefully, you will see that in some of the sample-specific outputs there is a note that certain observations are being dropped because they are singletons. By contrast, no observations are dropped in the interaction model. (I guess those observations become singletons when they are restricted to the subsamples.)

                        I do not know the inner workings of -reghdfe- and only use it occasionally myself. But I imagine that the dropping of singleton observations is occasioned by your use of clustered VCE in these models. I suspect if you use the ordinary VCE, then -reghdfe- will not omit any observations, and the interaction and sample-specific results will be consistent with each other.
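
One way to check whether the estimation samples really differ is to flag e(sample) after each fit and cross-tabulate the flags. The sketch below assumes the thread's data are in memory; in_sub and in_int are hypothetical variable names, and the `controls' macro is a convenience introduced here. The last command uses reghdfe's keepsingletons option, which asks it not to drop singleton groups; whether keeping them is advisable is a separate matter.

```stata
* Sketch only: in_sub and in_int are hypothetical flag variables.
local controls lag_ln_efsnew ln_firmsize ln_volatility_20 ln_volume ///
    ln_price return ln_ownerratio_csmar HS300 PL

* Subsample fit for group 2
reghdfe ln_efsnew finance_turnover short_turnover `controls' if IO_3group == 2, ///
    absorb(stkcd date_n) cluster(stkcd date_n)
generate byte in_sub = e(sample)

* Pooled interaction fit
reghdfe ln_efsnew i.IO_3group##c.(finance_turnover short_turnover `controls'), ///
    absorb(stkcd date_n) cluster(stkcd date_n)
generate byte in_int = e(sample)

* Group-2 observations used by only one of the two models appear off the diagonal
tabulate in_sub in_int if IO_3group == 2

* Optional comparison: refit the subsample without dropping singleton groups
reghdfe ln_efsnew finance_turnover short_turnover `controls' if IO_3group == 2, ///
    absorb(stkcd date_n) cluster(stkcd date_n) keepsingletons
```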


                        • #13
                          Hi Fernando,
                          I tried your suggestion: I created stkcd2 and date_n2 and used them as the fixed effects in the interaction model. The results are now closer to those from the subsamples. Thanks.


                          • #14
                            Thanks, Clyde, for the comments. Using the ordinary VCE does indeed reduce the number of singletons that are dropped. However, it does not change the conclusions; the results are very similar to those obtained with the cluster option.


                            • #15
                              Originally posted by Clyde Schechter
                              The best way to show Stata output is to copy it from your Results window or log file and paste it directly into the Forum editor, surrounded by code delimiters. If you are not familiar with code delimiters, read Forum FAQ #12. If you do that, there will be no readability issues. Yes, .png files are better than other images, but are still not as good--your results are, again, barely readable, though the margins output is easily read. But with copy/paste and code delimiters there will never be a problem.

                              In this particular model, there is no problem from using separate bull and bear indicators instead of a three-level one. But in more complicated models, it could make a difference. My focus on this is to have you code the model in a way that supports the use of -margins-. With only the complexity inherent in your model, -margins- will get it right with bull and bear separate. But in more complicated models, -margins- will get it wrong. Rather than having to think about every model you run to figure out whether -margins- will work properly with it, it is better to just get into the habit of doing it the right way every time, even if that isn't necessary in some particular case.

                              Your original model, however, would still be incorrect as originally coded because you used c.bull and c.bear, whereas you must use i.bull and i.bear. -margins- uses a different way of calculating the marginal effects for continuous and discrete variables, so it is important to get this right.

                              I'm sorry about the non-estimability. I forgot to add that you need the -noestimcheck- option in this command. In fixed effects models, many of the parameters one might try to estimate with -margins- are in fact not identifiable, and -margins- tells you so with the (not estimable) result. But -margins- overdoes this. The marginal effects are, in fact, identifiable. The -noestimcheck- option allows you to override -margins-' difficulties here. (But do not do this with abandon: here you really need to know which parameters are identifiable and which are not. Anything that is a function of the fixed effects themselves is non-identifiable.)

                              The omission of 1.conditions and 2.conditions is due to their collinearity with the stkcd fixed effects and is expected; it is not a problem. In fact, if this did not happen, it would be an indication that there is something wrong in the data!

                              As for the difference between what you are getting from the interaction approach and the separate samples, this arises from the other variables in the model. When you do the interaction model, you are constraining the coefficients of all the other variables to be independent of bull/bear/normal. When you do the separate samples, you get separate estimates of those other variables' coefficients in each sample. Since those other variables are, themselves, correlated with the outcome or with bull/bear/normal, this results in changes in the bull/bear/normal effects. I haven't carefully reviewed all of the coefficients in the three separate-samples outputs you show (and, in fact, I can't read some of them at all), but even a casual review shows that there are some very substantial differences in these other coefficients across those three models.

                              This suggests that the implicit constraint of equality of coefficients imposed by the interaction model is not suitable for this data. You can relax this constraint by adding to your interaction model more terms that provide for interaction between bull/bear/normal and the other variables:

                              Code:
                              reghdfe ln_efsnew i.conditions##(c.mt_csmar control-variables) , absorb(stkcd date_n) cluster(stkcd date_n)
                              will do that and will give you results that are the same as you got from your separate samples (with perhaps some small differences due to numerical issues). And then you can do
                              Code:
                              margins conditions, dydx(mt_csmar) noestimcheck // MARGINAL EFFECT OF mt_csmar
                              test 1.conditions#c.mt_csmar 2.conditions#c.mt_csmar // OMNIBUS TEST OF INTERACTION
                              test 1.conditions#c.mt_csmar = 2.conditions#c.mt_csmar // TEST OF bull = bear
                              Hi Clyde, do you know how to deal with the fixed-effect variables inside absorb() in this case? In my case, because the fixed-effect variables are not interacted with the conditions variable, this stacked regression still gives estimates that differ from those I got from my separate samples.

                              Thanks,
                              Jacob
