
  • Asymmetric effects in FE model (Interaction of Dummies)

    Hi Stata-Forum Members,

    I have created a dataset covering 112 companies across 84 quarters. It contains
    Customer Satisfaction (CS), Cashflows (CF), Earnings (E), R&D (RD), the Concentration (Conc) of the industry, and Crisis, a dummy variable that is 1 when there was a crisis and 0 otherwise.
    The dataset is unbalanced, since some companies only have data for some of the years. (The average number of quarters observed per company is around 60.)

    Here are the two things that I wanted to test:

    1. Does Customer Satisfaction have a positive impact on Cashflows?
    1b) Is this relationship non-linear?

    2. In times of a crisis, high CS mitigates the negative effects of the crisis.
    2b) In times of a crisis, low CS does not exacerbate the negative effects of the crisis.

    To test these hypotheses I used the following econometric model:

    (1) CF_{i,t} = a_{i,t} + b1*CF_{i,t-1} + b2*E_{i,t-1} + b3*CS_{i,t} + b4*CS_{i,t}^2 + v_i + e_{i,t}

    where a is the intercept, b1 and b2 are the coefficients on lagged cashflows and lagged earnings, and
    b3 and b4 are the coefficients of interest.

    I estimated this model with xtregar and, since it is a dynamic model, also with xtabond. Both sets of results indicate that b3 and b4 are positive and significant.

    (Q1) I hope these findings support my Hypothesis 1 and 1b).

    Now comes the problem: I don't want to test whether CS has a general effect during a crisis; I want to check whether high CS reduces the effects of a crisis and whether low CS doesn't exacerbate them. Therefore I tested the following model:

    (2) CF_{i,t} = a_{i,t} + b1*CF_{i,t-1} + b2*E_{i,t-1} + b3*CS_{i,t} + b4*Crisis + b5*HighCS + b6*LowCS + b7*Crisis*HighCS + b8*LowCS*Crisis + v_i + e_{i,t}

    The outcomes indicate that:
    b4 is significant and negative
    b5 is significant and positive
    b6 is insignificant
    b7 is significant and positive
    b8 is significant and positive

    HighCS and LowCS are both dummies. Since I have 3 groups (HighCS, NormalCS, LowCS), I included 2 dummies to compare these groups with the reference group of firms whose customers are averagely satisfied.

    (Q2) Can I draw the conclusion from my results that when there is a Crisis(Dummy=1) companies that have a high level of customer satisfaction (Dummy=1) suffer less from the negative effects of a crisis?

    I am just unsure because the Crisis dummy marks certain points in time (6 quarters), while the high customer satisfaction dummy is coded across the whole dataset with regard to time t.

    I already looked at:
    https://www.statalist.org/forums/for...dummy-variable
    but it didn't clarify for me whether my model is set up correctly.

    If you need any stata-outputs I can provide them as well.

    Kind regards,
    Damien




  • #2
    1. Does Customer Satisfaction have a positive impact on Cashflows?
    1b) Is this relationship non-linear?
    ...
    To test these hypotheses I used the following econometric model:

    (1) CF_{i,t} = a_{i,t} + b1*CF_{i,t-1} + b2*E_{i,t-1} + b3*CS_{i,t} + b4*CS_{i,t}^2 + v_i + e_{i,t}
    ...
    I estimated this model with xtregar and, since it is a dynamic model, also with xtabond. Both sets of results indicate that b3 and b4 are positive and significant.

    (Q1) I hope these findings support my Hypothesis 1 and 1b).
    It does support 1b. As for 1a, maybe and maybe not. Since b4 is positive, the relationship between CF and CS, if graphed, is a parabola, so as CS increases CF actually decreases at first, reaches a minimum when CS = -b3/(2*b4), and then increases after that. So, if -b3/(2*b4) is less than any observed value of CS, you have an increasing curvilinear relationship between CF and CS. But if -b3/(2*b4) is greater than all observed values of CS, then you have a decreasing curvilinear relationship between CF and CS. Finally, if -b3/(2*b4) falls within the range of observed CS values, you have a U-shaped relationship.
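
    To spell out where the turning point in the reply above comes from, here is the calculus behind it, written with the coefficient names of equation (1) and treating everything other than CS as held fixed. It is a sketch of the standard vertex argument, not something taken from the original post.

    ```latex
    \begin{align*}
    \frac{\partial CF_{i,t}}{\partial CS_{i,t}} &= b_3 + 2\,b_4\,CS_{i,t} \\
    b_3 + 2\,b_4\,CS^{*} = 0 \;&\Longrightarrow\; CS^{*} = -\frac{b_3}{2\,b_4} \\
    \frac{\partial^2 CF_{i,t}}{\partial CS_{i,t}^2} &= 2\,b_4 > 0
      \quad\text{(so } CS^{*} \text{ is a minimum)}
    \end{align*}
    ```

    The slope is negative for CS below CS* and positive above it, which is why the location of CS* relative to the observed range of CS decides between the three cases listed above.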

    (Q2) Can I draw the conclusion from my results that when there is a Crisis(Dummy=1) companies that have a high level of customer satisfaction (Dummy=1) suffer less from the negative effects of a crisis?
    That looks correct.

    Speaking only for myself, and others may disagree, I find equations with subscripts disagreeable to read. I would have found your post easier to follow had you posted the actual Stata commands you ran and the actual Stata output they led to. A side benefit of doing it that way is that, for now, I can only answer your questions conditional on the assumption that you have actually implemented them correctly in unseen code. Had you shown the code and output, I could answer your questions conditional on what you actually did! (And perhaps I could have offered some tips on improving the code.)



    • #3
      Hi Mr. Schechter,

      First of all, thank you for your consideration and for taking time on a Sunday to help other people with their statistical problems.

      Speaking only for myself, and others may disagree, I find equations with subscripts disagreeable to read. I would have found your post easier to follow had you posted the actual Stata commands you ran and the actual Stata output they led to
      Thank you for your feedback, below I tried to post the code that I used and the results that it produced:

      This was the code that I ran to estimate my model (1) and to check for Hypothesis 1:
      Code:
      xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat csatq, fe rhotype(dw)
      where norm_income2 is CF and norm_income1 is earnings. csat is CS and csatq is CS^2.
      Code:
      FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
      Group variable: newid                           Number of groups  =        121
      
      R-sq:                                           Obs per group:
           within  = 0.3243                                         min =         11
           between = 0.9824                                         avg =       57.6
           overall = 0.6996                                         max =         83
      
                                                      F(4,6846)         =     821.46
      corr(u_i, Xb)  = 0.7906                         Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------
      norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      norm_income2 |
               L1. |   .5241718   .0106318    49.30   0.000     .5033302    .5450134
                   |
      norm_income1 |
               L1. |   .0205546   .0064437     3.19   0.001      .007923    .0331862
                   |
              csat |   .0003608   .0000618     5.84   0.000     .0002397    .0004818
             csatq |   .0000156   4.86e-06     3.20   0.001     6.04e-06    .0000251
             _cons |   .0166367    .000452    36.81   0.000     .0157507    .0175227
      -------------+----------------------------------------------------------------
            rho_ar | -.09329214
           sigma_u |  .01084567
           sigma_e |  .01272081
           rho_fov |  .42093235   (fraction of variance because of u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(120,6846) = 12.81                   Prob > F = 0.0000
      Including the control variables leads to similar results but with fewer observations, since R&D wasn't available for every quarter for every firm. The number of observations with R&D dropped to 800, which is why I use the model with R&D only for robustness checks.

      My professor also told me to look briefly into
      Code:
      -xtabond
      which I did.
      The results do hold under different maxlags and maxdepths.

      Furthermore I also checked whether the results are robust w.r.t. heteroskedasticity and with controls by running the same regression with the code:
      Code:
      newey2 norm_income2 L1_norm_income2 L1_norm_income1 csat csatq L4_rdint conc, lag(1) i(newid) t(date) force noconstant level (95)
      with the following results:
      Code:
      Regression with Newey-West standard errors          Number of obs  =       846
      maximum lag : 1                                     F(  6,   840)  =   1910.89
                                                          Prob > F       =    0.0000
      
      ---------------------------------------------------------------------------------
                      |             Newey-West
         norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      ----------------+----------------------------------------------------------------
      L1_norm_income2 |   .9044428    .030303    29.85   0.000     .8449644    .9639213
      L1_norm_income1 |  -.0794207   .0338772    -2.34   0.019    -.1459146   -.0129268
                 csat |   .0001698   .0001102     1.54   0.124    -.0000465    .0003861
                csatq |   .0000424   .0000149     2.85   0.005     .0000132    .0000717
             L4_rdint |   .0708861   .0326583     2.17   0.030     .0067846    .1349876
                 conc |    .019009   .0064575     2.94   0.003     .0063343    .0316838
      ---------------------------------------------------------------------------------
      This is for model (1).
      Do the interpretations that I presented in my main post hold? (How do I explicitly check in Stata whether CS increases might decrease CF at a certain level?)

      For model (2) I ran the following baseline regression:

      Code:
      xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat crisis satL satB satLXcrisis satBXcrisis, fe rhotype(dw)
      with the following output:
      Code:
      FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
      Group variable: newid                           Number of groups  =        121
      
      R-sq:                                           Obs per group:
           within  = 0.3255                                         min =         11
           between = 0.9850                                         avg =       57.6
           overall = 0.7011                                         max =         83
      
                                                      F(8,6842)         =     412.64
      corr(u_i, Xb)  = 0.7946                         Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------
      norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      norm_income2 |
               L1. |   .5232726   .0106194    49.28   0.000     .5024552    .5440899
                   |
      norm_income1 |
               L1. |   .0179756   .0064605     2.78   0.005     .0053111    .0306402
                   |
              csat |   .0002703   .0000669     4.04   0.000     .0001392    .0004015
            crisis |  -.0024342   .0005556    -4.38   0.000    -.0035233   -.0013451
              satL |   .0021554   .0007923     2.72   0.007     .0006023    .0037085
              satB |   .0012704   .0008202     1.55   0.121    -.0003375    .0028782
       satLXcrisis |    .003489   .0015405     2.26   0.024     .0004691    .0065088
       satBXcrisis |   .0026281   .0016063     1.64   0.102    -.0005209     .005777
             _cons |   .0170454   .0004394    38.79   0.000      .016184    .0179068
      -------------+----------------------------------------------------------------
            rho_ar | -.09167241
           sigma_u |  .01088445
           sigma_e |  .01270257
           rho_fov |   .4233738   (fraction of variance because of u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(120,6842) = 12.85                   Prob > F = 0.0000
      I constructed the interaction terms by hand, but it should be the same as letting Stata do this, right? satL marks firms with high CS and satB marks firms with low CS. I hope this is not too confusing.
      Is it correct to interpret these results as I did above under Q2 (that being a satL firm decreases the impact of a crisis, while being a satB firm does not exacerbate the effects of a crisis, relative to the companies that are neither satB nor satL)?

      This would only be the baseline model, and the models with the controls will be done on basis of these regressions.

      I hope this helps and the code and output clarified my initial questions.
      (And perhaps I could have offered some tips on improving the code.)
      I would be very happy if you have any improvements.
      Thank you again for your consideration,

      Damien

      Last edited by Damien Schmidt; 20 Aug 2017, 11:35.



      • #4
        So in the -newey2- model the minimum of the parabola is when csat = -2.00, and in the -xtregar- model it's when csat = -11.56. I don't know what the range of values of your csat variable is, but on the guess that 0 is the minimum possible, this puts the minimum safely to the left of the entire range. That in turn implies that the outcome variable increases curvilinearly as a function of csat throughout the observed range.
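
        The two turning points quoted above follow directly from the posted coefficients: -.0001698/(2*.0000424) is about -2.00 and -.0003608/(2*.0000156) is about -11.56. As a sketch of how Stata could report the same quantity together with a delta-method standard error, assuming the -xtregar- model from #3 is refit first, -nlcom- can be applied to the stored coefficients. The label turning_point is just an illustrative name.

        ```stata
        * Sketch only: refit the quadratic model from #3, then ask Stata for the turning point
        xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat csatq, fe rhotype(dw)

        * -b3/(2*b4) with a delta-method standard error and confidence interval
        nlcom (turning_point: -_b[csat] / (2 * _b[csatq]))

        * Compare the turning point with the observed range of csat in the estimation sample
        summarize csat if e(sample), detail
        ```

        If the whole confidence interval from -nlcom- lies below the smallest observed csat, that would be consistent with the "increasing throughout the observed range" reading.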

        Your interpretation of the last model also seems correct.

        I constructed the interaction terms by hand but it should be the same as letting stata do this right?
        As far as you have gone, there is no difference. But typically in models with interaction terms there is interest in graphing the predicted values and calculating the marginal effects of the variables that are involved in interaction. You can do that with what you've got, but it's a lot of work. If you were to redo the model with factor variable notation, it becomes simplicity itself:

        Code:
        xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat i.crisis##(i.satL i.satB), fe rhotype(dw)
        margins crisis#satL crisis#satB
        marginsplot
        margins satL satB, dydx(crisis)



        • #5
          Thank you for your response, Mr Schechter,

          So in the -newey2- model the minimum of the parabola is when csat = -2.00, and in the -xtregar- model it's when csat = -11.56. I don't know what the range of values of your csat variable is, but on the guess that 0 is the minimum possible, this puts the minimum safely to the left of the entire range. That in turn implies that the outcome variable increases curvilinearly as a function of csat throughout the observed range.

          Since I wanted to know the impact of a change in CS when CS^2 is 0, I subtracted the mean from my CS values, so that 0 lies within the observed values. (Does this make sense?)
          Therefore, do I need to rerun the regression with the actual values to know whether CF are strictly increasing in CS?

          2.

          For the code you suggested:
          Code:
          xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat i.crisis##(i.satL i.satB), fe rhotype(dw)
          margins crisis#satL crisis#satB
          marginsplot
          margins satL satB, dydx(crisis)
          my stata returns:

          Code:
          . FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
          Group variable: newid                           Number of groups  =        121
          
          R-sq:                                           Obs per group:
               within  = 0.3255                                         min =         11
               between = 0.9850                                         avg =       57.6
               overall = 0.7011                                         max =         83
          
                                                          F(8,6842)         =     412.64
          corr(u_i, Xb)  = 0.7946                         Prob > F          =     0.0000
          
          ------------------------------------------------------------------------------
          norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          norm_income2 |
                   L1. |   .5232726   .0106194    49.28   0.000     .5024552    .5440899
                       |
          norm_income1 |
                   L1. |   .0179756   .0064605     2.78   0.005     .0053111    .0306402
                       |
                  csat |   .0002703   .0000669     4.04   0.000     .0001392    .0004015
              1.crisis |  -.0024342   .0005556    -4.38   0.000    -.0035233   -.0013451
                1.satL |   .0021554   .0007923     2.72   0.007     .0006023    .0037085
                1.satB |   .0012704   .0008202     1.55   0.121    -.0003375    .0028782
                       |
           crisis#satL |
                  1 1  |    .003489   .0015405     2.26   0.024     .0004691    .0065088
                       |
           crisis#satB |
                  1 1  |   .0026281   .0016063     1.64   0.102    -.0005209     .005777
                       |
                 _cons |   .0170454   .0004394    38.79   0.000      .016184    .0179068
          -------------+----------------------------------------------------------------
                rho_ar | -.09167241
               sigma_u |  .01088445
               sigma_e |  .01270257
               rho_fov |   .4233738   (fraction of variance because of u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(120,6842) = 12.85                   Prob > F = 0.0000
          
          . 
          . margins crisis#satL crisis#satB
          default prediction is a function of possibly stochastic quantities other than e(b)
          r(498);
          
          . marginsplot
          previous command was not margins
          r(301);
          
          . 
          . margins satL satB, dydx(crisis)
          default prediction is a function of possibly stochastic quantities other than e(b)
          r(498);
          
          . margins crisis#satL crisis#satB
          default prediction is a function of possibly stochastic quantities other than e(b)
          r(498);
          For the first part: thank you for giving me better code. This will surely help in future Stata analyses.

          For the latter part with the error:
          I checked the help file but couldn't find a way to circumvent this error. Can you explain what might have gone wrong?
          I searched the internet, and the only explanation I found is that the model also contains random effects. This is surprising, since I explicitly used a fixed-effects model. Could you explain what I might be missing here?

          Thank you a lot
          Damien



          • #6
            Since I wanted to know the impact of a change in CS when CS^2 is 0, I subtracted the mean from my CS values, so that 0 lies within the observed values. (Does this make sense?)
            Therefore, do I need to rerun the regression with the actual values to know whether CF are strictly increasing in CS?
            Well, that would be one way to do it. But you can just use a little algebra here. If the turning point is when csat = -2, and csat is the actual cs minus the mean, then that's actual cs - mean = -2, actual cs = mean - 2. So just add the value of the mean you originally subtracted to the -2 and -11.56 respectively to get the corresponding turning points in terms of real cs.

            As for the latter part, I'm not really sure what is going wrong here. This message sometimes comes when -margins- is applied to models with random or fixed effects that are not actually estimable. But -xtregar- does support estimation of the intercepts and the errors, so that doesn't seem to be the issue. It may have to do with the lag terms. I'm not really sure, and I don't know what to suggest here. I'm sorry I led you down this path but don't know how to lead you to its end.



            • #7
              Hi Mr. Schechter,

              thank you for leading me down the path!

              I actually found a way to circumvent the problem. It's the same way I dealt with the problems of newey2 and lagged variables:
              I lagged the variables by hand and then included those lagged variables in the model. Then the code worked fine.
              I hope this helps others who come across the same problem with this code and lagged variables.
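
              For readers hitting the same r(498) error, the workaround described above might look roughly like the following. This is a sketch, not the exact code that was run: it assumes the panel is declared with newid and date (the identifiers passed to newey2 in #3) and reuses the hand-lagged names from that command. If those variables already exist, the two -generate- lines should be skipped.

              ```stata
              * Sketch of the workaround; assumes newid is the panel id and date the time variable
              xtset newid date

              * Create the lags as ordinary variables instead of using the L1. operator in the model
              generate L1_norm_income2 = L1.norm_income2
              generate L1_norm_income1 = L1.norm_income1

              * Same model as in #4, but with the hand-made lags
              xtregar norm_income2 L1_norm_income2 L1_norm_income1 csat i.crisis##(i.satL i.satB), fe rhotype(dw)
              margins satL satB, dydx(crisis)
              ```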

              However I have a question w.r.t. the results:
              If the average marginal effect is not significant, what does it actually measure? I only found this discussion to shed some light on my question:
              HTML Code:
              https://www.statalist.org/forums/forum/general-stata-discussion/general/1329201-marginal-effects-significance-vs-original-model-effects-significance
              However, I'm not sure how to transfer the points of that discussion to my original problem.

              Since the 95% confidence intervals for satL and satB at crisis=1 both contain 0, the null hypothesis that satL/satB have no effect in times of a crisis cannot be rejected?
              This would be a hit to the validity of my model, right? I would highly appreciate some clarification.

              Here is the output:

              Code:
              . margins satL satB, dydx(crisis)
              
              Average marginal effects                        Number of obs     =      7,092
              
              Expression   : Linear prediction, predict()
              dy/dx w.r.t. : 1.crisis
              
              ------------------------------------------------------------------------------
                           |            Delta-method
                           |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
              0.crisis     |  (base outcome)
              -------------+----------------------------------------------------------------
              1.crisis     |
                      satL |
                        0  |  -.0021492   .0005213    -4.12   0.000    -.0031709   -.0011275
                        1  |   .0013398   .0014432     0.93   0.353    -.0014888    .0041683
                           |
                      satB |
                        0  |  -.0021179   .0005206    -4.07   0.000    -.0031383   -.0010975
                        1  |   .0005102   .0015126     0.34   0.736    -.0024543    .0034748
              ------------------------------------------------------------------------------
              Note: dy/dx for factor levels is the discrete change from the base level.
              Do you have a source where I can read up on interpreting the output of this test, as it is totally new to me?
              As I understand it, it has to do with some non-linear properties of my function which is beyond my current level of statistical knowledge.


              Thank you very much
              Damien



              • #8
                You are misreading the table. The figures there are not the marginal effects of satL and satB; they are the marginal effects of crisis, conditional on the values of satL and satB. So, when satL = 0, the marginal effect of crisis is somewhat negative, -0.002 (95% CI -0.003 to -0.001), but when satL = 1, the marginal effect of crisis is 0.001 (95% CI -0.001 to +0.004). Analogous interpretations apply to the marginal effect of crisis when satB = 0 or satB = 1.
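
                To make the link to the regression table in #5 concrete, the satL rows can be rebuilt from the posted coefficients. The sketch below assumes -margins- averages the crisis#satB term over the sample share of satB = 1, written here as \bar{s}_B. That share is not reported anywhere in the thread; the value of roughly 0.108 is back-calculated from the reported -.0021492 and should be read as an inference, not as data.

                ```latex
                \begin{align*}
                \mathrm{ME}(\text{crisis} \mid \text{satL}=0)
                  &= b_{\text{crisis}} + b_{\text{crisis}\times\text{satB}}\,\bar{s}_B
                   = -0.0024342 + 0.0026281\,\bar{s}_B \approx -0.00215 \\
                \mathrm{ME}(\text{crisis} \mid \text{satL}=1)
                  &= b_{\text{crisis}} + b_{\text{crisis}\times\text{satL}}
                     + b_{\text{crisis}\times\text{satB}}\,\bar{s}_B
                   = 0.0010548 + 0.0026281\,\bar{s}_B \approx 0.00134
                \end{align*}
                ```

                Whatever the share is, the gap between the two rows is the crisis#satL interaction coefficient: .0013398 - (-.0021492) = .003489.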

                It dawns on me now, however, that I may not have properly understood what satB and satL are. Re-reading your earlier posts, it now seems to me that they are two indicator variables that represent two levels of a 3-level categorical variable. If that is correct, then all of this is wrong. The problem is that you can never have satL and satB both be 1, but -margins- has no way of knowing that and it does calculations without imposing that constraint. Instead, you need to have a single three-level satisfaction variable. Let's just call it sat, 0 = low, 1 = medium, 2 = high.

                Code:
                xtregar outcome i.crisis##i.sat other_variables_etc.
                margins crisis#sat // PREDICTED MARGINS ALL COMBINATIONS OF SAT & CRISIS
                margins sat, dydx(crisis) // MARGINAL EFFECTS OF CRISIS CONDITIONAL ON SAT
                margins crisis, dydx(sat) // MARGINAL EFFECTS OF SAT CONDITIONAL ON CRISIS
                If your hypothesis is supported by the data you will find that the marginal effect of crisis is greatest when sat = 0, somewhat less when sat = 1, and lowest when sat = 2.
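
                One way the three-level variable could be built from the two existing dummies is sketched below. It assumes, as stated in #3, that satL = 1 marks high satisfaction and satB = 1 marks low satisfaction, and that hand-lagged regressors named L1_norm_income2 and L1_norm_income1 exist as in the newey2 command. The label name satlbl is made up for illustration.

                ```stata
                * Sketch only: check that no observation is flagged as both high and low
                assert !(satL == 1 & satB == 1)

                * Build sat: 0 = low, 1 = medium, 2 = high (left missing if either dummy is missing)
                generate byte sat = 1 if !missing(satL, satB)
                replace sat = 0 if satB == 1
                replace sat = 2 if satL == 1
                label define satlbl 0 "low" 1 "medium" 2 "high"
                label values sat satlbl

                * ib1.sat makes medium the reference group, matching the original two-dummy setup
                xtregar norm_income2 L1_norm_income2 L1_norm_income1 csat i.crisis##ib1.sat, fe rhotype(dw)
                margins sat, dydx(crisis)
                ```

                With medium as the base level, the coefficients on the crisis#sat terms line up with the satLXcrisis and satBXcrisis terms of the earlier model, while -margins- now respects the fact that a firm cannot be in two satisfaction groups at once.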

