Asymmetric effects in FE model (Interaction of Dummies)

Damien Schmidt

Join Date: Aug 2017

Posts: 4
#1

20 Aug 2017, 09:47

Hi Stata-Forum Members,

I have created a dataset on 112 companies across 84 quarters with data on
Customer Satisfaction (CS), Cashflows (CF), Earnings (E), R&D (RD), the Concentration (Conc) of the industry, and Crisis, a dummy variable that is 1 when there was a crisis and 0 otherwise.
The dataset is not balanced, since some companies only have data for some of the years. (The average number of quarters observed per company is around 60.)

Here are the two things that I wanted to test:

1. Does Customer Satisfaction have a positive impact on cash flows?
1b) Is this relationship non-linear?

2. In times of a crisis, high CS mitigates the negative effects of the crisis.
2b) In times of a crisis, low CS does not exacerbate the negative effects of the crisis.

To test these hypotheses I used the following econometric model:

(1) CF_i,t = a_i,t + b₁*CF_i,t-1 + b₂*E_i,t-1 + b₃*CS_i,t + b₄*CS_i,t² + v_i + e_i,t

where a is the intercept, b₁ and b₂ capture past cash flows and earnings, and
b₃ and b₄ are the coefficients of interest.

I estimated this model with xtregar and, since it is a dynamic model, with xtabond. Both sets of results indicate that b₃ and b₄ are positive and significant.

(Q1) I hope these findings support my Hypothesis 1 and 1b).

Now comes the problem: I don't want to test whether CS has a general effect during a crisis; I want to check whether high CS reduces the effects of a crisis and whether low CS doesn't exacerbate them. Therefore I tested the following model:

(2) CF_i,t = a_i,t + b₁*CF_i,t-1 + b₂*E_i,t-1 + b₃*CS_i,t + b₄*Crisis + b₅*HighCS + b₆*LowCS + b₇*Crisis*HighCS + b₈*LowCS*Crisis + v_i + e_i,t

The outcomes indicate that:
b₄ is significant and negative
b₅ is significant and positive
b₆ is insignificant
b₇ is significant and positive
b₈ is significant and positive

HighCS and LowCS are both dummies. Since I have 3 groups (HighCS, NormalCS, LowCS), I included 2 dummies to compare these groups with the group of firms whose customers are averagely satisfied.

(Q2) Can I draw the conclusion from my results that when there is a Crisis(Dummy=1) companies that have a high level of customer satisfaction (Dummy=1) suffer less from the negative effects of a crisis?

I am just so unsure because the Crisis dummy marks certain points in time (6 quarters), while the high customer satisfaction dummy is coded across the whole dataset with regard to time t.

I already looked at:
https://www.statalist.org/forums/for...dummy-variable
but it didn't clarify for me whether my model is set up correctly.

If you need any stata-outputs I can provide them as well.

Kind regards,
Damien
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

20 Aug 2017, 10:33

1. Does Customer Satisfaction have a positive impact on cash flows?
1b) Is this relationship non-linear?
...
To test these hypotheses I used the following econometric model:

(1) CF_i,t = a_i,t + b₁*CF_i,t-1 + b₂*E_i,t-1 + b₃*CS_i,t + b₄*CS_i,t² + v_i + e_i,t
...
I estimated this model with xtregar and, since it is a dynamic model, with xtabond. Both sets of results indicate that b₃ and b₄ are positive and significant.

(Q1) I hope these findings support my Hypothesis 1 and 1b).

It does support 1b. As for 1a, maybe and maybe not. Since b4 is positive, the relationship between CF and CS, if graphed, is a parabola, so as CS increases CF actually decreases at first, reaches a minimum when CS = -b3/(2*b4), and then increases after that. So, if -b3/(2*b4) is less than any observed value of CS, you have an increasing curvilinear relationship between CF and CS. But if -b3/(2*b4) is greater than all observed values of CS, then you have a decreasing curvilinear relationship between CF and CS. Finally, if -b3/(2*b4) falls within the range of observed CS values, you have a U-shaped relationship.
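A minimal sketch of how that check might look in Stata, run on simulated data rather than the poster's dataset. The variables x and y and the coefficients 0.5 and 0.2 are made up for illustration; x plays the role of CS and y the role of CF.

```stata
* Simulated example only: x stands in for CS, y for CF (hypothetical data)
clear
set seed 12345
set obs 500
generate x = rnormal()
generate y = 0.5*x + 0.2*x^2 + rnormal()

* Quadratic fit using factor-variable notation for the squared term
regress y c.x##c.x

* Turning point -b3/(2*b4), with a delta-method standard error
nlcom (turning_point: -_b[x]/(2*_b[c.x#c.x]))

* Compare the estimated turning point with the observed range of x
summarize x
```

With a squared term built by hand, as in this thread, the same idea would presumably read nlcom -_b[csat]/(2*_b[csatq]) after the fit, followed by summarize csat to see where the turning point falls relative to the minimum and maximum.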

(Q2) Can I draw the conclusion from my results that when there is a Crisis(Dummy=1) companies that have a high level of customer satisfaction (Dummy=1) suffer less from the negative effects of a crisis?

That looks correct.

Speaking only for myself, and others may disagree, I find equations with subscripts disagreeable to read. I would have found your post easier to follow had you posted the actual Stata commands you ran and the actual Stata output they led to. A side benefit of doing it that way is that, for now, I can only answer your questions conditional on the assumption that you have actually implemented them correctly in unseen code. Had you shown the code and output, I could answer your questions conditional on what you actually did! (And perhaps I could have offered some tips on improving the code.)

Damien Schmidt

Join Date: Aug 2017
Posts: 4

20 Aug 2017, 11:32

Hi Mr. Schechter,

First of all, thank you for your consideration and for taking time on a Sunday to help other people with their statistical problems.

Speaking only for myself, and others may disagree, I find equations with subscripts disagreeable to read. I would have found your post easier to follow had you posted the actual Stata commands you ran and the actual Stata output they led to

Thank you for your feedback. Below I have posted the code that I used and the results it produced:

This was the code that I ran to estimate my model (1) and to check for Hypothesis 1:

Code:

xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat csatq, fe rhotype(dw)

where norm_income2 is CF and norm_income1 is earnings; csat is CS and csatq is CS².

Code:

FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
Group variable: newid                           Number of groups  =        121

R-sq:                                           Obs per group:
     within  = 0.3243                                         min =         11
     between = 0.9824                                         avg =       57.6
     overall = 0.6996                                         max =         83

                                                F(4,6846)         =     821.46
corr(u_i, Xb)  = 0.7906                         Prob > F          =     0.0000

------------------------------------------------------------------------------
norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
norm_income2 |
         L1. |   .5241718   .0106318    49.30   0.000     .5033302    .5450134
             |
norm_income1 |
         L1. |   .0205546   .0064437     3.19   0.001      .007923    .0331862
             |
        csat |   .0003608   .0000618     5.84   0.000     .0002397    .0004818
       csatq |   .0000156   4.86e-06     3.20   0.001     6.04e-06    .0000251
       _cons |   .0166367    .000452    36.81   0.000     .0157507    .0175227
-------------+----------------------------------------------------------------
      rho_ar | -.09329214
     sigma_u |  .01084567
     sigma_e |  .01272081
     rho_fov |  .42093235   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(120,6846) = 12.81                   Prob > F = 0.0000

Including the control variables leads to similar results but with fewer observations, since R&D wasn't available for every quarter for every firm. The number of observations with R&D drops to roughly 800, which is why I use the model with R&D only for robustness checks.

My professor also told me to look briefly into

Code:

-xtabond

which I did.
The results hold under different values of maxlags() and maxldep().

Furthermore I also checked whether the results are robust w.r.t. heteroskedasticity and with controls by running the same regression with the code:

Code:

newey2 norm_income2 L1_norm_income2 L1_norm_income1 csat csatq L4_rdint conc, lag(1) i(newid) t(date) force noconstant level (95)

with the following results:

Code:

Regression with Newey-West standard errors          Number of obs  =       846
maximum lag : 1                                     F(  6,   840)  =   1910.89
                                                    Prob > F       =    0.0000

---------------------------------------------------------------------------------
                |             Newey-West
   norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
L1_norm_income2 |   .9044428    .030303    29.85   0.000     .8449644    .9639213
L1_norm_income1 |  -.0794207   .0338772    -2.34   0.019    -.1459146   -.0129268
           csat |   .0001698   .0001102     1.54   0.124    -.0000465    .0003861
          csatq |   .0000424   .0000149     2.85   0.005     .0000132    .0000717
       L4_rdint |   .0708861   .0326583     2.17   0.030     .0067846    .1349876
           conc |    .019009   .0064575     2.94   0.003     .0063343    .0316838
---------------------------------------------------------------------------------

This is for model (1).
Do the interpretations that I presented in my first post hold? (How do I explicitly check in Stata whether increases in CS might decrease CF at a certain level?)

For model (2) I ran the following baseline regression:

Code:

xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat crisis satL satB satLXcrisis satBXcrisis, fe rhotype(dw)

with the following output:

Code:

FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
Group variable: newid                           Number of groups  =        121

R-sq:                                           Obs per group:
     within  = 0.3255                                         min =         11
     between = 0.9850                                         avg =       57.6
     overall = 0.7011                                         max =         83

                                                F(8,6842)         =     412.64
corr(u_i, Xb)  = 0.7946                         Prob > F          =     0.0000

------------------------------------------------------------------------------
norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
norm_income2 |
         L1. |   .5232726   .0106194    49.28   0.000     .5024552    .5440899
             |
norm_income1 |
         L1. |   .0179756   .0064605     2.78   0.005     .0053111    .0306402
             |
        csat |   .0002703   .0000669     4.04   0.000     .0001392    .0004015
      crisis |  -.0024342   .0005556    -4.38   0.000    -.0035233   -.0013451
        satL |   .0021554   .0007923     2.72   0.007     .0006023    .0037085
        satB |   .0012704   .0008202     1.55   0.121    -.0003375    .0028782
 satLXcrisis |    .003489   .0015405     2.26   0.024     .0004691    .0065088
 satBXcrisis |   .0026281   .0016063     1.64   0.102    -.0005209     .005777
       _cons |   .0170454   .0004394    38.79   0.000      .016184    .0179068
-------------+----------------------------------------------------------------
      rho_ar | -.09167241
     sigma_u |  .01088445
     sigma_e |  .01270257
     rho_fov |   .4233738   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(120,6842) = 12.85                   Prob > F = 0.0000

I constructed the interaction terms by hand, but that should be the same as letting Stata do it, right? satL marks firms with high CS and satB marks firms with low CS. I hope this is not too confusing ^^
Is it correct to interpret these results as I did above under Q2 (that being a satL firm decreases the impact of a crisis, while being a satB firm does not exacerbate the effects of a crisis, relative to the remaining companies that are neither satB nor satL)?

This would only be the baseline model, and the models with the controls will be done on basis of these regressions.

I hope this helps and the code and output clarified my initial questions.

(And perhaps I could have offered some tips on improving the code.)

I would be very happy if you have any improvements.
Thank you again for your consideration,

Damien

Last edited by Damien Schmidt; 20 Aug 2017, 11:35.


Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

20 Aug 2017, 11:55

So in the -newey2- model the minimum of the parabola is when csat = -2.00, and in the -xtregar- model it's when csat = -11.56. I don't know what the range of values of your csat variable is, but on the guess that 0 is the minimum possible, this puts the minimum safely to the left of the entire range. That in turn implies that the outcome variable increases curvilinearly as a function of csat throughout the observed range.
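For anyone retracing those two turning points, they follow from -b3/(2*b4) applied to the coefficients printed in the two tables above. The sketch below is plain arithmetic on the reported (rounded) coefficients, not a re-estimation.

```stata
* Turning point of b3*csat + b4*csat^2 is at csat = -b3/(2*b4)

* -xtregar- table: csat = .0003608, csatq = .0000156
display -.0003608/(2*.0000156)

* -newey2- table: csat = .0001698, csatq = .0000424
display -.0001698/(2*.0000424)
```

These come out at roughly -11.56 and -2.00, in line with the figures quoted above.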

Your interpretation of the last model also seems correct.
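The reading behind that interpretation can be spelled out as sums of coefficients: in a linear model with these interactions, the crisis effect for a group is the crisis coefficient plus that group's interaction coefficient. The sketch below only redoes this arithmetic with the point estimates copied from the -xtregar- table above; the scalar names are made up for illustration.

```stata
* Point estimates copied from the -xtregar- table (scalar names are hypothetical)
scalar b_crisis  = -.0024342
scalar b_satLXcr =  .003489
scalar b_satBXcr =  .0026281

* Crisis effect for firms that are neither satL nor satB (reference group)
display "reference group: " (b_crisis)

* Crisis effect for high-satisfaction firms (satL == 1)
display "satL == 1:       " (b_crisis + b_satLXcr)

* Crisis effect for low-satisfaction firms (satB == 1)
display "satB == 1:       " (b_crisis + b_satBXcr)
```

The sums are about .00105 for satL and .00019 for satB. To get standard errors and confidence intervals for them, something like lincom crisis + satLXcrisis directly after the regression should work, assuming the hand-built interaction names used in this thread.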

I constructed the interaction terms by hand, but that should be the same as letting Stata do it, right?

As far as you have gone, there is no difference. But typically in models with interaction terms there is interest in graphing the predicted values and calculating the marginal effects of the variables that are involved in interaction. You can do that with what you've got, but it's a lot of work. If you were to redo the model with factor variable notation, it becomes simplicity itself:

Code:

xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat i.crisis##(i.satL i.satB), fe rhotype(dw)
margins crisis#satL crisis#satB
marginsplot
margins satL satB, dydx(crisis)

Damien Schmidt

Join Date: Aug 2017
Posts: 4

20 Aug 2017, 13:04

Thank you for your response, Mr Schechter,

So in the -newey2- model the minimum of the parabola is when csat = -2.00, and in the -xtregar- model it's when csat = -11.56. I don't know what the range of values of your csat variable is, but on the guess that 0 is the minimum possible, this puts the minimum safely to the left of the entire range. That in turn implies that the outcome variable increases curvilinearly as a function of csat throughout the observed range.

1.

Since I wanted to know the impact of a change in CS when CS² is 0, I subtracted the mean from my CS values, so that 0 lies within the observed values. (Does this make sense?)
Therefore, do I need to rerun the regression with the actual values to know whether CF is strictly increasing in CS?

2.

For the code you suggested:

Code:

 
xtregar norm_income2 L1.norm_income2 L1.norm_income1 csat i.crisis##(i.satL i.satB), fe rhotype(dw)
margins crisis#satL crisis#satB
marginsplot
margins satL satB, dydx(crisis)

my stata returns:

Code:

FE (within) regression with AR(1) disturbances  Number of obs     =      6,971
Group variable: newid                           Number of groups  =        121

R-sq:                                           Obs per group:
     within  = 0.3255                                         min =         11
     between = 0.9850                                         avg =       57.6
     overall = 0.7011                                         max =         83

                                                F(8,6842)         =     412.64
corr(u_i, Xb)  = 0.7946                         Prob > F          =     0.0000

------------------------------------------------------------------------------
norm_income2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
norm_income2 |
         L1. |   .5232726   .0106194    49.28   0.000     .5024552    .5440899
             |
norm_income1 |
         L1. |   .0179756   .0064605     2.78   0.005     .0053111    .0306402
             |
        csat |   .0002703   .0000669     4.04   0.000     .0001392    .0004015
    1.crisis |  -.0024342   .0005556    -4.38   0.000    -.0035233   -.0013451
      1.satL |   .0021554   .0007923     2.72   0.007     .0006023    .0037085
      1.satB |   .0012704   .0008202     1.55   0.121    -.0003375    .0028782
             |
 crisis#satL |
        1 1  |    .003489   .0015405     2.26   0.024     .0004691    .0065088
             |
 crisis#satB |
        1 1  |   .0026281   .0016063     1.64   0.102    -.0005209     .005777
             |
       _cons |   .0170454   .0004394    38.79   0.000      .016184    .0179068
-------------+----------------------------------------------------------------
      rho_ar | -.09167241
     sigma_u |  .01088445
     sigma_e |  .01270257
     rho_fov |   .4233738   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(120,6842) = 12.85                   Prob > F = 0.0000

. 
. margins crisis#satL crisis#satB
default prediction is a function of possibly stochastic quantities other than e(b)
r(498);

. marginsplot
previous command was not margins
r(301);

. 
. margins satL satB, dydx(crisis)
default prediction is a function of possibly stochastic quantities other than e(b)
r(498);

. margins crisis#satL crisis#satB
default prediction is a function of possibly stochastic quantities other than e(b)
r(498);

For the first part: thank you for giving me better code.

This will surely help in future Stata analyses.

For the latter part with the error:
I checked the help file but couldn't find a way to circumvent this error. Can you explain what might have gone wrong?
I searched the internet, and the only explanation I found is that this error arises when the model also contains random effects. This is surprising, since I explicitly used a fixed-effects model. Could you explain what I might be missing here?

Thank you a lot
Damien


Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#6

20 Aug 2017, 13:32

Since I wanted to know the impact of a change in CS when CS² is 0, I subtracted the mean from my CS values, so that 0 lies within the observed values. (Does this make sense?)
Therefore, do I need to rerun the regression with the actual values to know whether CF is strictly increasing in CS?

Well, that would be one way to do it. But you can just use a little algebra here. If the turning point is when csat = -2, and csat is the actual cs minus the mean, then that's actual cs - mean = -2, actual cs = mean - 2. So just add the value of the mean you originally subtracted to the -2 and -11.56 respectively to get the corresponding turning points in terms of real cs.
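A small sketch of that back-transformation, on simulated data. The raw score cs_raw and its scale (mean near 70) are invented for illustration; only the turning points -11.56 and -2.00 come from this thread.

```stata
* Hypothetical raw satisfaction score; the real variable and scale are not shown in the thread
clear
set seed 2017
set obs 200
generate cs_raw = 70 + 5*rnormal()

* Store the mean that was subtracted to build the centered csat
summarize cs_raw, meanonly
scalar cs_mean = r(mean)
generate csat = cs_raw - cs_mean

* Turning points on the raw scale = mean + turning point on the centered scale
display "xtregar turning point, raw scale: " %9.2f (cs_mean - 11.56)
display "newey2 turning point, raw scale:  " %9.2f (cs_mean - 2.00)

* Compare with the observed minimum and maximum of the raw score
summarize cs_raw
```

If both back-transformed turning points lie below the observed minimum, the fitted relationship is increasing over the whole observed range; if one falls inside the range, the relationship is U-shaped there.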

As for the latter part, I'm not really sure what is going wrong here. This message sometimes comes when -margins- is applied to models with random or fixed effects that are not actually estimable. But -xtregar- does support estimation of the intercepts and the errors, so that doesn't seem to be the issue. It may have to do with the lag terms. I'm not really sure, and I don't know what to suggest here. I'm sorry I led you down this path but don't know how to lead you to its end.
Damien Schmidt

Join Date: Aug 2017

Posts: 4
#7

20 Aug 2017, 14:42

Hi Mr. Schechter,

thank you for leading me down the path!

I actually found a way to circumvent the problem. It's the same way I dealt with the problems of newey2 and lagged variables.
I just created the lagged variables by hand and then included them in the model. After that the code worked fine.
I hope this helps others who come across the same problem with this code and lagged variables.
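For readers who want to see the workaround in code, here is a rough sketch on a simulated panel. All names (id, t, y, L1_y) and the data are hypothetical stand-ins; the point is only that the lag is created once as an ordinary variable instead of being written as L1.y inside the estimation command.

```stata
* Simulated panel: 20 hypothetical firms observed for 30 quarters each
clear
set seed 2017
set obs 20
generate id = _n
expand 30
bysort id: generate t = _n
xtset id t

generate y      = rnormal()             // stand-in for norm_income2
generate crisis = inrange(t, 10, 15)    // stand-in crisis quarters
generate satL   = runiform() < 0.3      // stand-in high-satisfaction dummy

* Build the lag by hand; after -xtset- the L1. operator respects panel boundaries
generate L1_y = L1.y

* Use the hand-built lag in the model, then call -margins- as before
xtregar y L1_y i.crisis##i.satL, fe rhotype(dw)
margins satL, dydx(crisis)
```

In terms of the variables in this thread, that would presumably mean generating L1_norm_income2 and L1_norm_income1 once and using those names in the xtregar call, as was already done for newey2.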

However I have a question w.r.t. the results:
If the average marginal effect is not significant, what does it actually measure? I only found this discussion, which sheds some light on my question:

https://www.statalist.org/forums/forum/general-stata-discussion/general/1329201-marginal-effects-significance-vs-original-model-effects-significance

However, I'm not sure how to transfer the points of that discussion to my original problem.

Since the 95% confidence intervals for satL and satB at crisis=1 both contain 0, the null hypothesis that satL/satB have no effect in times of a crisis cannot be rejected?
This would be a blow to the validity of my model, right? I would highly appreciate some clarification.

Here is the output:

Code:

. margins satL satB, dydx(crisis)

Average marginal effects                        Number of obs     =      7,092

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 1.crisis

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
0.crisis     |  (base outcome)
-------------+----------------------------------------------------------------
1.crisis     |
        satL |
          0  |  -.0021492   .0005213    -4.12   0.000    -.0031709   -.0011275
          1  |   .0013398   .0014432     0.93   0.353    -.0014888    .0041683
             |
        satB |
          0  |  -.0021179   .0005206    -4.07   0.000    -.0031383   -.0010975
          1  |   .0005102   .0015126     0.34   0.736    -.0024543    .0034748
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

Do you have a source where I can read up on interpreting the output of this command, as it is totally new to me?
As I understand it, it has to do with some non-linear properties of my function, which is beyond my current level of statistical knowledge.

Thank you very much
Damien
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#8

20 Aug 2017, 15:17

You are misreading the table. The figures there are not the marginal effects of satL and satB; they are the marginal effects of crisis, conditional on the values of satL and satB. So, when satL = 0, the marginal effect of crisis is somewhat negative, -0.002 (95% CI -0.003 to -0.001), but when satL = 1, the marginal effect of crisis is 0.001 (95% CI -0.001 to +0.004). Analogous interpretations apply to the marginal effect of crisis when satB = 0 or satB = 1.

It dawns on me now, however, that I may not have properly understood what satB and satL are. Re-reading your earlier posts, it now seems to me that they are two indicator variables that represent two levels of a 3-level categorical variable. If that is correct, then all of this is wrong. The problem is that you can never have satL and satB both be 1, but -margins- has no way of knowing that and it does calculations without imposing that constraint. Instead, you need to have a single three-level satisfaction variable. Let's just call it sat, 0 = low, 1 = medium, 2 = high.

Code:

xtregar outcome i.crisis##i.sat other_variables_etc.
margins crisis#sat // PREDICTED MARGINS ALL COMBINATIONS OF SAT & CRISIS
margins sat, dydx(crisis) // MARGINAL EFFECTS OF CRISIS CONDITIONAL ON SAT
margins crisis, dydx(sat) // MARGINAL EFFECTS OF SAT CONDITIONAL ON CRISIS

If your hypothesis is supported by the data you will find that the marginal effect of crisis is greatest when sat = 0, somewhat less when sat = 1, and lowest when sat = 2.
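One possible way to build that three-level variable from the two existing indicators is sketched below. The three input rows are toy data covering the three possible groups; the label name satlbl is made up, and the coding 0 = low, 1 = medium, 2 = high follows the suggestion above.

```stata
* Toy rows covering the three groups (hypothetical data)
clear
input satL satB
0 0
1 0
0 1
end

* The two indicators must never both be 1
assert !(satL == 1 & satB == 1)

* Collapse the two dummies into one categorical variable
generate byte sat = 1            // medium: neither dummy set
replace sat = 0 if satB == 1     // low satisfaction
replace sat = 2 if satL == 1     // high satisfaction

label define satlbl 0 "low" 1 "medium" 2 "high"
label values sat satlbl
tabulate sat
```

By default i.sat uses the lowest level (low) as the base category. To keep the medium group as the comparison group, as in the original two-dummy model, ib1.sat can be used in place of i.sat.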
