Directional symmetry

Omar Shaher

Join Date: Feb 2019

Posts: 164
#1

20 Dec 2019, 16:57

Dear Statalist

I have seen many researchers decompose a variable into its positive and negative values. For instance, suppose we are examining the impact of sales growth on trade credit and, before any decomposition, the coefficient of growth is negative. To get additional information about the effect of the variable, they split growth into positive growth and negative growth: the former takes the positive values of sales growth and 0 otherwise, and the latter takes the negative values and 0 otherwise. This can be done with the following code:

generate growth_positive = growth*(growth>0)
generate growth_negative = -growth*(growth<0)

So, my question is: why do they decompose it? Is it because the growth coefficient before the decomposition was negative? If so, should we also decompose growth when we get a positive coefficient?

Devoting some of your valuable time to answering my question is highly appreciated.

Looking forward to hearing from you.

Many thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

20 Dec 2019, 18:33

Well, to really know why somebody did this, you would have to ask that person.

In general, if you plan to use some variable, like growth, as a predictor in a model and if you have reason to think that the effect of growth would be different when growth is positive than it is when growth is negative, then it would make sense to have two separate variables like this. It really has nothing at all to do with whether the coefficient of the growth variable itself turned out to be negative in some other analysis.

So just think about whether positive growth and negative growth would be expected to have different effects on whatever outcome you are modeling. If so, two variables should be used. If not, leave it as one variable.

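By the way, another way to build the two pieces is with -mkspline-. Note that -mkspline- assigns the first new variable to the segment below the knot, and that the below-zero piece keeps its negative sign (it is not multiplied by -1 as in your -generate- command), so the sign of its coefficient will be reversed:

Code:

mkspline growth_negative 0 growth_positive = growth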
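
A quick way to check how the two approaches line up is to build both sets of variables on a few made-up values and compare them. This is only a sketch: the toy data and the names gpos, gneg, sp_below and sp_above are illustrative and do not come from the thread. It relies on the documented behaviour of linear -mkspline-, where the first new variable is min(x, knot) and the last is max(x, knot) minus the knot.

Code:

* toy data: growth takes the values -.2, -.1, 0, .1, .2
clear
set obs 5
generate growth = (_n - 3) / 10
* hand-built pieces, as in the -generate- commands in #1
generate gpos = growth * (growth > 0)
generate gneg = -growth * (growth < 0)
* spline pieces with a single knot at 0
mkspline sp_below 0 sp_above = growth
* the piece above the knot matches the positive part
assert reldif(sp_above, gpos) < 1e-6
* the piece below the knot is the negative part with its sign kept
assert reldif(sp_below, -gneg) < 1e-6
* the two spline pieces add back up to the original variable
assert reldif(sp_below + sp_above, growth) < 1e-6
list growth gpos gneg sp_below sp_above

If the asserts pass silently, the only difference between the two constructions is the sign of the below-zero piece, which flips the sign of its coefficient but not the fit of the model.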
Omar Shaher

Join Date: Feb 2019

Posts: 164
#3

20 Dec 2019, 19:36

Dear Prof. Clyde

My deepest respect to you and for your consideration. Thank you very much for your answer. Greatly appreciated.

Yes, indeed. What you have mentioned makes sense to me. The difficulty is that the growth variable is interacted with a dummy variable ("moderator") coded 1 for observations after an event and 0 before it. When I do the decomposition and run the model with the interaction, a large number of observations go missing. To illustrate: I am using unbalanced panel data with around 1200 observations on growth, but when I run the decomposition code, Stata reports 818 missing values. I then run the interaction model and get results, but they are confusing.
So, I am a bit confused.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

20 Dec 2019, 19:53

You're losing me here. I don't know what model you are talking about. If you show the actual exact code you are running and the actual exact output that Stata is giving you, it will be easier to give advice. When responding, please put the code and output inside code delimiters so they will align properly and be easy to read. If you are not familiar with using code delimiters, read forum FAQ #12 for instructions.

Omar Shaher

Join Date: Feb 2019

Posts: 164
#5

21 Dec 2019, 17:41

Dear Prof. Clyde,

Thank you very much for your reply.

The model that I am talking about is the generalized DID model as below:

Code:

xtreg performance i.Event##(c.centered_Age c.centered_SIZE c.growth_positive c.growth_negative) i.Year, fe vce(cluster COMPANY)

Firstly, I have centered the variables.
Secondly, I have used this code as you have mentioned above to decompose the growth variable into positive and negative values.

Code:

 mkspline growth_positive 0 growth_negative = centered_Growth

Thirdly, I have got the following results:

Code:

-----------------------------------------------------------------------------------------
                        |               Robust
            performance |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
                1.Event |   .0013532   .0156817     0.09   0.933    -.0335879    .0362942
           centered_Age |  -.9940258   .2249671    -4.42   0.001    -1.495284   -.4927678
          centered_SIZE |  -.0279756   .1013209    -0.28   0.788    -.2537327    .1977815
        growth_positive |   .1230705   .0754792     1.63   0.134    -.0451078    .2912487
        growth_negative |  -.0231748   .0113326    -2.04   0.068    -.0484254    .0020758
                        |
   Event#c.centered_Age |
                     1  |  -.0797421   .0460939    -1.73   0.114    -.1824458    .0229616
                        |
  Event#c.centered_SIZE |
                     1  |  -.0573542    .057244    -1.00   0.340    -.1849017    .0701933
                        |
Event#c.growth_positive |
                     1  |   .0064531   .0619437     0.10   0.919    -.1315661    .1444723
                        |
Event#c.growth_negative |
                     1  |   .0285022   .0088922     3.21   0.009     .0086891    .0483153
                        |
                   Year |
                  2010  |  -.0148423   .0294349    -0.50   0.625    -.0804273    .0507426
                  2011  |    .014526   .0401351     0.36   0.725    -.0749005    .1039526
                  2012  |   .0248569   .0333481     0.75   0.473    -.0494474    .0991612
                  2013  |   .0479541   .0319983     1.50   0.165    -.0233427    .1192508
                  2014  |   .0544079   .0300797     1.81   0.101    -.0126138    .1214296
                  2015  |   .0692569   .0391678     1.77   0.107    -.0180144    .1565281
                  2016  |   .0866501   .0474384     1.83   0.098    -.0190492    .1923495
                  2017  |   .0923255   .0517074     1.79   0.104    -.0228856    .2075367
                  2018  |   .1456772    .060157     2.42   0.036     .0116391    .2797152
                        |
                  _cons |   .0871435   .0385115     2.26   0.047     .0013347    .1729524
------------------------+----------------------------------------------------------------
                sigma_u |   .2248403
                sigma_e |  .03891901
                    rho |  .97090931   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------------

So, my main question is:
Is it fine to decompose the growth variable even though there is an interaction in the model?

Thank you very much in advance.


Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#6

21 Dec 2019, 17:55

Yes, it is fine. The question is whether it is necessary and meaningful. That depends on your interpretation of this output: are the differences between the growth_positive and growth_negative terms (and between their interactions with Event) large enough to matter from a practical perspective?
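
One possible way to put numbers on those differences is -lincom- after the regression. The sketch below runs on simulated data, not on the data in this thread: the panel layout, the staggered event dates, the effect sizes and the names g_below and g_above are all assumptions made only so that the commands can run on their own.

Code:

* simulated panel: 20 companies, years 2009-2018, staggered event dates
clear
set seed 12345
set obs 200
generate COMPANY = ceil(_n / 10)
bysort COMPANY: generate Year = 2008 + _n
generate Event = Year >= 2012 + mod(COMPANY, 4)
generate growth = rnormal(0, .2)
* g_below is the part of growth below 0, g_above the part above 0
mkspline g_below 0 g_above = growth
generate performance = .10 * g_above - .05 * g_below + rnormal(0, .05)
xtset COMPANY Year
xtreg performance i.Event##(c.g_below c.g_above) i.Year, fe vce(cluster COMPANY)
* before the event: slope above zero growth minus slope below it
lincom g_above - g_below
* after the event: the same contrast, including the interaction terms
lincom g_above + 1.Event#c.g_above - g_below - 1.Event#c.g_below
* slope of each piece before and after the event
margins Event, dydx(g_below g_above)

The point estimates and confidence intervals from -lincom- can then be judged against what would count as a practically important difference in the application at hand.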
Omar Shaher

Join Date: Feb 2019

Posts: 164
#7

21 Dec 2019, 18:00

Dear Prof. Clyde,

I can't thank you enough for your great efforts. Greatly appreciated.

My deepest respect to you.

Omar Shaher

Join Date: Feb 2019

Posts: 164
#8

31 Dec 2019, 08:59

Dear Prof. Clyde,
I am so sorry for asking again, but I am a bit confused about this matter.
Please see the following results of the generalized DID model.

Code:

--------------------------------------------------------------------------------
               |               Robust
   performance |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
       1.Event |   .9102377   .3949668     2.30   0.055    -.0237104    1.844186
           Age |  -1.248587   .2453343    -5.09   0.001    -1.828711   -.6684635
          size |   .1273543   .0415022     3.07   0.018     .0292172    .2254915
        growth |  -.0497433   .0701998    -0.71   0.501    -.2157395    .1162528
               |
   Event#c.Age |
            1  |    -.05529    .050027    -1.11   0.306    -.1735851    .0630051
               |
  Event#c.size |
            1  |  -.1086234   .0452957    -2.40   0.048    -.2157308    -.001516
               |
Event#c.growth |
            1  |  -.1197868   .0581045    -2.06   0.078    -.2571822    .0176086
               |
          Year |
         2010  |  -.0267569    .026015    -1.03   0.338    -.0882725    .0347588
         2011  |   .0091717    .030344     0.30   0.771    -.0625804    .0809238
         2012  |   .0282719   .0247852     1.14   0.292    -.0303357    .0868795
         2013  |   .0634201   .0306809     2.07   0.078    -.0091285    .1359688
         2014  |   .0683174   .0228519     2.99   0.020     .0142813    .1223535
         2015  |   .0713287   .0368078     1.94   0.094     -.015708    .1583654
         2016  |   .1047946   .0440577     2.38   0.049     .0006146    .2089746
         2017  |   .1126851   .0417439     2.70   0.031     .0139764    .2113937
         2018  |   .1473622    .055554     2.65   0.033     .0159979    .2787265
               |
         _cons |   2.133815   .6458033     3.30   0.013     .6067325    3.660897
---------------+----------------------------------------------------------------
       sigma_u |  .33539484
       sigma_e |  .03676929
           rho |  .98812403   (fraction of variance due to u_i)
--------------------------------------------------------------------------------

As you can see, the coefficient of size was positive before the event and the coefficient of the Event#c.size interaction is negative. I understand that the negative coefficient for Event#c.size means that once the event takes effect, the slope of the performance-size relationship decreases. That may mean that it is less positive than before the event, or it might even go all the way to a negative value.

But when I turned to the literature, studies of similar cases found the same thing (i.e., positive before the event and negative after it). One school of the literature states that a decrease in size after the adoption causes an increase in performance, while a second school states that an increase in size after the adoption causes a decrease in performance.

I am a bit confused because, as far as I know, the effect is on the slope of the relationship, not on the variables themselves. How do I decide what happened to the variables?
Do you think I should run independent-samples t-tests for size before and after, and for performance before and after, and compare the mean values before and after the event? I did that, and I found that the average values of both size and performance are lower after the event, where the average values decreased significantly for the former, and slightly for the latter. However, there is no significant difference for the size.
What do you think? I am confused because I need to find out which variable increases and which one decreases after the event.

Devoting some of your valuable time to answering my question is highly appreciated.

Many thanks in advance.


Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#9

31 Dec 2019, 11:14

In your model the effect of size before the event is about +0.13, and after the event it is about 0.13 - 0.11, or about 0.02. So it is positive in both conditions, but much smaller after the event. To get these out of Stata, after your regression, use the -margins- command:

Code:

margins Event, dydx(size)

If you are specifically interested in the difference in effect between before and after the event, you get that directly from your regression output as the coefficient of 1.Event#c.size = -0.11 (to 2 decimal places) with a 95% CI of -0.22 to -0.00 (again to 2 decimal places).

I think that statistics showing the distribution of size before and after the event are interesting background that your audience would likely want to know, but they do not shed light on the effect of size on performance in either condition, nor on whether that effect changed after the event.
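
As a worked check of that arithmetic, the post-event slope of size is the coefficient of size plus the coefficient of 1.Event#c.size. The numbers below are copied from the regression table posted in #8; the -lincom- line is left as a comment because it assumes that regression is still the active estimation result.

Code:

* slope of size before the event: the coefficient of size
display .1273543
* slope of size after the event: add the coefficient of 1.Event#c.size
display .1273543 + (-.1086234)
* with the model from #8 still in memory, the same sum with a standard error:
* lincom size + 1.Event#c.size

The second value is about 0.019, so the slope stays positive after the event but is much closer to zero.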
Omar Shaher

Join Date: Feb 2019

Posts: 164
#10

31 Dec 2019, 13:57

Dear Prof. Clyde,

Thank you very much for your reply. Greatly appreciated.

(1) That’s very interesting. So what exactly is -margins- telling me?

(2) After running -margins-, as below, if the effect of size on performance is positive, what does that mean? Does it mean that an increase in size causes an increase in performance?

Code:

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Age          |
       Event |
          0  |  -1.248587   .2453343    -5.09   0.000    -1.729434   -.7677406
          1  |  -1.303877   .2888089    -4.51   0.000    -1.869932   -.7378219
-------------+----------------------------------------------------------------
size         |
       Event |
          0  |   .1273543   .0415022     3.07   0.002     .0460115    .2086972
          1  |    .018731   .0054714     3.42   0.001     .0080072    .0294547
-------------+----------------------------------------------------------------
growth       |
       Event |
          0  |  -.0497433   .0701998    -0.71   0.479    -.1873324    .0878458
          1  |  -.1695302    .049969    -3.39   0.001    -.2674675   -.0715928
------------------------------------------------------------------------------

(3) Let’s look at Age: although the Age term after the event is not significant in #8, in the -margins- results it is significant. Is that OK, or should the results be consistent?

(4) Sure, but with statistics showing the distribution of size before and after (the independent-samples t-test), any difference in the size distribution will be attributed to the EVENT if the difference is statistically significant. But what if there is a difference in average values and the difference is not statistically significant? Does that mean the difference in average values is attributed to something else?

I am so sorry for my questions and for any inconvenience caused, but I am still confused. I want to know which variable increased and which variable decreased after the adoption (the dependent or the independent).

Million thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#11

31 Dec 2019, 15:14

So looking at age, you have a negative effect both before and after the event. It means that, all else equal, increasing age is associated with decreasing performance, both before and after the event. (Don't speak of causation--this is observational data.) There is no inconsistency between the -margins- output and the output in #8. The regression coefficient of Age is equal to what you see for the marginal effect of Age when Event = 0, as it should be. The marginal effect of Age when Event = 1 does not correspond to any single coefficient in the regression output: it is the sum of the regression coefficients of Age and 1.Event#c.Age. And if you add the -1.248... and the -0.05529 from #8, you do indeed get the -1.303877 that you see in the -margins- output. Everything is fitting together just as it should.

(4) Sure, but with statistics showing the distribution of size before and after (the independent-samples t-test), any difference in the size distribution will be attributed to the EVENT if the difference is statistically significant. But what if there is a difference in average values and the difference is not statistically significant? Does that mean the difference in average values is attributed to something else?

None of this is correct. First of all you are interpreting statistical significance in a way that is completely wrong. (I won't here give you my long rant about why you shouldn't even be thinking about statistical significance in the first place.) Suffice it to say that differences in the distribution of size before and after the event do not necessarily get attributed to the event itself, and whether the differences are "statistically significant" or not makes no difference in this regard. The differences may well represent the playing out of trends that had nothing to do with the event. If most firms were growing over the entire period of your study, then you would see a pre-post event difference that has nothing to do with the event.
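
The point about trends can be illustrated with made-up data. In the sketch below, size grows steadily in every firm and the event is given no effect on it at all, yet a before/after t-test still shows a clear difference in means. The firm counts, years, growth rate and variable names are all assumptions chosen only for the illustration.

Code:

* simulated panel: 50 firms, years 2009-2018, event from 2014 onward
clear
set seed 2019
set obs 500
generate firm = ceil(_n / 10)
bysort firm: generate year = 2008 + _n
generate post = year >= 2014
* size follows a common upward trend; the event does nothing to it
generate size = 10 + .3 * (year - 2009) + rnormal(0, 1)
* naive before/after comparison: picks up the trend
ttest size, by(post)
* once the trend is modelled, little should be left for the post indicator
regress size i.post c.year

In this setup the difference reported by -ttest- reflects only the passage of time, which is why such a comparison cannot by itself be attributed to the event.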
Omar Shaher

Join Date: Feb 2019

Posts: 164
#12

31 Dec 2019, 19:26

Dear Prof. Clyde,
Thank you very much for your answers.
Now the picture is becoming clearer than before. I am so grateful to you.

So, to sum up:
(1) The coefficient for Age in the regression output is the effect of Age when Event = 0, i.e., before the event.

(2) The coefficient for Event#c.Age in the regression output is the INCREMENTAL change in that effect after the event (here, a decrease).

(3) The -margins- result for Age when Event = 1 is the effect of Age after the event.

(4) The -margins- table gives the effect of each independent variable on the dependent variable before and after the event, because for Event = 1 it shows the coefficients after the summation. So it is better to look at the -margins- output if I am not interested in the difference in effects.

(5) Most importantly, let’s assume that the dy/dx for Age when Event = 0 is positive, which means increasing age is associated with increasing performance before the event. If the dy/dx for Age when Event = 1 is negative, does that mean increasing age is associated with decreasing performance after the event, OR should we say decreasing age is associated with increasing performance AFTER the event?

(6) The independent-samples t-test tells me the difference in distributions before and after the event, and that difference is not necessarily attributable to the event.

If any of the points above are incorrect, please let me know, and I need your answer for point (5) as well, please.

Thank you very much.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#13

31 Dec 2019, 19:32

All of the above points are correct.

Regarding 5., whether you say that increasing age is associated with decreasing performance after the event or that decreasing age is associated with increasing performance after the event is a matter of taste or emphasis. They mean exactly the same thing.
Omar Shaher

Join Date: Feb 2019

Posts: 164
#14

01 Jan 2020, 08:31

Dear Prof. Clyde,

I have learned from you so much, I don't know how to thank you. I will be grateful to you for the rest of my life.

If there is no statistical method to determine which variable has decreased or increased as a result of the event, and the goal of the DID method is to examine the impact on the relationship, then I think I could take the observations for all companies before the event and run a multivariate regression, then take the observations for all companies after the event and run another, without including the event variable in either model. The results should be qualitatively similar.

Do you think this makes sense or not?

Thank you.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#15

01 Jan 2020, 14:48

You have described this in fairly vague general terms, so I don't really know what you have in mind here. If you showed specific code it would be possible to comment.
