  • Directional symmetry

    Dear Statalist

    I have seen many researchers decompose a variable into its positive and negative values. For instance, suppose we are examining the impact of sales growth on trade credit, and before the decomposition the coefficient of growth is negative. To get additional information about the effect of the variable, they decompose growth into positive growth and negative growth: the former takes the positive values of sales growth and 0 otherwise, and the latter takes the negative values and 0 otherwise. This can be done with the following code:

    generate growth_positive = growth*(growth>0)
    generate growth_negative = -growth*(growth<0)
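
    A minimal sketch of this decomposition on made-up data may help show what the two variables contain. The toy values and the checks are assumptions added for illustration; note that, as written, growth_negative stores the size of the decline as a positive number, and missing growth stays missing in both pieces.

    ```stata
    * Toy data (hypothetical values): growth from -0.2 to 0.2, plus one missing
    clear
    set obs 6
    generate growth = (_n - 3) / 10
    replace growth = . in 6

    * Decomposition as in the post above
    generate growth_positive = growth*(growth>0)
    generate growth_negative = -growth*(growth<0)

    * Missing growth stays missing in both pieces
    assert missing(growth_positive) & missing(growth_negative) if missing(growth)
    * The two pieces recombine into the original variable
    assert growth == growth_positive - growth_negative if !missing(growth)
    list
    ```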

    So my question is: why do they decompose it? Is it because the growth coefficient before the decomposition was negative? If so, should we also decompose growth when we get a positive coefficient?

    Devoting some of your valuable time to answering my question is highly appreciated.

    Looking forward to hearing from you.

    Many thanks in advance.

  • #2
    Well, to really know why somebody did this, you would have to ask that person.

    In general, if you plan to use some variable, like growth, as a predictor in a model and if you have reason to think that the effect of growth would be different when growth is positive than it is when growth is negative, then it would make sense to have two separate variables like this. It really has nothing at all to do with whether the coefficient of the growth variable itself turned out to be negative in some other analysis.

    So just think about whether positive growth and negative growth would be expected to have different effects on whatever outcome you are modeling. If so, two variables should be used. If not, leave it as one variable.

    By the way, another way to get these same results is with:
    Code:
    mkspline growth_positive 0 growth_negative = growth
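
    A caution on naming, offered as a sketch rather than a definitive reading of the manual: for a linear spline with a single knot, -mkspline- is documented to put min(x, knot) in the first new variable and max(x, knot) - knot in the second. If that holds, the first-listed variable carries the part of growth at or below 0 (keeping its negative sign, unlike the -generate- version), so it is worth confirming which variable holds which part. The names below0 and above0 are hypothetical.

    ```stata
    * Toy data (hypothetical values): growth from -0.2 to 0.2
    clear
    set obs 5
    generate growth = (_n - 3) / 10

    * First-listed variable gets the part at or below the knot
    mkspline below0 0 above0 = growth

    * Check which piece is which
    assert below0 == min(growth, 0)
    assert above0 == max(growth, 0)
    assert growth == below0 + above0
    list
    ```

    Running -summarize- on the two new variables after -mkspline- is a quick way to make the same check on real data.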

    • #3
      Dear Prof. Clyde

      My deepest respect to you and for your consideration. Thank you very much for your answer. Greatly appreciated.

      Yes, indeed. What you have said makes sense to me, but there is an interaction between the growth variable and a dummy variable, "moderator", coded one for observations after an event and zero before it. If I do the decomposition and run the model with the interaction, a huge number of observations go missing. To illustrate: I am using unbalanced panel data, and growth has around 1200 observations, but when I run the decomposition code, Stata tells me that 818 observations are missing. I then run the interaction model and get results, but they are confusing.
      So, I am a bit confused.
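
      One way to see where such missing values come from is to compare the number of missing values in the original variable with the number in the decomposed variables. The sketch below uses made-up data in which growth is missing in 4 of 10 observations (an assumption for illustration, not the data in this thread); if the counts agree, the decomposition has only inherited missing values that were already in growth.

      ```stata
      * Toy data (hypothetical): 10 observations, growth missing in 4 of them
      clear
      set obs 10
      generate growth = (_n - 5) / 10
      replace growth = . if _n > 6
      generate growth_positive = growth*(growth>0)
      generate growth_negative = -growth*(growth<0)

      * The three counts agree when missings are simply inherited from growth
      count if missing(growth)
      count if missing(growth_positive)
      count if missing(growth_negative)

      * Compact overview of missing values across the three variables
      misstable summarize growth growth_positive growth_negative
      ```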

      • #4
        You're losing me here. I don't know what model you are talking about. If you show the actual exact code you are running and the actual exact output that Stata is giving you, it will be easier to give advice. When responding, please put the code and output inside code delimiters so they will align properly and be easy to read. If you are not familiar with using code delimiters, read forum FAQ #12 for instructions.

        • #5
          Dear Prof. Clyde,

          Thank you very much for your reply.

          The model that I am talking about is the generalized DID model as below:
          Code:
          xtreg performance i.Event##(c.centered_Age c.centered_SIZE c.growth_positive c.growth_negative) i.Year, fe cluster(COMPANY)
          First, I centered the variables.
          Second, I used the code you mentioned above to decompose the growth variable into positive and negative values.

          Code:
           mkspline growth_positive 0 growth_negative = centered_Growth
          Thirdly, I have got the following results:
          Code:
          -----------------------------------------------------------------------------------------
                                  |               Robust
                      performance |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ------------------------+----------------------------------------------------------------
                          1.Event |   .0013532   .0156817     0.09   0.933    -.0335879    .0362942
                     centered_Age |  -.9940258   .2249671    -4.42   0.001    -1.495284   -.4927678
                    centered_SIZE |  -.0279756   .1013209    -0.28   0.788    -.2537327    .1977815
                  growth_positive |   .1230705   .0754792     1.63   0.134    -.0451078    .2912487
                  growth_negative |  -.0231748   .0113326    -2.04   0.068    -.0484254    .0020758
                                  |
             Event#c.centered_Age |
                               1  |  -.0797421   .0460939    -1.73   0.114    -.1824458    .0229616
                                  |
            Event#c.centered_SIZE |
                               1  |  -.0573542    .057244    -1.00   0.340    -.1849017    .0701933
                                  |
          Event#c.growth_positive |
                               1  |   .0064531   .0619437     0.10   0.919    -.1315661    .1444723
                                  |
          Event#c.growth_negative |
                               1  |   .0285022   .0088922     3.21   0.009     .0086891    .0483153
                                  |
                             Year |
                            2010  |  -.0148423   .0294349    -0.50   0.625    -.0804273    .0507426
                            2011  |    .014526   .0401351     0.36   0.725    -.0749005    .1039526
                            2012  |   .0248569   .0333481     0.75   0.473    -.0494474    .0991612
                            2013  |   .0479541   .0319983     1.50   0.165    -.0233427    .1192508
                            2014  |   .0544079   .0300797     1.81   0.101    -.0126138    .1214296
                            2015  |   .0692569   .0391678     1.77   0.107    -.0180144    .1565281
                            2016  |   .0866501   .0474384     1.83   0.098    -.0190492    .1923495
                            2017  |   .0923255   .0517074     1.79   0.104    -.0228856    .2075367
                            2018  |   .1456772    .060157     2.42   0.036     .0116391    .2797152
                                  |
                            _cons |   .0871435   .0385115     2.26   0.047     .0013347    .1729524
          ------------------------+----------------------------------------------------------------
                          sigma_u |   .2248403
                          sigma_e |  .03891901
                              rho |  .97090931   (fraction of variance due to u_i)
          -----------------------------------------------------------------------------------------
          So, my main question is:
          Is it fine to decompose the growth variable even though there is an interaction in the model?

          Thank you very much in advance.

          • #6
            Yes, it is fine. The question is whether it is necessary and meaningful; those questions are based on your interpretation of this output, whether the differences in the growth_positive and growth_negative terms (and their interactions with Event) are large enough to matter from a practical perspective.
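
            To judge how large the difference between the two slopes is, one option is to estimate the difference directly with -lincom-, which reports it with a confidence interval. The sketch below uses simulated data and plain -regress- rather than the panel model in #5; the variable names, sample size, and true slopes (0.5 below zero, 0.1 above) are all assumptions for illustration. Both pieces keep the sign of growth here, so each slope is per unit increase in growth.

            ```stata
            * Simulated data (hypothetical): different slopes below and above zero
            clear
            set seed 12345
            set obs 200
            generate growth = rnormal(0, 0.3)
            generate growth_neg = min(growth, 0)
            generate growth_pos = max(growth, 0)
            generate y = 0.5*growth_neg + 0.1*growth_pos + rnormal(0, 0.05)

            regress y growth_neg growth_pos

            * Difference between the two slopes, with its confidence interval
            lincom growth_pos - growth_neg
            ```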

            • #7
              Dear Prof. Clyde,

              I can't thank you enough for your great efforts. Greatly appreciated.

              My deepest respect to you.

              • #8
                Dear Prof. Clyde,
                I am so sorry for asking again, but I am a bit confused about this matter.
                Please see the following results of the generalized DID model.
                Code:
                --------------------------------------------------------------------------------
                               |               Robust
                   performance |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                ---------------+----------------------------------------------------------------
                       1.Event |   .9102377   .3949668     2.30   0.055    -.0237104    1.844186
                           Age |  -1.248587   .2453343    -5.09   0.001    -1.828711   -.6684635
                          size |   .1273543   .0415022     3.07   0.018     .0292172    .2254915
                        growth |  -.0497433   .0701998    -0.71   0.501    -.2157395    .1162528
                               |
                   Event#c.Age |
                            1  |    -.05529    .050027    -1.11   0.306    -.1735851    .0630051
                               |
                  Event#c.size |
                            1  |  -.1086234   .0452957    -2.40   0.048    -.2157308    -.001516
                               |
                Event#c.growth |
                            1  |  -.1197868   .0581045    -2.06   0.078    -.2571822    .0176086
                               |
                          Year |
                         2010  |  -.0267569    .026015    -1.03   0.338    -.0882725    .0347588
                         2011  |   .0091717    .030344     0.30   0.771    -.0625804    .0809238
                         2012  |   .0282719   .0247852     1.14   0.292    -.0303357    .0868795
                         2013  |   .0634201   .0306809     2.07   0.078    -.0091285    .1359688
                         2014  |   .0683174   .0228519     2.99   0.020     .0142813    .1223535
                         2015  |   .0713287   .0368078     1.94   0.094     -.015708    .1583654
                         2016  |   .1047946   .0440577     2.38   0.049     .0006146    .2089746
                         2017  |   .1126851   .0417439     2.70   0.031     .0139764    .2113937
                         2018  |   .1473622    .055554     2.65   0.033     .0159979    .2787265
                               |
                         _cons |   2.133815   .6458033     3.30   0.013     .6067325    3.660897
                ---------------+----------------------------------------------------------------
                       sigma_u |  .33539484
                       sigma_e |  .03676929
                           rho |  .98812403   (fraction of variance due to u_i)
                --------------------------------------------------------------------------------
                As you can see, the size coefficient (before the event) is positive and the Event#c.size coefficient is negative. I know that the negative coefficient for Event#c.size says that once the event takes effect, the slope of the performance-size relationship decreases. That may mean that it is less positive than before the event kicks in, or it might even go all the way to a negative value.

                But when I turned to the literature, similar studies found the same thing (i.e. positive before the event and negative after the event). One school of the literature states that a decrease in size after the adoption causes an increase in performance, while a second school states that an increase in size after the adoption causes a decrease in performance.

                I am a bit confused because, as far as I know, the effect is on the slope of the relationship, not on the variables. But how do I decide the effect on the variables?
                Do you think I should run an independent-samples t-test for size before and after, and for performance before and after, and compare the mean values before and after the event? I did that, and I found that the mean values of both size and performance are lower after the event, where the average values decreased significantly for the former, and slightly for the latter. However, there is no significant difference for the size.
                What do you think? I am a bit confused because I need to find out which variable increases and which one decreases after the event.

                Devoting some of your valuable time to answering my question is highly appreciated.

                Many thanks in advance.

                • #9
                  In your model the effect of size before the event is about +0.13, and after the event it is about 0.13 - 0.11, or about 0.02. So it is positive in both conditions, but smaller after the event. To get these out of Stata, after your regression, use the -margins- command:

                  Code:
                  margins Event, dydx(size)
                  If you are specifically interested in the difference in effect between before and after the event, then that you get directly from your regression output as the coefficient of 1.Event#c.size = -0.11 (to 2 decimal places) with a 95% CI of -0.22 to -0.00 (again to 2 decimal places).
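
                  The link between -margins- and the regression coefficients can be seen in a small simulation. This is only a sketch: it uses plain -regress- on made-up data rather than the fixed-effects model in #8, and the data-generating values are hypothetical. The slope of size when Event = 1 reported by -margins- should equal the sum of the size coefficient and the interaction coefficient.

                  ```stata
                  * Simulated data (hypothetical): slope of size shrinks after the event
                  clear
                  set seed 2024
                  set obs 300
                  generate byte Event = runiform() < 0.5
                  generate size = rnormal(10, 2)
                  generate performance = 1 + 0.9*Event + 0.13*size - 0.11*Event*size + rnormal(0, 0.5)

                  regress performance i.Event##c.size

                  * Slope of size when Event == 0 and when Event == 1
                  margins Event, dydx(size)

                  * The Event == 1 slope rebuilt by hand from the coefficients
                  lincom size + 1.Event#c.size
                  display _b[size] + _b[1.Event#c.size]
                  ```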

                  I think that statistics showing the distribution of size before and after the event are interesting background that your audience would likely want to know, but they do not shed light on the effect of size on performance in either condition, nor on whether that effect changed after the event.

                  • #10
                    Dear Prof. Clyde,

                    Thank you very much for your reply. Greatly appreciated.

                    (1) That’s very interesting. What exactly is -margins- telling me?

                    (2) Given the -margins- results below, if the effect of size on performance is positive, what does that mean? Does it mean an increase in size causes an increase in performance?

                    Code:
                    ------------------------------------------------------------------------------
                                 |            Delta-method
                                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    Age          |
                           Event |
                              0  |  -1.248587   .2453343    -5.09   0.000    -1.729434   -.7677406
                              1  |  -1.303877   .2888089    -4.51   0.000    -1.869932   -.7378219
                    -------------+----------------------------------------------------------------
                    size         |
                           Event |
                              0  |   .1273543   .0415022     3.07   0.002     .0460115    .2086972
                              1  |    .018731   .0054714     3.42   0.001     .0080072    .0294547
                    -------------+----------------------------------------------------------------
                    growth       |
                           Event |
                              0  |  -.0497433   .0701998    -0.71   0.479    -.1873324    .0878458
                              1  |  -.1695302    .049969    -3.39   0.001    -.2674675   -.0715928
                    ------------------------------------------------------------------------------

                    (3) Let’s look at Age: the Event#c.Age term is not significant in #8, but in the -margins- results the effect of Age after the event is significant! Is that OK, or should the results be consistent?

                    (4) Sure, but in statistics that showing the distribution of size before and after (the independent sample t-test), any differences in size distribution will be attributed to the EVENT if the difference is statistically significant, but what if there is a difference in average values and the difference is not statistically significant? Does that mean the difference in average values is attributed to something else?

                    I am so sorry for all my questions and for any inconvenience caused, but I am still confused: I want to know which variable increased and which variable decreased after the adoption (the dependent or the independent).

                    Million thanks in advance.

                    • #11
                      So looking at age, you have a negative effect both before and after the event. It means that, all else equal, increasing age is associated with decreasing performance, both before and after the event. (Don't speak of causation--this is observational data.) There is no inconsistency between the -margins- output and the output in #8. The regression coefficient of Age is equal to what you see for the marginal effect of Age when Event = 0, as it should be. The marginal effect of Age when Event = 1 does not correspond to anything that would appear in the regression output: it is the sum of the regression coefficients of Age and 1.Event#c.Age. And if you add the -1.248... and the -0.05529 from #8, you do indeed get the -1.303877 that you see in the -margins- output. Everything is fitting together just as it should.
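
                      The same addition can be checked for each predictor using the coefficients reported in #8 and compared with the Event = 1 rows of the -margins- table in #10. A short sketch; small differences in the last digit reflect rounding in the displayed coefficients.

                      ```stata
                      * Main coefficient + interaction coefficient, values copied from #8
                      * Age: compare with -1.303877 in #10
                      display %9.6f (-1.248587 + -.05529)
                      * size: compare with .018731 in #10
                      display %9.6f (.1273543 + -.1086234)
                      * growth: compare with -.1695302 in #10
                      display %9.6f (-.0497433 + -.1197868)
                      ```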

                      "(4) Sure, but in statistics that showing the distribution of size before and after (the independent sample t-test), any differences in size distribution will be attributed to the EVENT if the difference is statistically significant, but what if there is a difference in average values and the difference is not statistically significant? Does that mean the difference in average values is attributed to something else?"
                      None of this is correct. First of all you are interpreting statistical significance in a way that is completely wrong. (I won't here give you my long rant about why you shouldn't even be thinking about statistical significance in the first place.) Suffice it to say that differences in the distribution of size before and after the event do not necessarily get attributed to the event itself, and whether the differences are "statistically significant" or not makes no difference in this regard. The differences may well represent the playing out of trends that had nothing to do with the event. If most firms were growing over the entire period of your study, then you would see a pre-post event difference that has nothing to do with the event.
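
                      The point about trends can be illustrated with simulated data. In the sketch below (all values hypothetical), size simply grows every year and the event has no effect on it by construction, yet a before/after t-test still shows a clear difference in means.

                      ```stata
                      * Simulated data (hypothetical): years 2009 to 2018, event from 2014 on
                      clear
                      set seed 99
                      set obs 400
                      generate year = 2009 + mod(_n, 10)
                      generate byte Event = year >= 2014

                      * size follows a steady upward trend; Event plays no role in it
                      generate size = 5 + 0.2*(year - 2009) + rnormal(0, 0.5)

                      * The pre/post difference reflects the trend alone
                      ttest size, by(Event)
                      ```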

                      • #12
                        Dear Prof. Clyde,
                        Thank you very much for your answers.
                        Now, the picture is becoming clearer than before. I am so grateful to you.

                        So, to sum up:
                        1. The results of the coefficient for Age in regression output when event= 0 is before the event.
                        2. The results of the coefficient for Event#c.Age in regression output when event= 1 is the INCREMENTAL decrease after the event.
                        3. The result in margin output for Age when event = 1 is the effect of age when event =1.
                        4. The margin output table tells me the effect of the independent variables on the dependent variable before and after the event because the margin output shows me the coefficient values after the summation when event=1. So, it is better to look at the margin output if I am not interested in the differences in effect.
                        5. Most importantly, let’s assume that the dy/dx for Age when event = 0 is positive, which means increasing age is associated with an increased performance before the event. And, if the dy/dx for Age when event = 1 is negative, that means increasing age is associated with decreasing performance after the event OR we should say decreasing age is associated with increasing performance AFTER the event???
                        6. The independent sample t-test tells me the difference in distributions before and after the event and it is not necessary that the difference is attributed to the event.
                        If any of the points above are incorrect, please let me know, and I need your answer for point (5) as well, please.


                        Thank you very much.

                        • #13
                          All of the above points are correct.

                          Regarding 5., whether you say that increasing age is associated with decreasing performance after the event or that decreasing age is associated with increasing performance after the event is a matter of taste or emphasis. They mean exactly the same thing.

                          • #14
                            Dear Prof. Clyde,

                            I have learned from you so much, I don't know how to thank you. I will be grateful to you for the rest of my life.

                            If there is no statistical method to determine which variable has decreased or increased (attributed to the event), and the goal of the DID method is to examine the impact on the relationship, then I think I can take the observations for all companies before the event and run a multivariate regression, and then take the observations for all companies after the event and run another multivariate regression, without including the event variable in either model. The results should then be qualitatively similar.

                            Do you think this makes sense or not?

                            Thank you.

                            • #15
                              You have described this in fairly vague general terms, so I don't really know what you have in mind here. If you showed specific code it would be possible to comment.
