  • Interaction Using Single Hashtag (#)

    Dear Experts,

    My objective is to get two coefficients, with corresponding p-values, for the periods before and after a reform took place. My data cover 125 firms from 2005 to 2015, so it is panel data, though unbalanced. My dependent variable is RiskLevel (continuous) and the IVs are income (continuous) and size (dummy, 0/1). A fixed-effects model for the full sample could be:
    Code:
    xtreg RiskLevel income size, fe
    However, a reform took place in 2012, and I want to compare the pre-reform era (2005 to 2010) with the post-reform era (2011 onward). I therefore generated a dummy variable, Reform, that equals 0 before the reform and 1 after it. Since I want to compare the IVs across the two eras, can I interact them with Reform using a single hashtag, as in the following:
    Code:
    xtreg RiskLevel i.Reform#c.income i.Reform#i.size, fe
    With the single hashtag, I get two sets of values (coefficients, p-values, etc.), one for pre-reform and one for post-reform. My question is whether I can use '#' in my equation, or whether I should use '##' instead. Does a single hashtag represent the before-after comparison in this instance?

    I also tried using the equation like the following with ## instead:
    Code:
    xtreg RiskLevel i.Reform##c.income i.Reform##i.size, fe
    However, it returns 3 values for each of the IVs. For example, in the case of income, there is 1 value for Reform, 1 value for income (for the full sample), and another one that looks like a marginal effect (1.Reform#income). As my initial objective is to get values (coefficients, standard errors, t-values and p-values) for both pre- and post-reform (not the marginal effect only), what should I do now?

    Thanks in advance.

  • #2
    I don't fully understand your data. Let's go back to the start. You need a data set in which some entities underwent a reform in 2012, and others never underwent the reform. This data set must include a variable, treated, which is 1 for those that did undergo the reform, and 0 in the others. In addition, you need a variable which is 1 in all observations representing year 2012 and after, 0 before 2012 (regardless of whether the entity underwent the reform). Let's call that variable pre_post.

    Once you have that, you have the makings of a basic difference-in-differences analysis:
    Code:
    xtset entity
    xtreg RiskLevel i.treated##i.pre_post, fe
    The coefficient of 1.treated#1.pre_post will be the diff-in-diff estimator of the effect of the reform. You may also want to add a cluster-robust vce to the model, and there may be covariates worth including. Note that in this model, and the others below, Stata will omit the variable treated due to collinearity with the fixed effect. This is normal, and expected--don't give it a moment's thought. In fact, if it doesn't do that, then there is an error in your data that you must fix!
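
    In equation form, the basic model can be sketched as follows. The symbols $\alpha_i$ (entity fixed effect), $\beta$, $\delta$ and $\varepsilon_{it}$ are notation introduced here for illustration; they do not appear in the Stata output.

    ```latex
    \[
    \text{RiskLevel}_{it} = \alpha_i
      + \beta\,\text{pre\_post}_t
      + \delta\,(\text{treated}_i \times \text{pre\_post}_t)
      + \varepsilon_{it}
    \]
    ```

    Because $\text{treated}_i$ does not change over time within an entity, it is absorbed by $\alpha_i$, which is why Stata drops it. The coefficient $\delta$ corresponds to the row labeled 1.treated#1.pre_post.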

    If you are also interested in whether the size of the entity modifies the effect of the reform (i.e. the effect of the reform is different in small entities from its effect in large firms), then you can expand the model to include a three-way interaction:
    Code:
    xtreg RiskLevel i.treated##i.pre_post##i.size, fe
    However, that approach is difficult to interpret because you probably will want to see the effect itself in both small and large entities. So for this it is convenient to create a new variable to represent the treated#pre_post interaction itself and interact it with size:
    Code:
    gen effect = 1.treated#1.pre_post
    xtreg RiskLevel i.treated i.pre_post i.effect##i.size, fe
    margins size, dydx(effect) noestimcheck



    • #3
      Dear Clyde,

      Many thanks for your kind reply.

      No, I just want to determine the coefficients for income and size in terms of RiskLevel before and after the reform. All the firms were in a similar state before the reform, i.e. before 2012. Then a reform took place. Therefore, there was no variation within the firms before reform took place. I just want to get two coefficients for each of the IVs: one before the reform and one after. From different sources, I have learned that a single hashtag, as in the following, can give me two coefficients for before and after the reform from a single equation (Reform is an indicator variable: 0 for before 2012 and 1 for after 2012).
      Code:
      xtreg RiskLevel i.Reform#c.income i.Reform#i.size, fe
      I have also read that it is better to use ## than #, as in the following:
      Code:
      xtreg RiskLevel i.Reform##c.income i.Reform##i.size, fe
      However, it shows the marginal effect and the full-sample value of the variables. But I want two separate coefficients (original effect) for both pre- and post-reform. That's why I wanted to use #. But I do not know whether it is suitable in this instance. Please note, I have also run another full-sample model spanning 2005 to 2015. Now, I want to check the before and after effects (coefficients and p-values) for all the firms before 2012 and after 2012.

      Looking forward to your kind advice.

      Thanks.

      • #4
        I'm sorry. I'm having a hard time following you.

        "Therefore, there was no variation within the firms before reform took place."
        Then the coefficients for before reform are zero, you don't need any regression model to tell you that. But this also seems highly implausible in the first place. So I don't know what you're trying to say here.

        It sounds like your Reform variable is the same as the variable I was calling pre_post.

        In the model with only single hashtags, you are calculating two separate slopes, but they are not correct slopes. They are slopes fitted to a model which imposes the constraint that at the moment of the reform in 2012 there is no "jump" in Risk Level associated with the reform, that the only possible change is that the slope of risk level vs income or size changes. In calculus terms, you allow for the graph of risk level vs income to have a corner in 2012, but no discontinuity.

        In the model with double hashtags, you allow for the possibility not only that the slope changes, but that there is a discrete jump as well. In calculus terms, this model allows for a discontinuity as well as a change in direction.

        If there is good theoretical reason to believe that the reform in question induces no jump, only a change in direction of the risk:income and risk:size relationships, then the # model is valid. But that is a very strong assumption, and one that you should not make in the absence of strong theory or good prior evidence. Absent that justification, you should use the less constrained ## model.

        From the ## model it is easy enough to get the pre and post reform slopes. The pre-reform slope is the slope on the row labeled income (or 1.size). The post-reform slope is gotten by adding the slope for the interaction term to that. The simplest way to do that is to run
        Code:
        lincom income + 1.Reform#c.income
        lincom size + 1.Reform#1.size
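
        Written as equations, and keeping only the income term for brevity, the two specifications look roughly like this. The symbols $\alpha_i$ (firm fixed effect), $R_t$ (the Reform indicator) and $\varepsilon_{it}$ (error term) are notation introduced here for illustration.

        ```latex
        \begin{align*}
        \text{\# model:}\quad
          \text{RiskLevel}_{it} &= \alpha_i
            + \beta_0\,(1 - R_t)\,\text{income}_{it}
            + \beta_1\,R_t\,\text{income}_{it}
            + \varepsilon_{it} \\
        \text{\#\# model:}\quad
          \text{RiskLevel}_{it} &= \alpha_i
            + \gamma\,R_t
            + \beta\,\text{income}_{it}
            + \delta\,R_t\,\text{income}_{it}
            + \varepsilon_{it}
        \end{align*}
        ```

        In the ## model, $\gamma$ is the jump at the reform, $\beta$ is the pre-reform slope, and $\beta + \delta$ is the post-reform slope, which is the sum that -lincom- reports. The # model for income is the special case $\gamma = 0$.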


        • #5
          Dear Clyde,

          Wow! That sounds great! Now, I get to know that the slopes of variables (along with p-values and std errors) for the pre-reform term are autogenerated. Moreover, you have also given the command for finding coefficients after the reform. But how can I get corresponding p-values, t-values and std errors of those slopes for the post-reform era?

          Thanks for your valuable pieces of advice in this regard.

          • #6
            The standard errors, t-statistics, p-values, and confidence intervals are all part of the -lincom- output. The output of -lincom- is really just like a line in a Stata regression output table.

            • #7
              Dear Clyde,

              Again thanks for your to-the-point solution for the coefficients of pre- and post-reform. I have also tried with the -lincom- command that you wrote. For continuous variable (income), it works fine. However, for dummy variable (size), it was generating a message like "regressor size not found". After that, I have run the following equation:
              Code:
              lincom 1.size + 1.Reform#1.size
              And, it worked! Do you think that its correct to use '1.size' just after -lincom- while generating the coefficient for 'size' after the reform?

              On a different note, the coefficients, t-values, SE and p-values for both the continuous and indicator variables remain almost identical to the values that I get by using # and that I find by using ## followed by -lincom- to get pre- and post-reform coefficients. How do you think about this?

              Many thanks for your kind support.

              • #8
                "Do you think that its correct to use '1.size' just after -lincom- while generating the coefficient for 'size' after the reform?"
                I'm not sure what you're asking here. If you are wondering whether the syntax shown in your code block just above that is correct, yes, it is.

                "On a different note, the coefficients, t-values, SE and p-values for both the continuous and indicator variables remain almost identical to the values that I get by using # and that I find by using ## followed by -lincom- to get pre- and post-reform coefficients. How do you think about this?"
                Well this relates to the explanation I gave in #4 about the difference between the # and ## versions of the model. That the results come out nearly the same implies that the discontinuity ("jump") in outcome associated with Reform is probably quite close to zero.

                • #9
                  I am grateful for your kind explanations.

                  I have also tried a model without interacting Reform with the IVs, dividing the whole dataset into two sub-sets, from 2007 to 2011 and from 2013 to 2017, as the reform took place in 2012. For instance, I ran the following command for the pre-reform regime:
                  Code:
                  xtreg RiskLevel income size if Year<=2011, fe
                  And for post-reform regime:
                  Code:
                  xtreg RiskLevel income size if Year>=2013, fe
                  The results from these models and those from the # or ## models differ both in magnitude and in significance. Given this, I am a bit perplexed about whether I should use the interaction term in my model or not. Could you please explain briefly why this happens?

                  • #10
                    There are several things going on here. Some of them are peculiar to your particular approach, but most of them are generic.

                    One peculiar issue is that your separate analyses shown in #9 exclude year 2012, whereas the # and ## analyses include year 2012.

                    But let's turn to the more important generic issues.

                    First, comparing statistical significance from one model to another, even when the models are sufficiently comparable in other respects, is a hazardous adventure at best. In this instance, you are comparing a model based on a large combined sample with two models each based on samples about half that size. The sample size alone has a major impact on statistical significance, even if the findings of the models are otherwise the same. So looking at these things in terms of what is and is not statistically significant is useless in any case. (In fact, the American Statistical Association has recently issued a position paper recommending that the concept of statistical significance be abandoned. Read https://www.tandfonline.com/doi/full...5.2019.1583913 and the papers referenced therein if you have the time. If you have only a little time, a quick pep talk on the topic is available at https://www.nature.com/articles/d41586-019-00857-9.) But even if you still want to use the concept of statistical significance, this application of it is not valid anyway.

                    But the most important point is that you are comparing apples to oranges. The coefficient of income or size in one of the models in #9 is estimating a different parameter from the coefficient of income in the ## models, so there is no reason to expect them to be similar to each other. Remember, for example, that the marginal effect of income or size in the post-Reform era does not even appear in the output of the ## model. To find that, you have to use -lincom- (or -margins, dydx()-). So the coefficient of income in the model conditioned on -if Year >= 2012- would not be found anywhere in the output of the interaction model.

                    In addition, you are working with fixed-effects models, so that the impact of the fixed effects is different in the combined ## model and the separate pre and post Reform models. These further cause the estimates to diverge from each other. If you were to do these analyses with just -reg-, and include no covariates, you would find comparability between the separate model coefficients and the ## output (including the output from -lincom-).
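
                    The last point can be checked with a small sketch. It uses Stata's bundled auto dataset as a stand-in (price, mpg and foreign are that dataset's variables, not the poster's), with plain -regress- and no covariates beyond the interaction:

                    ```stata
                    * Stand-in data: Stata's bundled auto dataset, not the poster's panel
                    sysuse auto, clear

                    * Fully interacted OLS model: no fixed effects, no other covariates
                    regress price i.foreign##c.mpg

                    * The slope of mpg when foreign == 0 is the coefficient on mpg;
                    * the slope when foreign == 1 is the sum computed here
                    lincom mpg + 1.foreign#c.mpg

                    * Separate regressions by group, for comparison of the mpg slopes
                    regress price mpg if foreign == 0
                    regress price mpg if foreign == 1
                    ```

                    In this setting the two mpg slopes from the separate regressions should match the interaction-model slopes. The standard errors will generally still differ, because the combined model estimates a single residual variance for both groups.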

                    So for all of these reasons, your expectation that these results will be the same, or even close, is mistaken.

                    • #11
                      That was fabulous! A nice detailed one. Don't you think that it is more justifiable to incorporate reform (using interaction) in my case as a reform was there?

                      In addition to this, as you replied:
                      Remember, for example, that the marginal effect of income or size in the post-Reform era does not even appear in the output of the ## model
                      Then how could I find the marginal effect of post-reform era in the ## model using both -lincom- and -margins-? And again, if the marginal effect is not significant, what does it indicates?

                      • #12
                        Yes, in general, I incline towards using interaction models for this kind of thing.

                        "Then how could I find the marginal effect of post-reform era in the ## model using both -lincom- and -margins-?"
                        I think you misunderstood my point in what you quoted. The marginal effect in the post-Reform group does not appear directly in the regression output of the ## model. But -lincom-, as illustrated in #7, then gives it to you.
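
                        As a sketch of how the two commands line up, assuming the data are already -xtset- and using the variable names from earlier in this thread, something like the following should report the same post-Reform effects both ways:

                        ```stata
                        * Interaction model from earlier in the thread
                        xtreg RiskLevel i.Reform##c.income i.Reform##i.size, fe

                        * Post-Reform effects of income and size via -lincom-
                        lincom income + 1.Reform#c.income
                        lincom 1.size + 1.Reform#1.size

                        * The same quantities via -margins-: effects of income and size
                        * evaluated separately at Reform == 0 and at Reform == 1
                        margins Reform, dydx(income size)
                        ```

                        In the -margins- output, the rows for Reform = 0 should correspond to the coefficients on income and 1.size in the regression table, and the rows for Reform = 1 to the two -lincom- results. For the indicator size, dydx() reports the discrete change from size = 0 to size = 1.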

                        "And again, if the marginal effect is not significant, what does it indicates?"
                        Well, as I strongly endorse the new American Statistical Association position that the concept of statistical significance is misleading and should be abandoned, I am inclined to tell you that it doesn't mean anything and you should just stop thinking about it. But old habits die hard, and you are probably surrounded by people who haven't "gotten the memo" yet and will press you on the matter. So even if you want to take statistical significance seriously, all that a non-significant result really ever meant, before people started wildly misinterpreting it, is that if there were no effect at all of the Reform (which is in most situations completely implausible to start with), then the probability of a random sample of data producing results at least as extreme as the ones you got is at least 5%. So it says that the results you got are compatible with what you might get once in a while from a particular straw-man random number generator.

                        From a slightly more practical perspective, it would be reasonable to interpret a not statistically significant finding as meaning that the quantity and quality (i.e. noisiness) of the data are such that it is not possible to confidently determine the direction of the effect, nor exclude the possibility that there is no effect at all.

                        • #13
                          Many thanks, Clyde. Your replies in this regard will certainly help me a lot.
