  • Three-way Interactions

    I am working with the following panel data linear regression model: a = f(x, y, z) + other control variables. The variables x, y, z enter with a three-way interaction. Following the literature, I have also used both the level and the lagged values of these variables, so I have 2^3 = 8 three-way interaction terms. Should I run various nested models with the control terms and choose the model with the minimum AIC and BIC? Some of these interaction terms make no theoretical sense, and I fear that including them would introduce specification bias. Kindly advise.
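    For concreteness, the kind of nested-model comparison I have in mind might look like this in Stata; the variable names (y, x, z, w, id, year) are placeholders, not my actual data:

    * full model: ## expands to all main effects and lower-order interactions
    regress y c.x##c.z##c.w i.id i.year
    estat ic                    // AIC and BIC for the model just fit

    * restricted model dropping the three-way term
    * (fit both models on the same estimation sample for a fair comparison)
    regress y c.x##c.z c.x##c.w c.z##c.w i.id i.year
    estat ic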

    Thanks in advance.
    Last edited by Dhruv Gupta; 18 Jan 2021, 10:41.

  • #2
    General advice: variables (whether interaction terms or simple) that make no theoretical sense should not be included in the modeling. There is a rule, however, that if you include an interaction, you must also include all sub-interactions and the constituent variables. So, if a#b#c is in the model, you must also include a#b, a#c, b#c, a, b, and c. But, I would assume that if, for example, a#b doesn't make sense, then neither does a#b#c itself anyway.
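    As an illustration of that hierarchy rule, Stata's factor-variable notation builds in all of the lower-order terms automatically; the names a, b, c, and y here are placeholders:

    * c.a##c.b##c.c expands to a, b, c, a#b, a#c, b#c, and a#b#c,
    * so no constituent term can be omitted by accident
    regress y c.a##c.b##c.c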

    • #3
      Thank you, Clyde Schechter, for your reply and for clarifying my doubt.

      • #4
        Clyde Schechter,
        Do I understand correctly from this advice that I do not need to run nested models with 8 three-way interaction terms (plus their sub-interaction terms), then 7 three-way interaction terms, then 6, and so on?

        Thanks in advance again.

        • #5
          Running nested models is only necessary if your research goals include specifically identifying the incremental effects of the added terms.

          • #6
            Sir, thank you so much for your help and for clarifying my doubt.

            • #7
              Sir, in the context of the above-mentioned model a = f(x, y, z) + other control variables, where variables y and z are expected to moderate the relationship between x and a, I wish to use a positive change in x as a treatment, i.e. assigning x = 1 if there was an increase in x relative to the previous year, and x = 0 otherwise. I beg to know if this methodology preserves randomization and, if not, what precautions I should adopt. Or should I adopt a different methodology altogether?
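              For illustration, the treatment indicator could be constructed along these lines in Stata; the panel identifier (country) and time variable (year) are placeholders for my actual data:

              * declare the panel structure so the lag operator L. works
              xtset country year
              * treat = 1 if x rose relative to the previous year, 0 otherwise
              gen byte treat = (x > L.x) if !missing(x, L.x)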

              • #8
                "I beg to know if this methodology preserves randomization"
                I don't understand this. What randomization was done initially that might or might not be preserved? I think you need to give a fuller explanation of the design of your study. Just seeing a very generic equation a = f(x, y, z) + other control variables does not provide nearly enough information to even start this discussion.

                • #9
                  Sir, thanks for your reply. Due to ill health, I could not clarify earlier. I had used the propensity score matching technique to establish causality in a paper on civil conflict. One of the reviewers of my paper commented that I had used 'increase in military personnel' as a binary treatment variable (x), i.e. assigning x = 1 if there was an increase relative to the previous year, and x = 0 otherwise. According to him, this did not preserve randomization since the treatment was predefined. I checked some standard texts, and I now sense that the comment was not correct.

                  Apologies again for delaying the clarification.

                  • #10
                    Sorry to hear you were ill, but glad to know you have recovered.

                    Putting what you say in #9 together with what you say in #7, I take it that the variable x, increase in military personnel, was added to the model as a moderator (effect modifier) and that the original treatment was assigned randomly. In that case, you still have a randomized treatment assignment. However, this moderator is not itself randomly assigned. Therefore, although the estimates of the effect of the original randomized treatment within levels of x are good randomized estimates, estimates of the effect of x itself, whether marginally or within levels of the randomized treatment, would not be randomization-based estimates.

                    However, one thing still confuses me. If your treatment was randomized, what role did propensity score matching play here?

                    • #11
                      Sir,
                      Thank you for your wishes. I am working on predicting the onset of civil wars (binary dependent variable). It is modelled as a function of Rainfall, Economic Growth, Military Personnel, and other control variables. Here, the focal relationship is between Military Personnel and the onset of civil wars, with Rainfall and Economic Growth moderating it. An increase in the allocation of Military Personnel relative to the previous year is treated as a binary treatment variable, coded 1 for an increase in Military Personnel and 0 otherwise. To arrive at a propensity score, I ran a logistic regression of the treatment variable on all the covariates. In the dataset, data on Military Personnel were missing for some years. Against this background, the reviewer commented that the data collection process did not preserve randomization. On the other hand, I understand that, since I played no role in the data going missing, the missingness was still random; therefore, any dataset is random unless it was collected in a particular manner by design.
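                      The propensity-score step I describe might be sketched as follows; all variable names (onset, treat, rainfall, growth, and the controls x1 x2) are placeholders for my actual data:

                      * propensity score: logistic regression of the treatment on the covariates
                      logit treat rainfall growth x1 x2
                      predict pscore, pr
                      * one way to match on the propensity score and estimate the ATET
                      teffects psmatch (onset) (treat rainfall growth x1 x2), atet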


                      Thank you again.

                      • #12
                        Whoa! It is hard for me to believe that allocation of military personnel was randomly assigned in your study. It seems to me that you did not have a randomized study in the first place. So you used propensity score matching to try to reduce bias--which is one reasonable way of doing so.

                        The presence of missing data on your key variable may also produce bias in analyses using that variable. You do not state what you did to deal with that, so I can't comment on that.

                        I am left with puzzlement about what the reviewer said, because this study never relied on randomization to start with. I must be missing something.

                        • #13
                          Sir, to deal with the missing data I did not do anything extra; I simply used the propensity scores to match the data. In light of your observations, I suspect that the missing data might be the reason for the reviewer's comments.

                          • #14
                            Sir, I beg to ask another question, on a slightly different track. In the above estimation methodology, some of the interaction/main effects are positive and others are not. If I wish to separate out the overall effect and significance of a variable, say Rainfall, on the onset of incidents, how may I do that? The literature suggests using marginal effects and margins plots at the mean and at the mean +/- one standard deviation (or some other justifiable value), but I beg to know if there is more to it.

                            Thanking you in advance.
                            Last edited by Dhruv Gupta; 27 Feb 2021, 06:07.

                            • #15
                              In general terms, yes, you can use the -margins- command to calculate average or conditional marginal effects. But if you want more specific advice, you will need to post back with example data (use the -dataex- command to show your example data) and the exact code for your regression model. And be very clear about exactly what marginal effects you are interested in.
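                              As a rough illustration of what that might look like, assuming a logit model with the three-way interaction discussed above (all variable names here are placeholders, not your data):

                              logit onset c.military##c.rainfall##c.growth x1 x2
                              * average marginal effect of rainfall on the probability of onset
                              margins, dydx(rainfall)
                              * conditional effects at the mean of growth and at +/- one SD
                              quietly summarize growth
                              margins, dydx(rainfall) at(growth = (`=r(mean)-r(sd)' `=r(mean)' `=r(mean)+r(sd)'))
                              marginsplot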
