  • Stata command for three interactions variables.

    Dear Statalist members,

    Good day, everyone. I am using Stata 14.1. Here is a sample of my data. I am regressing a sustainability reporting score (y) on stakeholder characteristics. Each stakeholder group has three characteristics: pcomm, lcomm, and ucomm. The suffix comm stands for one of the stakeholder groups in my study. I have eight stakeholder groups, and I run the regression model separately for each of them.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float y byte(pcomm lcomm ucomm)
      3.70833 3 2 0
        3.875 3 2 1
      4.07521 2 2 0
     3.583333 0 0 0
     3.583333 2 0 0
     3.458333 2 1 0
    1.5833334 1 0 0
    1.5833334 1 0 0
    1.6458334 1 0 0
       1.9375 0 1 0
       2.0625 0 1 0
      3.04816 0 1 0
     3.291667 0 0 0
     3.458333 1 1 0
     3.753788 0 0 0
         2.25 0 0 0
    3.0492425 0 0 0
     3.257576 3 0 0
            3 1 0 0
    2.0416667 1 0 0
     5.017045 3 2 0
       2.1875 2 1 0
       2.1875 3 1 1
       2.1875 3 1 1
         3.25 1 0 0
     2.416667 1 0 0
    1.9791666 2 0 0
    2.2916667 2 1 0
    2.2916667 0 0 0
     2.666667 3 1 0
    1.7291666 2 1 0
    1.5833334 0 0 0
          1.5 0 0 0
    2.2083333 2 1 0
     2.708333 2 1 0
    3.2367425 2 1 0
        3.625 0 0 0
    3.1742425 3 1 0
     3.666667 1 1 0
     2.729167 0 0 0
     2.395833 0 0 0
       2.4375 0 0 0
     3.291667 1 1 1
       3.4375 1 1 0
     3.090909 1 0 0
     2.520833 0 1 0
    2.0416667 0 1 0
    1.9791666 0 1 0
    1.8541666 2 1 0
       2.0625 2 1 0
     2.278409 0 0 0
     2.583333 3 1 0
     2.479167 0 1 0
     2.916667 3 1 0
       2.8125 1 2 0
     2.973485 1 1 0
     3.132576 0 0 0
    2.0833333 3 1 0
     2.833333 3 1 0
    3.8733766 2 1 0
    1.5833334 0 1 0
    2.3541667 2 1 0
     2.479167 2 1 0
     3.541667 3 1 0
          3.5 3 1 0
     3.715909 2 1 1
    1.2291666 2 1 0
    1.2291666 2 1 0
    1.2708334 2 1 0
        3.375 3 2 1
     3.458333 2 1 0
     4.378247 3 1 1
     2.416667 3 1 1
    2.2916667 3 2 0
     2.666667 3 2 0
     3.257576 2 1 0
     2.666667 2 1 0
     2.416667 2 1 0
     2.458333 0 0 0
     3.208333 3 2 0
    2.8511906 3 1 0
     2.916667 1 1 0
     2.916667 1 1 0
    2.2916667 1 1 0
    3.6079545 2 1 0
       2.0625 2 0 0
      3.53355 1 1 0
        1.875 3 1 0
         2.75 3 1 0
     2.916667 2 1 0
    1.8238636 1 1 0
    2.0738637 1 1 0
    1.8238636 1 1 0
     2.458333 0 1 0
     2.458333 0 1 0
     3.104167 3 1 0
    3.5189395 1 1 0
    3.4564395 1 1 0
     3.874459 2 2 0
     3.604167 2 1 0
    end
    According to theory, y is also associated with cumulative stakeholder characteristics; that is, the theory proposes that having more than one characteristic can have an effect on y. At first I ran the following command:
    Code:
    xtreg y pcomm lcomm ucomm c.pcomm#c.lcomm#c.ucomm, fe vce(robust)
    But I could not find a way to interpret the result. Instead, I found many discussions referring to three-way factorial effects, using something like this command:
    Code:
    xtreg y pcomm##lcomm##ucomm, fe vce(robust)
    Now I am confused about which of the two commands is the right one to use to test the theory.

    I am also not sure whether the variables in the interaction should be factor variables or continuous variables. Thank you for your time.

  • #2
    Your first code, using # is mis-specified because you have a three level interaction term but fail to include the two-level sub-interactions. When you use ##, as in the second code, Stata automatically generates those, which is the right way to go.

    There is another difference between the two codes, and I cannot tell you which is correct in this respect: the first code specifies pcomm, lcomm, and ucomm as continuous variables, whereas the second code (implicitly) specifies them as discrete variables. In the example data, these variables look like they might be discrete, but I don't even know what they are. And since you took the trouble to add c. in the first code, it suggests to me that you may think of them as continuous. Anyway, if they are continuous, you need to put c. in front of each of those in the interaction. If they are discrete, leave it as is.
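
    If it helps, here is a minimal sketch of the two full-factorial versions. It assumes the data have already been declared as a panel with xtset; the example data do not show the panel variables, so firm_id and year below are hypothetical names.
    Code:
    * hypothetical panel identifiers; replace with your own
    * xtset firm_id year

    * continuous treatment: c. on every variable in the interaction
    xtreg y c.pcomm##c.lcomm##c.ucomm, fe vce(robust)

    * discrete treatment: ## without prefixes treats each variable as a factor
    xtreg y pcomm##lcomm##ucomm, fe vce(robust)
    Either command adds the main effects and all two-way interactions automatically, so they do not need to be listed separately.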

    • #3
      Dear Prof Clyde,

      Thanks for your reply.

      Your first code, using # is mis-specified because you have a three level interaction term but fail to include the two-level sub-interactions. When you use ##, as in the second code, Stata automatically generates those, which is the right way to go.
      What I am thinking is that I am only interested in whether having more than one stakeholder characteristic has an effect on y, and I do not intend to examine every possible two-way interaction. That is the reason I am using # instead of ##. Is this right? (Sorry to ask again.)

      There is another difference between the two codes, and I cannot tell you which is correct in this respect: the first code specifies pcomm, lcomm, and ucomm as continuous variables, whereas the second code (implicitly) specifies them as discrete variables. In the example data, these variables look like they might be discrete, but I don't even know what they are. And since you took the trouble to add c. in the first code, it suggests to me that you may think of them as continuous. Anyway, if they are continuous, you need to put c. in front of each of those in the interaction. If they are discrete, leave it as is.
      pcomm, lcomm, and ucomm are ordinal variables. For example, pcomm measures the power of the community stakeholder. pcomm ranges from 0 (no power) to 3 (high power). lcomm and ucomm each range from 0 to 2. I can also add them all up and create a new variable as the total of the community characteristics.

      • #4
        What I am thinking is that I am only interested in whether having more than one stakeholder characteristic has an effect on y, and I do not intend to examine every possible two-way interaction. That is the reason I am using # instead of ##. Is this right? (Sorry to ask again.)

        No, it's not right because without the two-way interactions in the model, the coefficient you get for the three-way interaction term is wrong. If you are not interested in the two-way interactions, you can ignore them when you read the results, but they have to be in the model for the results to be correct.
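
        To make that concrete, here is a rough sketch, assuming the variables are treated as continuous and the data are already xtset: fit the full model, then look only at the three-way term and at the marginal effects.
        Code:
        xtreg y c.pcomm##c.lcomm##c.ucomm, fe vce(robust)

        * test the three-way interaction term on its own
        testparm c.pcomm#c.lcomm#c.ucomm

        * marginal effect of pcomm at each combination of lcomm and ucomm
        margins, dydx(pcomm) at(lcomm=(0 1 2) ucomm=(0 1 2))
        The two-way terms stay in the model but can be skipped when reading the output; the margins results are usually easier to interpret than the raw interaction coefficients.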

        pcomm, lcomm, and ucomm are ordinal variables. For example, pcomm measures the power of the community stakeholder. pcomm ranges from 0 (no power) to 3 (high power). lcomm and ucomm each range from 0 to 2. I can also add them all up and create a new variable as the total of the community characteristics.
        Well, ordinal variables are an in-between case: sometimes it is best to treat them as discrete, and sometimes it is best to treat them as continuous. The fact that you are comfortable with adding them, however, tells me that in your context they are suitable for treatment as continuous variables.
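
        If you do go the route of a single total, a minimal sketch might look like the following. Here tcomm is a hypothetical variable name, and the data are assumed to be xtset already. Keep in mind that a total forces each characteristic to contribute equally, which the interaction model does not.
        Code:
        * hypothetical total of the three community characteristics
        generate tcomm = pcomm + lcomm + ucomm
        xtreg y c.tcomm, fe vce(robust)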
