Interpretation of Dummy variables and their interactions with Continuous variables (Urgent help required)

Michael Bond

Join Date: Nov 2015
Posts: 45

28 Sep 2017, 13:55

Dear All,

I have a question about dummy variables and their interactions. Age is a continuous variable (the variable of interest), while Plan 1, Plan 2 and Plan 3 are dummy variables (the reference category is no plan). The interaction terms are Plan 1*Age, Plan 2*Age, and Plan 3*Age.

Code:

Dependent variable: log of expense

                     Model 1      Model 2      Model 3
Age                  0.1231***    0.123        0.105**
Plan 1 (dummy)       0.923**      0.837**      0.637**
Plan 2 (dummy)       1.388***     1.032**      0.932**
Plan 3 (dummy)       2.622**      2.905***     2.123***
Plan 1*Age                        0.11**       0.11**
Plan 2*Age                        0.228*       0.198*
Plan 3*Age                        0.232**      0.201**
Control variables

As you can see, in Model 2 age becomes insignificant compared with Model 1. I know they are two different models, since Model 2 includes interaction terms. Now, can we say age affects expense?

Can we still say in Model 2 (even with the inclusion of interaction terms) that the presence of plans leads to 0.837% higher expense than firms with no plan? The higher the plans the greater the expense.

In the third model age is significant again. Please help me, as I cannot find any reference that explains the interpretation with three dummy variables as well as interaction terms. Please assist me with this, as I have a thesis due tomorrow.

Regards,
Michael

Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#2

28 Sep 2017, 14:03

Showing output without the commands that generated it, and showing output that has been laundered through a pretty-print program and has lost many details is not helpful. Please repost showing the actual commands for each model, and the actual output of the regression itself. To be sure you do not inadvertently omit or modify any details (all details are important) do this by copy/pasting from your log file or the Results window--do not retype, and do not edit. Place the whole thing between code delimiters. (If you are not familiar with code delimiters, read FAQ #12.)
Michael Bond

Join Date: Nov 2015

Posts: 45
#3

28 Sep 2017, 14:13

For Model 1

Code:

regress c age plan1 plan2 plan3 tenure firmsize

For Model 2

Code:

regress c age plan1 plan2 plan3 i.plan1#c.age i.plan2#c.age i.plan3#c.age tenure firmsize

For Model 3

Code:

regress c age plan1 plan2 plan3 i.plan1#c.age i.plan2#c.age i.plan3#c.age tenure firmsize boardsize

In Model 3 there is an additional control variable. I did add code delimiters, and the results are the same as those from Stata. Please assist.
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#4

28 Sep 2017, 14:14

You did not show the regression results as requested. There is more information in them besides the coefficients and the "significance stars." Some of that information may be relevant here.
Michael Bond

Join Date: Nov 2015

Posts: 45
#5

28 Sep 2017, 14:16

I cannot retrieve those results now, as I am at home and Stata works only in the library. But I did add stars for significance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#6

28 Sep 2017, 14:36

Well, in the absence of complete information, this advice may prove incorrect, but here are some general thoughts.

As you can see, in Model 2 age becomes insignificant compared with Model 1. I know they are two different models, since Model 2 includes interaction terms. Now, can we say age affects expense?

In the interaction models, the coefficient of age does not mean the same thing as it means in Model 1. In Model 1, the coefficient is an average effect of age that applies in all observations, regardless of the values of the plan* variables. By contrast, in Models 2 and 3, the coefficient of age is a specific effect of age that applies only to observations where plan1, plan2, and plan3 are all zero. So there is no reason to expect them to be similar or have similar significance levels. They can have opposite signs; really, any kind of difference is to be expected.

As for assessing "whether age has an effect" in Models 2 and 3, the literal answer is, without even looking at the coefficients, no. In those models age has 8 different effects, one for each of the 8 possible combinations of Plan 1 (1 or 0), Plan 2 (1 or 0) and Plan 3 (1 or 0). There is an average effect of age, which is easily calculated with the -margins- command (though you would have to re-run the regressions using factor variable notation properly first). If you just want a null hypothesis significance test (ugh!), the effects of age are split up over age, age#plan1, age#plan2, and age#plan3, so you have to test them jointly:

Code:

test age 1.plan1#c.age 1.plan2#c.age 1.plan3#c.age
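A minimal sketch of what that re-run might look like, assuming the variable names from #3 and that plan1, plan2 and plan3 are 0/1 indicators. It has not been run against the poster's data, so treat it as an illustration rather than the definitive specification:

```stata
* Sketch only: variable names come from #3; plan1-plan3 are assumed to be 0/1
* each c.age##i.planK term expands to age, the plan indicator, and their interaction
regress c c.age##i.plan1 c.age##i.plan2 c.age##i.plan3 tenure firmsize

* average marginal effect of age on ln expense over the estimation sample
margins, dydx(age)

* marginal effect of age for a firm with plan 3 only
margins, dydx(age) at(plan1=0 plan2=0 plan3=1)
```

Written this way, -margins- knows that the plan variables are categorical and that age also enters through the interactions, so it can combine the relevant coefficients and report standard errors for them.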

Can we still say in Model 2 (even with the inclusion of interaction terms) that the presence of plans leads to 0.837% higher expense than firms with no plan?

No. That is only true when age = 0, and only for plan 1. When plan1 is 1, at any other age, the expected difference in ln expense, compared to its value at the same age when plan1 is 0, is 0.837 + 0.11*age. This in turn means that expense itself is multiplied by a factor of exp(0.837+0.11*age). You could then subtract 1 from that and multiply by 100 to get the percentage difference in expense, at that particular age, when plan1 is 1 instead of 0. In any case, it differs by age. In fact, the age component of this quickly dominates the calculation compared to the 0.837 term. Note, by the way, that here you should not rely on the rule-of-thumb that you multiply the coefficient by 100 to get the percentage change. That rule-of-thumb relies on a Taylor series approximation for logarithm being truncated at the linear term. That approximation gets bad quickly as the coefficient grows. Here, when you plug in any appreciable value of age you will have 0.837+0.11*age as a large number way beyond the range of validity of that approximation. So you actually need to do this exponential calculation.
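The calculation just described can be sketched with -display-, using only the Model 2 coefficients from #1. The ages in the list are arbitrary illustrative values, not taken from the data:

```stata
* Plan 1 versus no plan in Model 2, at a few illustrative ages (assumed values)
foreach a of numlist 0 1 5 10 {
    * difference in ln expense at this age
    local d = 0.837 + 0.11*`a'
    display "age `a': ln difference = " %5.3f (`d') ///
        "  exact percent = " %9.1f (100*(exp(`d')-1)) ///
        "  100*coefficient rule = " %6.1f (100*`d')
}
```

Even at age 0 the exact figure is about 131 percent against 83.7 from the rule of thumb, and the gap widens as age increases. Because of the `///` line continuations, this needs to be run from a do-file rather than typed interactively.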

The higher the plans the greater the expense.

What does this even mean? Each plan variable is either 0 or 1, right?

Again, I want to emphasize that there may be additional considerations based on other aspects of the regression output that would modify these interpretations.
Michael Bond

Join Date: Nov 2015

Posts: 45
#7

28 Sep 2017, 15:48

The higher the plans the greater the expense.

Well, I meant: looking at Model 2, can I say, by comparing plan1, plan2 and plan3, that the coefficients are ordered plan 3 > plan 2 > plan 1, relative to the base category? If I cannot interpret the coefficients, at least I want to say that a greater number of plans leads to greater expense. This is if I am just interested in presenting the second model.

An interesting question is: if firms use no plans (reference category), can we say the effect of age on compensation is exp(0.1231)-1, or do we look at the coefficients of the controls and the constant as well?

If I do present the first model:

In Model 1, at least, can I say that firms which grant one plan have exp(0.923)-1 = 151% higher expense than firms which give no plan, since there are no interaction terms?
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#8

28 Sep 2017, 16:23

Well, I meant: looking at Model 2, can I say, by comparing plan1, plan2 and plan3, that the coefficients are ordered plan 3 > plan 2 > plan 1, relative to the base category? If I cannot interpret the coefficients, at least I want to say that a greater number of plans leads to greater expense.

Well, this conclusion appears correct, but not for the reason you give. It is critically important here that the interaction terms also order themselves with plan3#age > plan2#age > plan1#age. If that were not true, then which plan led to the greatest expense would vary with age.
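To see the ordering concretely, here is a small sketch that evaluates each plan's difference in ln expense versus no plan at several ages. It uses the Model 2 coefficients from #1; the ages are assumed values chosen only for illustration:

```stata
* Model 2: difference in ln expense versus no plan = plan coefficient + interaction*age
foreach a of numlist 0 10 20 40 {
    display "age `a':  plan1 = " %6.3f (0.837 + 0.110*`a') ///
        "   plan2 = " %6.3f (1.032 + 0.228*`a') ///
        "   plan3 = " %6.3f (2.905 + 0.232*`a')
}
```

Because both the plan coefficients and the interaction coefficients increase from plan 1 to plan 3, the three lines never cross for non-negative ages, so the ranking is the same at every age.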

An interesting question is: if firms use no plans (reference category), can we say the effect of age on compensation is exp(0.1231)-1, or do we look at the coefficients of the controls and the constant as well?

No reason to look at those other coefficients. The entire effect of age is captured by the coefficient of age and the coefficients of any interaction terms of which age is a constituent. Nothing else matters.

In Model 1, at least, can I say that firms which grant one plan have exp(0.923)-1 = 151% higher expense than firms which give no plan, since there are no interaction terms?

Yes. Well, my calculation says 152%, but the formula is the correct one.
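The figure can be reproduced with a one-line sketch, using the Model 1 coefficient for Plan 1 from #1:

```stata
* percentage difference in expense implied by the Model 1 plan1 coefficient
display %6.2f (100*(exp(0.923)-1))
```

This gives roughly 151.7, so 151% is the truncated value and 152% the rounded one.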
Michael Bond

Join Date: Nov 2015

Posts: 45
#9

28 Sep 2017, 18:06

Many thanks for getting back. You have been a savior.

Well, this conclusion appears correct, but not for the reason you give. It is critically important here that the interaction terms also order themselves with plan3#age > plan2#age > plan1#age. If that were not true, then which plan led to the greatest expense would vary with age.

So you are saying that I can say more plans lead to greater expense not just because of the plan dummies but also because of the interactions. Then what is the role of the dummies (1, 2, 3)? When interaction terms are added, the dummies by themselves do not tell us much, right?

But am I right in saying that, for firms which use no plans, the effect of age is captured through exp(0.1231)-1? Is that the right statement? Basically, we could say that, unlike Model 1, Model 2 tells us the effect of age where plan 1, plan 2 and plan 3 = 0, or in other words where firms do not choose any plans.

At least in Model 1, I can say that the greater the plans, the greater the expense, since there are no interaction terms.
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#10

28 Sep 2017, 18:19

So you are saying that I can say more plans lead to greater expense not just because of the plan dummies but also because of the interactions. Then what is the role of the dummies (1, 2, 3)? When interaction terms are added, the dummies by themselves do not tell us much, right?

The plan coefficients by themselves give only the effect of that plan vs no plan when age = 0. Whether you consider that much or not is up to you.

But am I right in saying that, for firms which use no plans, the effect of age is captured through exp(0.1231)-1? Is that the right statement?

exp(0.1231)-1 is the semi-elasticity in model 1. The effect is 0.1231 itself, on log expenses. exp(0.1231)-1 is the proportional increase in expenses itself. But it is not called an effect, it is a semi-elasticity.

Basically, we could say that, unlike Model 1, Model 2 tells us the effect of age where plan 1, plan 2 and plan 3 = 0, or in other words where firms do not choose any plans.

Well that's what the coefficient of age by itself gives you. The model also gives you the effects of age in the presence of each plan or any combination of the plans, but not shown in the regression output--you would have to calculate them from the coefficients appropriately. That is best done with the -margins- command to avoid making mistakes (and so you would also get standard errors), but for that you would have to re-do your regressions using factor-variable notation correctly. So perhaps you have no desire to go there.

At least in Model 1, I can say that the greater the plans, the greater the expense, since there are no interaction terms.

Since there are no interaction terms and since the coefficients are ordered plan 3 > plan 2 > plan 1.
Michael Bond

Join Date: Nov 2015

Posts: 45
#11

28 Sep 2017, 18:22

Lastly, for dummy plan 3, can I say that a 1-unit increase in age leads to an increase of exponential^ (.232+2.905)-1 *100 in total expense? However, could I also say that in plan 1 a 10 percentage point increase in age leads to a (.232+2.905)*10 increase in total expense? This is the last thing; no further questions.
Michael Bond

Join Date: Nov 2015

Posts: 45
#12

28 Sep 2017, 18:38

Class answers. I have one more question, although I wrote my last question before even getting an answer; it is a simple and easy one.

The plan coefficients by themselves give only the effect of that plan vs no plan when age = 0. Whether you consider that much or not is up to you.

In the second model, thus, we cannot claim that firms which give 1 plan have higher expense by (exp(0.837)-1)*100 than firms which offer no plans. That is a wrong interpretation because of the interaction terms; it holds only when age is zero (which is impossible).

However, why can we claim in Model 1 that firms which grant one plan have 152% greater expense than firms which give no plans? Does that apply only to the case when age = 0, or is it that, because we have no interaction terms, we do not need age = 0 here? That's it, Sir.
Michael Bond

Join Date: Nov 2015

Posts: 45
#13

28 Sep 2017, 18:53

Please assist me with the last question, as I have to submit this in the morning. I won't ask further questions.
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#14

28 Sep 2017, 19:30

In the second model, thus, we cannot claim that firms which give 1 plan have higher expense by (exp(0.837)-1)*100 than firms which offer no plans. That is a wrong interpretation because of the interaction terms; it holds only when age is zero (which is impossible).

However, why can we claim in Model 1 that firms which grant one plan have 152% greater expense than firms which give no plans? Does that apply only to the case when age = 0, or is it that, because we have no interaction terms, we do not need age = 0 here?

It's the latter. Because Model 1 has no interaction terms, the Plan 1 coefficient gives an effect which applies regardless of age.

Lastly, for dummy plan 3, can I say that a 1-unit increase in age leads to an increase of exponential^ (.232+2.905)-1 *100 in total expense? However, could I also say that in plan 1 a 10 percentage point increase in age leads to a (.232+2.905)*10 increase in total expense? This is the last thing; no further questions.

No, both of these are incorrect. You need to work through the algebra here:

Code:

At age A in model 2, the predicted log expense with Plan 3 is:
    ln expense = 2.905 + 0.232*A + other terms
At age A+1 it becomes
    ln expense = 2.905 + 0.232*(A+1) + the same other terms
Subtracting:
    difference in ln expense = 0.232*(A+1-A) = 0.232*1 = 0.232
Therefore:
    expense is increased by a factor of exp(0.232) = 1.26 (to two places)
    which corresponds to a 26% increase.

The expression exponential^ (.232+2.905)-1 *100 that you proposed would actually be the percentage difference in expenses between Plan 3 at age 1 and no plan at age 0.
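The arithmetic in the derivation can be checked with a short sketch that uses only the Model 2 coefficients from #1. One assumption here goes beyond the post: the age main effect (0.123) sits among the "other terms" and would also change when age goes from A to A+1, so the last line adds it in for comparison.

```stata
* factor and percentage change per extra year from the Plan 3 interaction alone
display %5.2f exp(0.232)
display %5.1f (100*(exp(0.232)-1))

* assumption: also counting the age main effect, 0.123 + 0.232
display %5.1f (100*(exp(0.123+0.232)-1))
```

The first two lines reproduce the factor of 1.26 and the roughly 26% increase; with the main effect included the increase would be roughly 43%.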
Michael Bond

Join Date: Nov 2015

Posts: 45
#15

28 Sep 2017, 19:51

Thanks a lot. I think we can also say this about the plan 3 relationship (plan3*age + age): since age is insignificant, we can say a 10% increase in age leads to a 2.32% increase in expense within plan 3. No questions for now. Thanks a lot.