square independent variable in fixed effect regression

Faisal Abdullah

Join Date: Mar 2015

Posts: 44
#1

04 Nov 2015, 07:00

Hello everyone,

I am using a squared variable as an independent variable in my regression. My model is as follows:

Leverage = cash^2 etc.

When I add the square of cash, do I also have to include both Cash and Cash^2, or is it fine to include only Cash^2? I ask because when I add both of them, the significance of the results disappears.

Many thanks and regards for your suggestions.

Rich Goldstein

Join Date: Mar 2014

Posts: 4455
#2

04 Nov 2015, 07:06

How are you going to interpret the result unless you include "Cash"? You might want to read Nelder, J. A. (1998), "The selection of terms in response-surface models - how strong is the weak heredity principle?" The American Statistician, 52(4): 315-318.
Comment
Faisal Abdullah

Join Date: Mar 2015

Posts: 44
#3

04 Nov 2015, 08:21

Hi Rich, Thank you for your comment.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#4

04 Nov 2015, 08:25

Also, if you run a model including cash and leverage = cash^2, you should not judge the effect of cash by looking separately at the coefficients of cash and leverage; you should look at their joint significance with

Code:

test cash leverage

It is possible that the joint test will be significant even though neither cash nor leverage by itself tests as statistically significant.
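Since the thread is about fixed-effects regression, here is a minimal sketch of the same joint test after -xtreg, fe-. It uses Stata's nlswork example dataset (downloaded by -webuse-) rather than the original poster's data, so ln_wage and age are only stand-ins, assumed for illustration, for the outcome and for cash.

```stata
// Sketch only: nlswork stands in for the poster's panel data;
// ln_wage and age play the roles of the outcome and of cash (assumed names)
webuse nlswork, clear
xtset idcode year

// Fixed-effects model with the linear term and its square,
// entered with factor-variable notation
xtreg ln_wage c.age##c.age, fe

// Joint test of the linear and quadratic terms
test age c.age#c.age
```

Entering the square as c.age#c.age, rather than as a separately generated variable, also lets later commands such as -margins- know that the two terms belong to the same variable.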

The other thing to bear in mind is that centering matters in this situation. If you re-center cash, the coefficient of cash and of the constant term (though not of leverage) will change. With different choices of centering, you can get the coefficient of cash to be almost anything you want. So the significance of the cash coefficient in this kind of model is not meaningful. Here's an example:

Code:

sysuse auto, clear

regress price c.mpg##c.mpg // REGRESS PRICE ON mpg AND mpg^2
// NOTE LARGE, SIGNIFICANT NEGATIVE COEFFICIENT OF MPG
// APPROPRIATE TEST IS JOINT TEST OF MPG AND ITS SQUARE
test mpg c.mpg#c.mpg

// NOW DO IT WITH MPG CENTERED AT 30
gen mpg_c = mpg - 30
regress price c.mpg_c##c.mpg_c
// NOTE THAT COEF OF MPG IS NOW A SMALL POSITIVE
// NUMBER AND IS NOT SIGNIFICANT
// BUT THE JOINT TEST IS UNCHANGED
test mpg_c c.mpg_c#c.mpg_c
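The algebra behind the centering example can be written out in a few lines. The notation is assumed here for illustration: x is the regressor (mpg or cash), k is the centering constant, z = x - k is the centered variable, and the betas are the coefficients of the uncentered model.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon, \qquad z = x - k \\
  &= \beta_0 + \beta_1 (z + k) + \beta_2 (z + k)^2 + \varepsilon \\
  &= \underbrace{(\beta_0 + \beta_1 k + \beta_2 k^2)}_{\text{new constant}}
   + \underbrace{(\beta_1 + 2\beta_2 k)}_{\text{new linear coefficient}}\, z
   + \beta_2 z^2 + \varepsilon
\end{align*}
```

The quadratic coefficient is untouched, while the linear coefficient becomes beta1 + 2*beta2*k, which is the slope of the fitted curve at x = k. Choosing k = -beta1/(2*beta2) makes it exactly zero.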
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#5

04 Nov 2015, 10:33

Faisal:
posting what you typed and what Stata gave you back would let others on the list see what you mean by

...the significance of the results disappear...

(by the way, that finding may be as meaningful as a significant one).

Kind regards,
Carlo
(Stata 19.0)
Comment
wanhaiyou

Join Date: May 2014

Posts: 130
#6

04 Nov 2015, 18:44

Hi Clyde,
I have also met a problem like this. Thanks for your valuable suggestions. If the joint test is significant, how can we interpret this result?
Can we say this result indicates that cash has a significant role in the dependent variable, or something else?
For example, if the coefficient of cash is significant and the coefficient of cash^2 (which is negative) is not significant, but the joint test is significant, how can we interpret the result? (As far as I know, if the coefficients of cash and cash^2 are both significant and the coefficient of cash^2 is negative, we can say there is an inverted-U relationship between cash and the dependent variable.)
Thanks for your clarification!

Best regards,
wanhaiyou

Last edited by wanhaiyou; 04 Nov 2015, 19:30.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#7

04 Nov 2015, 21:24

When you are modeling with a quadratic term, you are fitting a U-shaped, or upside down U-shaped relationship to the data. The sign of the coefficient of the quadratic term determines whether the U is upside down (negative) or right side up (positive). If the quadratic term coefficient is close to zero, and if it is not significant, then that suggests that the relationship is rather flat (linear), and it might be best to just re-run the model without the quadratic term.

Assuming that the quadratic term coefficient is significant, generally I recommend reporting the joint significance test for cash and cash^2, stating whether it is a U or an upside-down U shaped relationship, and also identifying where the nadir (or peak) of the U (or upside-down U) is located. The location of the nadir (or peak) is at cash = -_b[cash]/(2*_b[c.cash#c.cash]), which you can evaluate with the -nlcom- command. Whether the coefficient of cash by itself is statistically significant is 100% irrelevant and, as shown in my illustration about recentering, 100% meaningless--you shouldn't even look at that.
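As a minimal sketch of that reporting recipe, using the auto data from #4 with mpg standing in for cash (an assumption for illustration; the label turning_point is arbitrary):

```stata
sysuse auto, clear
regress price c.mpg##c.mpg

// Joint test of the linear and quadratic terms
test mpg c.mpg#c.mpg

// Location of the nadir (or peak), with a delta-method
// standard error and confidence interval
nlcom (turning_point: -_b[mpg]/(2*_b[c.mpg#c.mpg]))

// Compare the turning point with the observed range of mpg
summarize mpg
```

Given the centered regression in #4, where the slope at mpg = 30 is small and positive, the estimate should land a little below 30. If a turning point falls outside the observed range of the variable, the fitted curve is monotone over the data, and calling it U-shaped would overstate what the data show.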
Comment
wanhaiyou

Join Date: May 2014

Posts: 130
#8

04 Nov 2015, 23:42

Thanks very much for your kind reply. I have another question. If the interaction term (e.g. cash*cash) is substantively or statistically significant, but the cash term is not significant, how should I interpret this result?

Thanks for your help.

Kind regards,
wanhaiyou
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#9

05 Nov 2015, 00:45

wanhaiyou:
your query has already been addressed in Clyde's superb reply (see above).
In a nutshell, if the quadratic term is statistically significant, there is enough evidence that the relationship between the variables is quadratic (no matter the statistical significance of the linear term).
If the quadratic term is not statistically significant, you might think of dropping the 2nd-order term and sticking with the linear one only. In my opinion, you might also want to keep a non-significant quadratic term if you suspect that its lack of statistical significance has more to do with the limited sample size under investigation than with the absence of a real non-linear relationship.

Kind regards,
Carlo
(Stata 19.0)
Comment
wanhaiyou

Join Date: May 2014

Posts: 130
#10

05 Nov 2015, 01:00

Thanks very much, Carlo. Yes, if that is the case (the quadratic term is significant but the linear term is not), how should we calculate the impact of this variable (e.g. cash) on the dependent variable? Is it still equal to beta1+2*beta2*cash? (I assume the equation is y = cons + beta1*cash+beta2*cash^2+error.)

Best regards,
wanhaiyou
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#11

05 Nov 2015, 02:28

wanhaiyou:
yes, the equation should be

y = cons + beta1*cash+beta2*cash^2+error

if you have only cash and cash^2 as predictors.
The next step should be the calculation of the turning point (maximum or minimum of the U-shaped function) using the formula provided by Clyde in post #7:

cash = -_b[cash]/(2*_b[c.cash#c.cash])

The sign of the quadratic term tells you whether the turning point of the U-shaped function is a max (the sign of the quadratic term is negative) or a min (the sign of the quadratic term is positive).
The coefficient of the quadratic term describes in quantitative terms the relationship between squared cash and the dependent variable.
Finally, if the squared term is created via -fvvarlist- (a highly advisable approach), you can also exploit the features of -margins- and -marginsplot-.
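A minimal sketch of that last suggestion, using the auto data from #4 with mpg in place of cash (an assumption for illustration; the grid of mpg values is arbitrary):

```stata
sysuse auto, clear
regress price c.mpg##c.mpg

// Predicted price over a grid of mpg values; this works because the
// square was entered as c.mpg#c.mpg, not as a separately generated variable
margins, at(mpg = (12(4)40))

// Plot the fitted curve with its confidence intervals
marginsplot
```

The plot makes the shape of the relationship, and the approximate location of the turning point, visible at a glance.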

Kind regards,
Carlo
(Stata 19.0)
Comment
wanhaiyou

Join Date: May 2014

Posts: 130
#12

05 Nov 2015, 18:34

Hi Carlo, Thanks very much for your clarification. I cannot understand why we can calculate the impact of this variable (e.g. cash) on the dependent variable (e.g. beta1+2*beta2*cash) when the cash term is not statistically significant.

Kind regards, wanhaiyou
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#13

05 Nov 2015, 19:40

In a model without quadratic terms, say outcome = cons + beta1*cash + other linear terms + error, the effect of cash on the outcome is given by beta1. Note that beta1 is also the first derivative of outcome with respect to cash.

Now suppose your model is, as per Carlo, outcome = cons + beta1*cash + beta2*cash^2 + error, with a quadratic term.

The effect of an increase in cash on the outcome, at a given level of cash, is again given by the first derivative, d outcome/d cash. Standard calculus formulas show this to be

Code:

beta1 + 2*beta2*cash

Note also that in the quadratic model there is no such thing as "the effect of cash" on the outcome overall, because the effect very much varies with the level of cash itself.

I don't understand why you think it matters whether the coefficient of cash is statistically significant in this setting. With the quadratic model, the coefficient of cash is an artifact of where the cash variable is centered, and you can make that coefficient turn out to be any number at all (including 0) with a corresponding choice of centering. (See example in #4).
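To make the derivative concrete, here is a sketch with the auto data from #4 (mpg standing in for cash; the evaluation points are arbitrary choices for illustration). -margins, dydx()- reports the marginal effect at each chosen level, and -lincom- reproduces one of them by hand.

```stata
sysuse auto, clear
regress price c.mpg##c.mpg

// Marginal effect of mpg, i.e. beta1 + 2*beta2*mpg, at several levels of mpg
margins, dydx(mpg) at(mpg = (15 20 25 30 35))

// The same quantity by hand at mpg = 20: beta1 + 2*beta2*20
lincom _b[mpg] + 40*_b[c.mpg#c.mpg]
```

The -lincom- result should match the -margins- row for mpg = 20, and the effect changes in size and sign across the levels, which is why no single number summarizes "the effect" in a quadratic model.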
Comment
wanhaiyou

Join Date: May 2014

Posts: 130
#14

05 Nov 2015, 20:06

Hi Clyde
Thanks very much. As you say, we could drop the terms whose coefficients turn out to be 0 (if I understand correctly). Is that right? To my knowledge, the coefficient of a variable depends on its unit, so we cannot say that a variable is unimportant when we change the unit from million dollars to dollars (in which case the coefficient becomes small) if the variable is statistically significant.
Please forgive me if I have misunderstood.

Kind regards, wanhaiyou
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#15

05 Nov 2015, 22:04

It is not in general correct to drop terms with non-significant coefficients, or even with zero coefficients--statistical significance is only one consideration, and often the least important.

But if you are trying to decide whether a linear or quadratic model better describes your data, it is reasonable to drop the quadratic term if its coefficient is not statistically significant. If you do keep the quadratic term in the model, you then have to interpret everything else accordingly. And that means that the coefficient of the linear term cash is not the effect of a unit increment in cash on the outcome; rather you have to use what was shown to you by Carlo and me in #7, #11, and #13.

I'm not sure I understood your question, so I'm not sure I've answered it, but I hope I have.
Comment
