omitted because of collinearity then how to interpret the result

yang yu

Join Date: Aug 2015

Posts: 13
#1

omitted because of collinearity then how to interpret the result

01 Aug 2015, 18:11

Dear all:

I am working on a dissertation, using Stata 13, that investigates how reviews (from viewers and from experts) influence box office. I first divide the viewers' ratings (raverage) into four categories, and do the same for the experts' ratings (av_expert).

The commands I use:
egen raveragek=cut(raverage),group(4) label

tabulate raveragek

g cat_raverage=1 if raverage<=6.2

replace cat_raverage=2 if raverage>6.2 & raverage<=7.1

replace cat_raverage=3 if raverage>7.1 & raverage<=7.8

replace cat_raverage=4 if raverage>7.8 & raverage<=10

label define cat_raverage 1 "low" 2 "mid" 3 "upper" 4 "up"

label val cat_raverage cat_raverage

g cat_expert=1 if av_expert>1 & av_expert<=2.375

replace cat_expert=2 if av_expert>2.375 & av_expert<=3

replace cat_expert=3 if av_expert>3 & av_expert<=3.6

replace cat_expert=4 if av_expert>3.6 & av_expert<=5

label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

label val cat_expert cat_expert
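
A more compact way to build the same four bands is irecode(), which returns 0 for values at or below the first cutpoint, 1 for the next band, and so on. The sketch below runs on six invented ratings (not data from this thread). It also shows that the condition av_expert>1 in the first gen line leaves a rating of exactly 1 uncategorized, which may or may not be intended. Note that irecode() has no upper cap, so anything above 3.6 lands in band 4.

```stata
* Toy data: the ratings below are invented for illustration only
clear
input float av_expert
1
2.375
2.5
3
3.5
4
end

* Right-closed bands: (.,2.375], (2.375,3], (3,3.6], (3.6,.)
* irecode() counts from 0, so add 1 to get categories 1 to 4
gen byte cat_expert = irecode(av_expert, 2.375, 3, 3.6) + 1
label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"
label values cat_expert cat_expert

* For comparison: the original first step skips av_expert == 1
gen byte cat_orig = 1 if av_expert > 1 & av_expert <= 2.375

list av_expert cat_expert cat_orig, noobs
```

If a rating of exactly 1 can occur in the real data, tabulating cat_expert with the missing option would reveal whether any films fell out of the categorization.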

Then I run the xtreg, fe command:
sort title_numeric week

by title_numeric:gen lagBO=Total_BO[_n-1]

duplicates drop title_numeric week,force

tsset title_numeric week

xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe

estimates store fe
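
One thing worth checking in the sequence above is the order of operations: lagBO is generated before the duplicates are dropped and before the panel is declared, so Total_BO[_n-1] picks up the previous row, whatever week it belongs to. lagBO is not used in the model shown, so this would not explain the omitted coefficients. A minimal sketch on invented data (one title, with week 3 deliberately absent) shows how the row-based lag differs from the L. operator, which respects gaps:

```stata
* Toy panel: values are invented; week 3 is deliberately absent
clear
input title_numeric week Total_BO
1 1 100
1 2 80
1 4 50
end

* Declare the panel first, then let the time-series operator build the lag
xtset title_numeric week
gen lag_ts = L.Total_BO

* Row-based lag, as in the original commands
sort title_numeric week
by title_numeric: gen lag_row = Total_BO[_n-1]

list, noobs sepby(title_numeric)
```

For week 4, lag_ts is missing because week 3 does not exist, while lag_row silently carries the week 2 value.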

The result is:
. tsset title_numeric week
       panel variable:  title_numeric (strongly balanced)
        time variable:  week, 2 to 56
                delta:  1 unit

. xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe
note: 2.cat_expert omitted because of collinearity
note: 3.cat_expert omitted because of collinearity
note: 4.cat_expert omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =       698
Group variable: title_nume~c                    Number of groups   =       197

R-sq:  within  = 0.3164                         Obs per group: min =         1
       between = 0.2583                                        avg =       3.5
       overall = 0.0457                                        max =        24

                                                F(5,496)           =     45.92
corr(u_i, Xb)  = -0.4704                        Prob > F           =    0.0000

------------------------------------------------------------------------------
    Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
cat_raverage |
          2  |   121979.3   366679.3     0.33   0.740    -598456.8    842415.5
          3  |  -289544.7     432106    -0.67   0.503     -1138528      559439
          4  |  -480633.7   448387.3    -1.07   0.284     -1361606    400338.9
             |
  cat_expert |
       midd  |          0  (omitted)
     upperr  |          0  (omitted)
        upp  |          0  (omitted)
             |
   total_exp |   .5948315   .3804444     1.56   0.119    -.1526498    1.342313
     cinemas |  -11791.95   810.0277   -14.56   0.000    -13383.46   -10200.44
       _cons |    7083597   508517.7    13.93   0.000      6084483     8082711
-------------+----------------------------------------------------------------
     sigma_u |    5660993
     sigma_e |  2165343.4
         rho |  .87236583   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(196, 496) =    17.91            Prob > F = 0.0000

After running the Hausman test, I found that I should use the fixed-effects model.

But how can I interpret the results for cat_expert when all of its levels are omitted? Or is there something wrong with my result?

Thanks
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17105
#2

02 Aug 2015, 01:20

Yang:
I suspect that cat_expert overlaps total_exp.
If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.

Kind regards,
Carlo
(StataNow 18.5)
yang yu

Join Date: Aug 2015

Posts: 13
#3

02 Aug 2015, 05:51

Originally posted by Carlo Lazzaro:

Yang:
I suspect that cat_expert overlaps total_exp.
If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.

Hi Carlo,
Thank you very much for your help again. I ran many tests to find out what the problem is. No matter which variable I omit, the collinearity still occurs in the same way. Finally, I ran only "xtreg Total_BO i.cat_expert , fe" and the result is the same!

. xtreg Total_BO i.cat_expert , fe
note: 2.cat_expert omitted because of collinearity
note: 3.cat_expert omitted because of collinearity
note: 4.cat_expert omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =      1083
Group variable: title_nume~c                    Number of groups   =       325

R-sq:  within  = 0.0000                         Obs per group: min =         1
       between = 0.0442                                        avg =       3.3
       overall =      .                                        max =        23

                                                F(0,758)           =      0.00
corr(u_i, Xb)  =      .                         Prob > F           =         .

------------------------------------------------------------------------------
    Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  cat_expert |
       midd  |          0  (omitted)
     upperr  |          0  (omitted)
        upp  |          0  (omitted)
             |
       _cons |    6006611   96549.48    62.21   0.000      5817074     6196147
-------------+----------------------------------------------------------------
     sigma_u |  5962414.5
     sigma_e |  3177343.6
         rho |  .77882981   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(324, 758) =    31.37            Prob > F = 0.0000

. estimates store fe

. xtreg Total_BO i.cat_expert , re

Random-effects GLS regression                   Number of obs      =      1083
Group variable: title_nume~c                    Number of groups   =       325

R-sq:  within  = 0.0000                         Obs per group: min =         1
       between = 0.0120                                        avg =       3.3
       overall = 0.0067                                        max =        23

                                                Wald chi2(3)       =      4.54
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.2087

------------------------------------------------------------------------------
    Total_BO |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  cat_expert |
       midd  |  -656884.1     894098    -0.73   0.463     -2409284     1095516
     upperr  |    1367830   981605.2     1.39   0.163    -556080.7     3291741
        upp  |   315916.1   953871.2     0.33   0.740     -1553637     2185469
             |
       _cons |    2574030   647017.3     3.98   0.000      1305899     3842160
-------------+----------------------------------------------------------------
     sigma_u |  5393760.7
     sigma_e |  3177343.6
         rho |  .74238365   (fraction of variance due to u_i)
------------------------------------------------------------------------------
It is so weird.

I am starting to wonder whether the problem lies in the way I divide av_expert into categories.

Could you tell me whether the following commands are OK?

g cat_expert=1 if av_expert>1 & av_expert<=2.375

replace cat_expert=2 if av_expert>2.375 & av_expert<=3

replace cat_expert=3 if av_expert>3 & av_expert<=3.6

replace cat_expert=4 if av_expert>3.6 & av_expert<=5

label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

label val cat_expert cat_expert

Thank you very much!!
yang yu

Join Date: Aug 2015

Posts: 13
#4

02 Aug 2015, 05:57

Originally posted by Carlo Lazzaro:

Yang:
I suspect that cat_expert overlaps total_exp.
If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.

Even when I don't divide it into categories and simply run "xtreg Total_BO av_expert , fe", it still happens!

. xtreg Total_BO av_expert , fe
note: av_expert omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =      1137
Group variable: title_nume~c                    Number of groups   =       338

R-sq:  within  = 0.0000                         Obs per group: min =         1
       between = 0.0128                                        avg =       3.4
       overall =      .                                        max =        22

                                                F(0,799)           =      0.00
corr(u_i, Xb)  =      .                         Prob > F           =         .

------------------------------------------------------------------------------
    Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   av_expert |          0  (omitted)
       _cons |    5901429   102772.7    57.42   0.000      5699693     6103165
-------------+----------------------------------------------------------------
     sigma_u |  5746953.4
     sigma_e |  3465435.5
         rho |   .7333455   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(337, 799) =    25.32            Prob > F = 0.0000

So weird... I believe there is no problem with the dataset.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17105
#5

02 Aug 2015, 06:02

Yang:
it might be that av_expert is collinear with -fe-:
Check how you did -xtset- your data.

Kind regards,
Carlo
(StataNow 18.5)
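
A quick way to test this suspicion is xtsum, which splits each variable's variation into between-panel and within-panel parts. The sketch below uses an invented two-title panel (none of these numbers come from the thread) in which av_expert is fixed per title and raverage changes by week. A within standard deviation of zero means the fixed effects would absorb the variable completely.

```stata
* Toy panel: two titles, three weeks each; all numbers are invented
clear
input title_numeric week av_expert raverage
1 1 2.5 6.0
1 2 2.5 6.5
1 3 2.5 7.0
2 1 4.0 7.5
2 2 4.0 7.0
2 3 4.0 8.0
end

xtset title_numeric week

* The "within" rows show the variation a fixed-effects model can use
xtsum av_expert raverage

* Rerun for av_expert alone to pick up its stored within-panel SD
quietly xtsum av_expert
display "within SD of av_expert = " r(sd_w)
```

Running the same xtsum on the real data, after the xtset step, would show directly whether av_expert has any within-title variation.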
yang yu

Join Date: Aug 2015

Posts: 13
#6

02 Aug 2015, 06:38

Originally posted by Carlo Lazzaro:

Yang:
it might be that av_expert is collinear with -fe-:
Check how you did -xtset- your data.

Dear Carlo:
Thank you for your reply. I checked it, but I cannot find anything that could cause the problem. I am not nearly as good at this as you are. Could you help me check the commands I use, in case something is wrong? Thanks a lot. All the commands I use are as follows:

egen raveragek=cut(raverage),group(4) label

tabulate raveragek

g cat_raverage=1 if raverage<=6.2

replace cat_raverage=2 if raverage>6.2 & raverage<=7.1

replace cat_raverage=3 if raverage>7.1 & raverage<=7.8

replace cat_raverage=4 if raverage>7.8 & raverage<=10

label define cat_raverage 1 "low" 2 "mid" 3 "upper" 4 "up"

label val cat_raverage cat_raverage

g cat_expert=1 if av_expert>1 & av_expert<=2.375

replace cat_expert=2 if av_expert>2.375 & av_expert<=3

replace cat_expert=3 if av_expert>3 & av_expert<=3.6

replace cat_expert=4 if av_expert>3.6 & av_expert<=5

label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

label val cat_expert cat_expert

encode title, gen(title_numeric)

sort title_numeric week

by title_numeric:gen lagBO=Total_BO[_n-1]

duplicates drop title_numeric week,force

tsset title_numeric week

xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe

estimates store fe

xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, re

estimates store re

hausman fe re,sigmamore
yang yu

Join Date: Aug 2015

Posts: 13
#7

02 Aug 2015, 06:42

Originally posted by Carlo Lazzaro:

Yang:
it might be that av_expert is collinear with -fe-:
Check how you did -xtset- your data.

I ran these commands. The result is the same.

encode title, gen(title_numeric)

sort title_numeric week

by title_numeric:gen lagBO=Total_BO[_n-1]

duplicates drop title_numeric week,force

tsset title_numeric week

xtreg Total_BO av_expert , fe
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17105
#8

02 Aug 2015, 07:57

Yang:
why did you use -encode- to create your panel_id?

Kind regards,
Carlo
(StataNow 18.5)
yang yu

Join Date: Aug 2015

Posts: 13
#9

02 Aug 2015, 08:12

Originally posted by Carlo Lazzaro:

Yang:
why did you use -encode- to create your panel_id?

Because title holds the names of the films, so it is a string variable.

title
2 days in new york
2 days in new york
2 days in new york
2 days in new york
2 days in new york
2 days in new york
2 days in new york
21 jump street
21 jump street
21 jump street
21 jump street
21 jump street
21 jump street
.....
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17105
#10

02 Aug 2015, 09:03

Yang:
did you check that -encode- always lists the same title with the same number?

Kind regards,
Carlo
(StataNow 18.5)
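
For what it is worth, encode gives identical strings the same code, so the usual risk is variant spellings of one title (different case or stray spaces) ending up as separate panels. The sketch below, on three invented rows, normalises the string before building the identifier. The names title_clean and title_id are made up for this illustration.

```stata
* Toy data: two spellings of one film plus a second film (invented rows)
clear
input str30 title
"21 jump street"
" 21 Jump Street"
"2 days in new york"
end

* encode assigns one code per distinct string, so the two spellings split
encode title, gen(title_numeric)

* Normalise case and spacing before building the panel identifier
gen title_clean = trim(itrim(lower(title)))
egen long title_id = group(title_clean)

list title title_numeric title_id, noobs
```

If the number of distinct title_id values matches the number of distinct title_numeric values in the real data, spelling variants are not the issue.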
yang yu

Join Date: Aug 2015

Posts: 13
#11

02 Aug 2015, 09:31

Originally posted by Carlo Lazzaro:

Yang:
did you check that -encode- always lists the same title with the same number?

Dear Carlo:

Yes, I checked it; it is always the same. This is driving me crazy.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17105
#12

02 Aug 2015, 09:36

Yang:
at this point, I would consider replacing the collinear predictor in your regression with some other independent variable (provided this is a feasible approach).

Kind regards,
Carlo
(StataNow 18.5)
yang yu

Join Date: Aug 2015

Posts: 13
#13

02 Aug 2015, 09:50

Originally posted by Carlo Lazzaro:

Yang:
at this point, I would consider replacing the collinear predictor in your regression with some other independent variable (provided this is a feasible approach).

Dear Carlo:
Thanks for your quick reply.
You mean I should no longer test this variable? But my aim is to test whether reviews influence total box office, and reviews include both the experts' ratings and the viewers' ratings. If I omit the experts' ratings, the analysis does not look good.

But if this is the only way to deal with it, then I will have to do it.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#14

02 Aug 2015, 10:01

Based on the xtreg results in post #3 above, I think the problem is that for every panel (every value of title_numeric), every observation in the panel has the same value for cat_expert. When you have a fixed effects model, anything that's the same for every observation in a panel will be collinear with the fixed effects. You construct cat_expert from a variable called av_expert: is that an average value of some sort, where the average was taken within each separate title?
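
One way to confirm this is to flag the titles in which cat_expert takes more than one value. The sketch below uses an invented five-row panel; the variable names varies and onepertitle are made up for this illustration. A count of zero means no title has any change in cat_expert, which is exactly the situation in which a fixed-effects model cannot estimate its coefficients.

```stata
* Toy panel: invented values; cat_expert is constant within each title
clear
input title_numeric week cat_expert
1 1 2
1 2 2
1 3 2
2 1 4
2 2 4
end

* Sorting by cat_expert within title puts the smallest value first and the
* largest last, so the two differ only if the variable changes within a title
bysort title_numeric (cat_expert): gen byte varies = cat_expert[1] != cat_expert[_N]

* Count titles (not rows) in which cat_expert changes over time
egen byte onepertitle = tag(title_numeric)
count if varies & onepertitle
```

Missing values sort last in Stata, so a title with cat_expert missing in some weeks would be flagged as varying and is worth inspecting separately.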
yang yu

Join Date: Aug 2015

Posts: 13
#15

02 Aug 2015, 10:12

Originally posted by William Lisowski:

Based on the xtreg results in post #3 above, I think the problem is that for every panel (every value of title_numeric), every observation in the panel has the same value for cat_expert. When you have a fixed effects model, anything that's the same for every observation in a panel will be collinear with the fixed effects. You construct cat_expert from a variable called av_expert: is that an average value of some sort, where the average was taken within each separate title?

William

Thanks for your reply.

Yes, av_expert is the average of the experts' ratings. But the other variable, raverage, for the viewers' ratings, is an average as well, and raverage is fine.
Do you mean there is a problem with the variable av_expert itself?
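
The difference between the two averages can be reproduced on a small invented panel. The sketch assumes, as the output earlier in the thread suggests but does not prove, that av_expert is one number per film while raverage changes from week to week. All values below are made up.

```stata
* Toy panel: three titles, three weeks each; every number is invented
clear
input title_numeric week Total_BO av_expert raverage
1 1 900 2.5 6.0
1 2 700 2.5 6.4
1 3 400 2.5 6.3
2 1 800 4.0 7.5
2 2 650 4.0 7.1
2 3 300 4.0 7.4
3 1 500 3.0 5.0
3 2 450 3.0 5.6
3 3 200 3.0 5.2
end

xtset title_numeric week

* av_expert never changes within a title, so the title fixed effects absorb it;
* raverage moves from week to week, so its coefficient can be estimated
xtreg Total_BO av_expert raverage, fe
```

On this toy data Stata drops av_expert with the same collinearity note and keeps raverage. So the omission points to a property of how av_expert varies in the panel, not necessarily to an error in the variable.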