
  • Variables omitted because of collinearity: how to interpret the result?

    Dear all:

    I am working on my dissertation, using Stata 13 to investigate how critics (viewers and experts) influence box office. I first divided the viewers' ratings (raverage) into four categories, and did the same for the experts' ratings (av_expert).

    I used these commands:
    egen raveragek=cut(raverage),group(4) label

    tabulate raveragek

    g cat_raverage=1 if raverage<=6.2

    replace cat_raverage=2 if raverage>6.2 & raverage<=7.1

    replace cat_raverage=3 if raverage>7.1 & raverage<=7.8

    replace cat_raverage=4 if raverage>7.8 & raverage<=10

    label define cat_reverage 1 "low" 2 "mid" 3 "upper" 4 "up"

    label val cat_raverage cat_raverage

    g cat_expert=1 if av_expert>1 & av_expert<=2.375

    replace cat_expert=2 if av_expert>2.375 & av_expert<=3

    replace cat_expert=3 if av_expert>3 & av_expert<=3.6

    replace cat_expert=4 if av_expert>3.6 & av_expert<=5

    label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

    label val cat_expert cat_expert

    Then I ran the xtreg, fe command:
    sort title_numeric week

    by title_numeric:gen lagBO=Total_BO[_n-1]

    duplicates drop title_numeric week,force

    tsset title_numeric week

    xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe

    estimates store fe

    The result is:
    . tsset title_numeric week
           panel variable:  title_numeric (strongly balanced)
            time variable:  week, 2 to 56
                    delta:  1 unit

    . xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe
    note: 2.cat_expert omitted because of collinearity
    note: 3.cat_expert omitted because of collinearity
    note: 4.cat_expert omitted because of collinearity

    Fixed-effects (within) regression               Number of obs      =       698
    Group variable: title_nume~c                    Number of groups   =       197

    R-sq:  within  = 0.3164                         Obs per group: min =         1
           between = 0.2583                                        avg =       3.5
           overall = 0.0457                                        max =        24

                                                    F(5,496)           =     45.92
    corr(u_i, Xb)  = -0.4704                        Prob > F           =    0.0000

    ------------------------------------------------------------------------------
        Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    cat_raverage |
              2  |   121979.3   366679.3     0.33   0.740    -598456.8    842415.5
              3  |  -289544.7     432106    -0.67   0.503     -1138528      559439
              4  |  -480633.7   448387.3    -1.07   0.284     -1361606    400338.9
                 |
      cat_expert |
           midd  |          0  (omitted)
         upperr  |          0  (omitted)
            upp  |          0  (omitted)
                 |
       total_exp |   .5948315   .3804444     1.56   0.119    -.1526498    1.342313
         cinemas |  -11791.95   810.0277   -14.56   0.000    -13383.46   -10200.44
           _cons |    7083597   508517.7    13.93   0.000      6084483     8082711
    -------------+----------------------------------------------------------------
         sigma_u |    5660993
         sigma_e |  2165343.4
             rho |  .87236583   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(196, 496) =    17.91            Prob > F = 0.0000

    After running the Hausman test, I found that I should use the fixed-effects model.

    The problem is: how can I interpret the results for cat_expert when all of its levels are omitted? Or is there something wrong with my results?

    Thanks
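
A side note on the labelling commands in this post: the value label for the viewers' categories is defined under the name cat_reverage but attached under the name cat_raverage, which would explain why the regression table shows 2, 3, 4 for cat_raverage while cat_expert shows midd, upperr, upp. Also, the first cat_expert rule uses av_expert>1, so a rating of exactly 1 would be left missing. The following is only a minimal sketch of the labelling fix, run on made-up toy values of raverage rather than the poster's data:

```stata
* Toy data (hypothetical values), only to make the sketch runnable.
* -clear- drops the data in memory, so do not run this on real work.
clear
input float raverage
5.0
6.5
7.5
9.0
end

* Same cut points as in the post
generate cat_raverage = 1 if raverage <= 6.2
replace  cat_raverage = 2 if raverage > 6.2 & raverage <= 7.1
replace  cat_raverage = 3 if raverage > 7.1 & raverage <= 7.8
replace  cat_raverage = 4 if raverage > 7.8 & raverage <= 10

* Define and attach the value label under one and the same name
label define cat_raverage 1 "low" 2 "mid" 3 "upper" 4 "up"
label values cat_raverage cat_raverage

* The table should now show low/mid/upper/up instead of 1/2/3/4
tabulate cat_raverage
```

This only affects how the categories are displayed; it does not change the estimates or the collinearity notes.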


  • #2
    Yang:
    I suspect that cat_expert overlaps total_exp.
    If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Originally posted by Carlo Lazzaro
      Yang:
      I suspect that cat_expert overlaps total_exp.
      If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.

      Hi Carlo,
      Thank you very much for your help again. I tried many checks to find out what the problem is. No matter which variable I omit, the collinearity still occurs in the same way. In the end I ran only "xtreg Total_BO i.cat_expert, fe", and the result is exactly the same!

      . xtreg Total_BO i.cat_expert , fe
      note: 2.cat_expert omitted because of collinearity
      note: 3.cat_expert omitted because of collinearity
      note: 4.cat_expert omitted because of collinearity

      Fixed-effects (within) regression               Number of obs      =      1083
      Group variable: title_nume~c                    Number of groups   =       325

      R-sq:  within  = 0.0000                         Obs per group: min =         1
             between = 0.0442                                        avg =       3.3
             overall =      .                                        max =        23

                                                      F(0,758)           =      0.00
      corr(u_i, Xb)  =      .                         Prob > F           =         .

      ------------------------------------------------------------------------------
          Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        cat_expert |
             midd  |          0  (omitted)
           upperr  |          0  (omitted)
              upp  |          0  (omitted)
                   |
             _cons |    6006611   96549.48    62.21   0.000      5817074     6196147
      -------------+----------------------------------------------------------------
           sigma_u |  5962414.5
           sigma_e |  3177343.6
               rho |  .77882981   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0:     F(324, 758) =    31.37            Prob > F = 0.0000

      . estimates store fe

      . xtreg Total_BO i.cat_expert , re

      Random-effects GLS regression                   Number of obs      =      1083
      Group variable: title_nume~c                    Number of groups   =       325

      R-sq:  within  = 0.0000                         Obs per group: min =         1
             between = 0.0120                                        avg =       3.3
             overall = 0.0067                                        max =        23

                                                      Wald chi2(3)       =      4.54
      corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.2087

      ------------------------------------------------------------------------------
          Total_BO |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        cat_expert |
             midd  |  -656884.1     894098    -0.73   0.463     -2409284     1095516
           upperr  |    1367830   981605.2     1.39   0.163    -556080.7     3291741
              upp  |   315916.1   953871.2     0.33   0.740     -1553637     2185469
                   |
             _cons |    2574030   647017.3     3.98   0.000      1305899     3842160
      -------------+----------------------------------------------------------------
           sigma_u |  5393760.7
           sigma_e |  3177343.6
               rho |  .74238365   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      It is so weird.

      I am starting to wonder whether the problem lies in the way I divided av_expert into categories.

      Could you tell me whether the following commands are OK?

      g cat_expert=1 if av_expert>1 & av_expert<=2.375

      replace cat_expert=2 if av_expert>2.375 & av_expert<=3

      replace cat_expert=3 if av_expert>3 & av_expert<=3.6

      replace cat_expert=4 if av_expert>3.6 & av_expert<=5

      label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

      label val cat_expert cat_expert

      Thank you very much!!



      • #4
        Originally posted by Carlo Lazzaro
        Yang:
        I suspect that cat_expert overlaps total_exp.
        If this is the case, you should omit one of them; otherwise, Stata will do it on your behalf.

        It happens even if I don't divide it into categories.

        When I leave av_expert as it is and run "xtreg Total_BO av_expert, fe", it still happens:

        . xtreg Total_BO av_expert , fe
        note: av_expert omitted because of collinearity

        Fixed-effects (within) regression               Number of obs      =      1137
        Group variable: title_nume~c                    Number of groups   =       338

        R-sq:  within  = 0.0000                         Obs per group: min =         1
               between = 0.0128                                        avg =       3.4
               overall =      .                                        max =        22

                                                        F(0,799)           =      0.00
        corr(u_i, Xb)  =      .                         Prob > F           =         .

        ------------------------------------------------------------------------------
            Total_BO |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
           av_expert |          0  (omitted)
               _cons |    5901429   102772.7    57.42   0.000      5699693     6103165
        -------------+----------------------------------------------------------------
             sigma_u |  5746953.4
             sigma_e |  3465435.5
                 rho |   .7333455   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0:     F(337, 799) =    25.32            Prob > F = 0.0000


        So weird. I believe there is no problem with the dataset.



        • #5
          Yang:
          it might be that av_expert is collinear with -fe-:
          Check how you did -xtset- your data.
          Kind regards,
          Carlo
          (StataNow 18.5)
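
One way to act on the suggestion to check the -xtset- declaration is sketched below. It is only an illustration on a made-up toy panel (the values are hypothetical); the variable names follow the thread. It confirms which variables Stata treats as the panel and time identifiers and that each film-week pair occurs only once.

```stata
* Toy panel (hypothetical values), only to make the sketch runnable.
* -clear- drops the data in memory, so do not run this on real work.
clear
input int title_numeric int week double Total_BO double av_expert
1 1 100 3.5
1 2  80 3.5
1 3  60 3.5
2 1 200 2.0
2 2 150 2.0
2 3  90 2.0
end

* Each film-week combination should identify exactly one observation;
* -isid- stops with an error if it does not
isid title_numeric week

* Declare the panel, then redisplay the current settings
xtset title_numeric week
xtset

* Pattern of weeks observed per film
xtdescribe
```

If -isid- fails on the real data, -duplicates report title_numeric week- shows how many repeated film-week pairs there are before anything is dropped.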



          • #6
            Originally posted by Carlo Lazzaro
            Yang:
            it might be that av_expert is collinear with -fe-:
            Check how you did -xtset- your data.

            Dear Carlo:
            Thank you for your reply. I checked it, but I cannot find anything that could cause the problem. I am nowhere near as good at this as you are. Could you help me check the commands I used, in case something is wrong? Thanks a lot. All the commands I used are as follows:

            egen raveragek=cut(raverage),group(4) label

            tabulate raveragek

            g cat_raverage=1 if raverage<=6.2

            replace cat_raverage=2 if raverage>6.2 & raverage<=7.1

            replace cat_raverage=3 if raverage>7.1 & raverage<=7.8

            replace cat_raverage=4 if raverage>7.8 & raverage<=10

            label define cat_reverage 1 "low" 2 "mid" 3 "upper" 4 "up"

            label val cat_raverage cat_raverage

            g cat_expert=1 if av_expert>1 & av_expert<=2.375

            replace cat_expert=2 if av_expert>2.375 & av_expert<=3

            replace cat_expert=3 if av_expert>3 & av_expert<=3.6

            replace cat_expert=4 if av_expert>3.6 & av_expert<=5

            label define cat_expert 1 "loww" 2 "midd" 3 "upperr" 4 "upp"

            label val cat_expert cat_expert

            encode title, gen(title_numeric)

            sort title_numeric week

            by title_numeric:gen lagBO=Total_BO[_n-1]

            duplicates drop title_numeric week,force

            tsset title_numeric week

            xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, fe

            estimates store fe

            xtreg Total_BO i.cat_raverage i.cat_expert total_exp cinemas, re

            estimates store re

            hausman fe re,sigmamore
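
A remark on one step of the listing above: lagBO is built as Total_BO[_n-1] before the duplicates are dropped and before the data are -tsset-. That picks the previous observation of the film, which is the previous week only if no week is missing and no film-week is repeated. A minimal sketch of the alternative, on a made-up toy panel (hypothetical values; lagBO_pos is an illustrative name), uses the L. time-series operator after the panel is declared:

```stata
* Toy panel (hypothetical values); film 1 has no observation for week 3.
* -clear- drops the data in memory, so do not run this on real work.
clear
input int title_numeric int week double Total_BO
1 1 100
1 2  80
1 4  50
2 1 200
2 2 150
2 3  90
end

* Report repeated film-week pairs before deciding to drop anything
duplicates report title_numeric week

* Declare the panel
xtset title_numeric week

* Lag by position: the previous observation, whatever its week
bysort title_numeric (week): generate lagBO_pos = Total_BO[_n-1]

* Lag by time: the value from week-1, missing when that week is absent
generate lagBO = L.Total_BO

list, sepby(title_numeric)
```

In the toy data the two versions differ for film 1 in week 4: the positional lag takes the week 2 value, while L.Total_BO is missing because week 3 is not observed.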



            • #7
              Originally posted by Carlo Lazzaro
              Yang:
              it might be that av_expert is collinear with -fe-:
              Check how you did -xtset- your data.

              I ran these commands. The result is the same.

              encode title, gen(title_numeric)

              sort title_numeric week

              by title_numeric:gen lagBO=Total_BO[_n-1]

              duplicates drop title_numeric week,force

              tsset title_numeric week

              xtreg Total_BO av_expert , fe



              • #8
                Yang:
                why did you use -encode- to create your panel_id?
                Kind regards,
                Carlo
                (StataNow 18.5)



                • #9
                  Originally posted by Carlo Lazzaro
                  Yang:
                  why did you use -encode- to create your panel_id?

                  Because my title variable holds the names of the films, which are strings.

                  title
                  2 days in new york
                  2 days in new york
                  2 days in new york
                  2 days in new york
                  2 days in new york
                  2 days in new york
                  2 days in new york
                  21 jump street
                  21 jump street
                  21 jump street
                  21 jump street
                  21 jump street
                  21 jump street
                  .....



                  • #10
                    Yang:
                    did you check that -encode- always lists the same title with the same number?
                    Kind regards,
                    Carlo
                    (StataNow 18.5)
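
A possible way to run this check is sketched below. -encode- gives identical strings the same code, so what can go wrong is one film being spelled in slightly different ways (case, stray blanks). The example uses made-up toy titles, and title_clean and title_id are hypothetical names introduced only for illustration.

```stata
* Toy data (hypothetical): one film spelled in two ways.
* -clear- drops the data in memory, so do not run this on real work.
clear
input str30 title
"21 jump street"
"21 Jump Street"
"2 days in new york"
end

encode title, generate(title_numeric)

* Direct check: within each title string there is exactly one code
bysort title (title_numeric): assert title_numeric[1] == title_numeric[_N]

* Harmonise case and outer blanks, then encode again.
* trim() and lower() are the Stata 13 names of these functions.
generate title_clean = lower(trim(title))
encode title_clean, generate(title_id)

* Fewer distinct codes after cleaning point to spelling variants
quietly tabulate title_numeric
display "distinct codes before cleaning: " r(r)
quietly tabulate title_id
display "distinct codes after cleaning:  " r(r)
```

If both counts agree on the real data, the panel identifier is unlikely to be the source of the collinearity notes.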



                    • #11
                      Originally posted by Carlo Lazzaro
                      Yang:
                      did you check that -encode- always lists the same title with the same number?

                      Dear Carlo:

                      Yes, I checked it; it is always the same. This is driving me crazy.



                      • #12
                        Yang:
                        at this point, I would consider replacing the collinear predictor in your regression with some other independent variable (provided this is a feasible approach).
                        Kind regards,
                        Carlo
                        (StataNow 18.5)



                        • #13
                          Originally posted by Carlo Lazzaro
                          Yang:
                          at this point, I would consider replacing the collinear predictor in your regression with some other independent variable (provided this is a feasible approach).

                          Dear Carlo:
                          Thanks for your quick reply.
                          You mean I should no longer test this variable. But my aim is to test whether critics influence the total box office, and critics include both the experts' ratings and the viewers' ratings. If I omit the experts' ratings, the analysis does not look very good.

                          But if this is the only way to deal with it, then I will have to do it.



                          • #14
                            Based on the xtreg results in post #3 above, I think the problem is that for every panel (every value of title_numeric), every observation in the panel has the same value for cat_expert. In a fixed-effects model, any variable that takes the same value in every observation of a panel is collinear with the fixed effects. You construct cat_expert from a variable called av_expert: is that an average value of some sort, where the average was taken within each separate title?
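
The reasoning in this post can be written out with the within transformation that a fixed-effects estimator is built on. The notation is an assumption made here for illustration, not taken from the thread: i indexes films (title_numeric), t indexes weeks, x_{it} is a regressor that changes over the weeks, and z_i is a regressor with one value per film, which is what av_expert appears to be.

```latex
\begin{align*}
y_{it} &= \alpha_i + x_{it}\beta + z_i\gamma + \varepsilon_{it} \\
\bar{y}_i &= \alpha_i + \bar{x}_i\beta + z_i\gamma + \bar{\varepsilon}_i \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)
\end{align*}
```

Subtracting the film mean removes both the film effect \alpha_i and the term z_i\gamma, so \gamma cannot be estimated from within-film variation. That is consistent with the fe output reporting the variable as omitted while the re output, which also uses between-film variation, reports coefficients for it.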



                            • #15
                              Originally posted by William Lisowski
                              Based on the xtreg results in post #3 above, I think the problem is that for every panel (every value of title_numeric), every observation in the panel has the same value for cat_expert. In a fixed-effects model, any variable that takes the same value in every observation of a panel is collinear with the fixed effects. You construct cat_expert from a variable called av_expert: is that an average value of some sort, where the average was taken within each separate title?

                              William,

                              Thanks for your reply.

                              Yes, av_expert is the average of the experts' ratings. But the other variable, raverage, for the viewers' ratings, is an average as well, and raverage is fine.
                              Do you mean there is some problem with the variable av_expert itself?
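
One way to see why raverage is estimated while av_expert is not is to compare how much each variable changes over the weeks of the same film. The sketch below runs on a made-up toy panel (hypothetical values, chosen so that raverage changes from week to week and av_expert does not); rav_varies and exp_varies are illustrative names.

```stata
* Toy panel (hypothetical values), only to make the sketch runnable.
* -clear- drops the data in memory, so do not run this on real work.
clear
input int title_numeric int week double raverage double av_expert
1 1 7.0 3.5
1 2 7.2 3.5
1 3 7.1 3.5
2 1 6.0 2.0
2 2 6.4 2.0
2 3 6.2 2.0
end
xtset title_numeric week

* The "within" row reports variation over time inside each film;
* a within Std. Dev. of 0 means -xtreg, fe- has nothing to work with
xtsum raverage av_expert

* Flag films in which each variable takes more than one value.
* Missing values sort last, so deal with them first in real data.
bysort title_numeric (raverage):  generate byte rav_varies = raverage[1] != raverage[_N]
bysort title_numeric (av_expert): generate byte exp_varies = av_expert[1] != av_expert[_N]
tabulate rav_varies exp_varies
```

If exp_varies is 0 for every film in the real data, nothing is wrong with av_expert as a variable; it simply has no within-film variation for the fixed-effects estimator to use.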
