  • Omitted because of collinearity.

    Hello everyone,

    I currently have a panel dataset with an ID variable and a Year variable. My data lack an annual unemployment rate variable, so I merge them with a separate dataset that contains it. An example of that separate dataset is as follows:
    Code:
    clear
    input Year un_rate  
    2000 0.0690
    2002 0.0565
    2003 0.0422
    2005 0.0963
    end
    I use an m:1 merge, with Year as the key variable, to merge my current data with that dataset.

    However, when I run a regression:
    Code:
    reg y x1 x2 i.year un_rate
    it returns the message "omitted because of collinearity." I include "i.year" because I want year dummies to capture the year fixed effect. I have seen papers that include both the annual unemployment rate and time dummies in one regression, so I do not understand why I have this problem. Does anyone know why Year and un_rate are collinear?

    Are there any solutions to this problem? Thank you so much in advance.
    Last edited by Duy To; 19 Mar 2023, 23:47.
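
    A minimal sketch of why the two terms can be collinear, assuming un_rate takes exactly one value per year after the merge. The data below are simulated for illustration only; the year and un_rate values are copied from the example above, and the number of observations is arbitrary. If the year dummies predict un_rate perfectly, un_rate carries no information beyond i.year.

    ```stata
    * Simulated data (hypothetical): one un_rate value per year
    clear
    set obs 400
    generate year = 2000
    replace year = 2002 if mod(_n, 4) == 1
    replace year = 2003 if mod(_n, 4) == 2
    replace year = 2005 if mod(_n, 4) == 3
    generate un_rate = 0.0690
    replace un_rate = 0.0565 if year == 2002
    replace un_rate = 0.0422 if year == 2003
    replace un_rate = 0.0963 if year == 2005

    * un_rate is a linear combination of the constant and the year dummies,
    * so this auxiliary regression should fit it exactly (R-squared of 1)
    regress un_rate i.year
    ```

    Running the same auxiliary regression on the real merged data is one way to check whether un_rate varies only by year.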

  • #2
    Have you tried centering one or both of these? Compute the mean of the variable's data (via, for example, summarize var, meanonly) and then subtract it, creating a new centered version of the variable.

    Try the unemployment rate first and see whether that eliminates the problem. (You might need to check also against your other two predictors.) If not, then try centering the year, too, but once you do, it will likely need to be included in the model as a continuous predictor.



    • #3
      Hello again everyone,

      I have tried a couple of things to see when it does not work.

      So basically, as long as one of the variables in the list is time-invariant (e.g., gender), and it is in the regression, it keeps returning "omitted because of collinearity." However, I still do not understand the reason behind this issue.



      • #4
        Hi Joseph, I have tried your suggestion, but the issue persists. :<



        • #5
          Hi, it's me, again!

          So, I have played around with everything, and this is what I got. If the command is like this:

          Code:
          reg y x1 x2 i.year un_rate // it does not work
          But, if the command is like this:

          Code:
          reg y x1 x2  un_rate i.year // it works
          Are the indicator variables like i.year supposed to be at the end of the command?
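
          A sketch of what the ordering may be doing, on simulated data (all variable values here are hypothetical). The assumption is that un_rate varies only by year. In that case both commands fit the same model: the order of the terms would only change which collinear term Stata flags as omitted (un_rate in the first, a year level in the second), not whether the collinearity exists.

          ```stata
          * Simulated data (hypothetical): un_rate is a function of year only
          clear
          set seed 2023
          set obs 400
          generate year = 2000 + mod(_n, 4)
          generate un_rate = cond(year == 2000, 0.0690, ///
                             cond(year == 2001, 0.0565, ///
                             cond(year == 2002, 0.0422, 0.0963)))
          generate x1 = rnormal()
          generate y  = 1 + 0.5*x1 + rnormal()

          * un_rate listed after the year dummies
          regress y x1 i.year un_rate
          display e(r2)

          * un_rate listed before the year dummies
          regress y x1 un_rate i.year
          display e(r2)
          ```

          The two R-squared values should be identical. If so, the un_rate coefficient in the second command is not identified separately from the year effects; it only reflects which year dummy was dropped.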



          • #6
            Duy:
            but if you actually have a panel dataset, why not go with -xtreg, fe-?
            Also, you used -regress-, but did you include -i.panelvar- as a predictor?
            Finally, I cannot replicate your problem:
            Code:
            . use "https://www.stata-press.com/data/r17/nlswork.dta"
            (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
            
            . reg ln_wage age i.year i.idcode if idcode<=3
            
                  Source |       SS           df       MS      Number of obs   =        39
            -------------+----------------------------------   F(17, 21)       =      2.68
                   Model |  3.54194923        17  .208349955   Prob > F        =    0.0171
                Residual |  1.63378973        21  .077799511   R-squared       =    0.6843
            -------------+----------------------------------   Adj R-squared   =    0.4288
                   Total |  5.17573896        38  .136203657   Root MSE        =    .27893
            
            ------------------------------------------------------------------------------
                 ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     age |   .3010572   .3561559     0.85   0.407    -.4396095    1.041724
                         |
                    year |
                     69  |  -.0920902   .5314565    -0.17   0.864    -1.197315    1.013134
                     70  |  -.8648493    .779214    -1.11   0.280    -2.485314    .7556149
                     71  |  -1.248506    1.09967    -1.14   0.269    -3.535396    1.038383
                     72  |   -1.39387   1.443494    -0.97   0.345    -4.395779     1.60804
                     73  |  -1.520276    1.79214    -0.85   0.406    -5.247236    2.206684
                     75  |  -2.049717   2.495803    -0.82   0.421    -7.240024     3.14059
                     77  |  -2.657565   3.203292    -0.83   0.416    -9.319175    4.004045
                     78  |  -2.751196   3.557758    -0.77   0.448    -10.14996    4.647567
                     80  |  -3.324016   4.267534    -0.78   0.445    -12.19884    5.550808
                     82  |  -4.027975   4.983977    -0.81   0.428    -14.39272    6.336774
                     83  |  -4.207353   5.333467    -0.79   0.439     -15.2989    6.884199
                     85  |  -4.730657   6.044586    -0.78   0.443    -17.30106    7.839747
                     87  |  -5.407995   6.755956    -0.80   0.432    -19.45777    8.641785
                     88  |  -5.901929   7.348904    -0.80   0.431    -21.18481    9.380954
                         |
                  idcode |
                      2  |  -.3898423     .11632    -3.35   0.003     -.631743   -.1479415
                      3  |  -2.247118   2.111457    -1.06   0.299    -6.638133    2.143897
                         |
                   _cons |  -2.882579   5.734884    -0.50   0.620    -14.80892    9.043766
            ------------------------------------------------------------------------------
            
            . reg ln_wage age i.idcode i.year if idcode<=3
            
                  Source |       SS           df       MS      Number of obs   =        39
            -------------+----------------------------------   F(17, 21)       =      2.68
                   Model |  3.54194923        17  .208349955   Prob > F        =    0.0171
                Residual |  1.63378973        21  .077799511   R-squared       =    0.6843
            -------------+----------------------------------   Adj R-squared   =    0.4288
                   Total |  5.17573896        38  .136203657   Root MSE        =    .27893
            
            ------------------------------------------------------------------------------
                 ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     age |   .3010572   .3561559     0.85   0.407    -.4396095    1.041724
                         |
                  idcode |
                      2  |  -.3898423     .11632    -3.35   0.003     -.631743   -.1479415
                      3  |  -2.247118   2.111457    -1.06   0.299    -6.638133    2.143897
                         |
                    year |
                     69  |  -.0920902   .5314565    -0.17   0.864    -1.197315    1.013134
                     70  |  -.8648493    .779214    -1.11   0.280    -2.485314    .7556149
                     71  |  -1.248506    1.09967    -1.14   0.269    -3.535396    1.038383
                     72  |   -1.39387   1.443494    -0.97   0.345    -4.395779     1.60804
                     73  |  -1.520276    1.79214    -0.85   0.406    -5.247236    2.206684
                     75  |  -2.049717   2.495803    -0.82   0.421    -7.240024     3.14059
                     77  |  -2.657565   3.203292    -0.83   0.416    -9.319175    4.004045
                     78  |  -2.751196   3.557758    -0.77   0.448    -10.14996    4.647567
                     80  |  -3.324016   4.267534    -0.78   0.445    -12.19884    5.550808
                     82  |  -4.027975   4.983977    -0.81   0.428    -14.39272    6.336774
                     83  |  -4.207353   5.333467    -0.79   0.439     -15.2989    6.884199
                     85  |  -4.730657   6.044586    -0.78   0.443    -17.30106    7.839747
                     87  |  -5.407995   6.755956    -0.80   0.432    -19.45777    8.641785
                     88  |  -5.901929   7.348904    -0.80   0.431    -21.18481    9.380954
                         |
                   _cons |  -2.882579   5.734884    -0.50   0.620    -14.80892    9.043766
            ------------------------------------------------------------------------------
            
            .
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Hi Carlo,

              Thank you for your reply. It is just my habit to go model by model, from the most basic one (e.g., -reg-) to the more advanced one (e.g., -xtreg-), to see whether anything abnormal shows up (e.g., coefficients with unreasonable signs). But everything is good now. Thank you again for your reply.



              • #8
                Duy:
                I see your point.
                But your -regress- code did not show a panel data regression with -fe- specification, as your -i.panelvar- was not included among the predictors.
                More substantively, I am happy to read that everything is fine now.
                It would also be interesting to see what the problem was and how you fixed it. Thanks.
                Kind regards,
                Carlo
                (Stata 19.0)
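
                For readers comparing the two approaches, a short sketch using the same nlswork data and the same idcode<=3 subsample as in #6. The -xtset- line is an assumption added so the example does not depend on the panel settings saved with the file. The coefficients on age and the year dummies from -xtreg, fe- should match those from -regress- with -i.idcode- included.

                ```stata
                * Same dataset and subsample as in #6
                use "https://www.stata-press.com/data/r17/nlswork.dta", clear
                xtset idcode year

                * Within (fixed-effects) estimator
                xtreg ln_wage age i.year if idcode <= 3, fe

                * Pooled OLS with panel dummies (least-squares dummy variables)
                regress ln_wage age i.year i.idcode if idcode <= 3
                ```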

