variable omitted because of collinearity but command corr showing no multicollinearity issue

akhzan fasti

Join Date: Jun 2023

Posts: 24
#1

05 Jun 2023, 20:55

Dear experts,
I kindly need your help.
I'm currently running a panel data regression using fixed effects (based on the Hausman test result), and one of my independent variables is omitted because of collinearity.
But when I check for multicollinearity using the -corr- command, there is no multicollinearity issue.
Why did Stata omit this variable from the model?

Regards

Last edited by akhzan fasti; 05 Jun 2023, 20:59.
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

05 Jun 2023, 21:29

With neither example data nor exact Stata output to examine, it is hard to be certain. But the most common situation where a variable is unexpectedly omitted in a fixed effects regression is if it is constant within panels. For example, if your panel is firm and you had a variable for industry, which never changes for the same firm, it would be omitted. A similar situation can arise if you ran a two-way fixed effects model and included both the time-period fixed effects and some other variable that is constant across panels within any time period. For example, if you had both time fixed effects and a variable that indicated times during which there was a recession, the recession variable or one of the time indicators would be omitted.
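
One quick way to check for the first situation is sketched below. It uses the nlswork example dataset that can be loaded with -webuse-, not the original poster's data, so the dataset and variable names are assumptions made purely for illustration.

```stata
* Illustrative sketch on an example dataset (nlswork), not the poster's data
webuse nlswork, clear
xtset idcode year

* -xtsum- splits variation into between and within components;
* a within Std. dev. of 0 means the variable never changes inside a panel
xtsum race ln_wage

* race is constant for each person, so the fixed-effects estimator
* cannot identify it and reports its indicators as omitted
xtreg ln_wage i.race age, fe
```

Running -xtsum- on the suspect predictor before fitting the model shows whether it has any within-panel variation left for -xtreg, fe- to use.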

If that is not the issue in your case, when posting back, show the exact regression command you gave, and the exact output it gave you, including any messages. Show them exactly as they were--use your computer's copy and paste functions to do that so that nothing is changed. Also post a data example which contains all of the variables necessary to run that regression, and which also exhibits the problem when it is run.

Use the -dataex- command to show the example data so that it can be used for testing and exploration. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Added: The -corr- command is not a useful tool for identifying this kind of problem. A colinear relationship involving many variables does not necessarily entail a suspiciously high correlation among any pair of them.
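
A minimal simulated sketch of that point follows. The variables a, b, c, d and total are hypothetical and generated only for this illustration; -_rmcoll- is the programmer's command Stata uses to detect collinear variables.

```stata
* Simulated illustration; all variable names are hypothetical
clear
set seed 12345
set obs 200
generate double a = rnormal()
generate double b = rnormal()
generate double c = rnormal()
generate double d = rnormal()

* total is an exact linear combination of the other four variables
generate double total = a + b + c + d

* Pairwise correlations with total are only moderate
* (about 0.5 in expectation), so nothing looks alarming here
corr a b c d total

* Yet the set as a whole is perfectly collinear, and one variable is dropped
_rmcoll a b c d total
```

The correlation matrix only examines variables two at a time, whereas the omission in a regression is triggered by an exact linear relationship among any number of them.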

Last edited by Clyde Schechter; 05 Jun 2023, 21:33.
akhzan fasti

Join Date: Jun 2023

Posts: 24
#3

06 Jun 2023, 01:03

Hi Clyde,
Thank you for the explanation; I really appreciate it.
In my case, I have 6 predictors. They are dummies representing different government policy measures.
One of these predictors was omitted by Stata because of collinearity. If the variable were omitted because it is constant within panels, then the other predictors would have to be omitted too. But they are not; only 1 predictor was omitted.
Here is an illustration of my data:

Attached Files

Last edited by akhzan fasti; 06 Jun 2023, 01:35.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

06 Jun 2023, 01:19

Akhzan:
as an aside to Clyde's excellent reply, it may be that the omitted predictor is simply collinear with another, not omitted, predictor.

Kind regards,
Carlo
(Stata 19.0)
akhzan fasti

Join Date: Jun 2023

Posts: 24
#5

06 Jun 2023, 01:59

Can I just ignore the omitted variable and then run -xtgls-, since my data also have a heteroskedasticity issue?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#6

06 Jun 2023, 02:06

Akhzan:
you're dealing with a T>N panel dataset.
Therefore, you should switch to -xtregar- or -xtgls-.

Kind regards,
Carlo
(Stata 19.0)
akhzan fasti

Join Date: Jun 2023

Posts: 24
#7

06 Jun 2023, 02:30

Dear Carlo,
Thank you very much for your quick response.
I've switched to -xtregar- and the variable is still omitted; the same happens with -xtgls-.
Can I just ignore it and use the output of -xtgls-, since I have a heteroskedasticity issue?

Regards,
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#8

06 Jun 2023, 02:54

Akhzan:
you may want to consider something along the following lines:

Code:

. xtgls invest market stock i.company i.time, panels(hetero)

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic
Correlation:   no autocorrelation

Estimated covariances      =         5          Number of obs     =        100
Estimated autocorrelations =         0          Number of groups  =          5
Estimated coefficients     =        26          Time periods      =         20
                                                Wald chi2(25)     =    1526.16
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      invest | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      market |   .1035231   .0194559     5.32   0.000     .0653903    .1416559
       stock |   .3743394   .0310829    12.04   0.000      .313418    .4352607
             |
     company |
          2  |   52.34137   72.41134     0.72   0.470    -89.58226     194.265
          3  |  -165.1098   50.58149    -3.26   0.001    -264.2477   -65.97186
          4  |   24.74629   72.91653     0.34   0.734    -118.1675    167.6601
          5  |   172.4509   52.39687     3.29   0.001     69.75491    275.1469
             |
        time |
          2  |   -22.1331   26.95746    -0.82   0.412    -74.96875    30.70255
          3  |   -47.3421   29.07938    -1.63   0.104    -104.3366    9.652437
          4  |  -35.10662   25.66363    -1.37   0.171    -85.40641    15.19317
          5  |  -57.57069   26.59627    -2.16   0.030    -109.6984   -5.442963
          6  |  -44.31086   27.09852    -1.64   0.102    -97.42299    8.801262
          7  |  -16.26337    26.4799    -0.61   0.539    -68.16301    35.63628
          8  |   -21.3159   25.68416    -0.83   0.407    -71.65592    29.02413
          9  |  -46.01592   26.29649    -1.75   0.080     -97.5561    5.524253
         10  |  -45.55398   26.56985    -1.71   0.086    -97.62993    6.521974
         11  |  -45.98728   27.67119    -1.66   0.097    -100.2218    8.247255
         12  |  -40.24119   28.20066    -1.43   0.154    -95.51347     15.0311
         13  |   -33.9678   26.24018    -1.29   0.195    -85.39761    17.46201
         14  |  -42.00009   26.63003    -1.58   0.115    -94.19399    10.19381
         15  |  -66.56644    26.6002    -2.50   0.012    -118.7019     -14.431
         16  |  -66.07868   27.01948    -2.45   0.014    -119.0359   -13.12147
         17  |  -42.73193   28.22123    -1.51   0.130    -98.04451    12.58066
         18  |  -59.31856   29.05539    -2.04   0.041    -116.2661   -2.371038
         19  |  -75.96848   32.51283    -2.34   0.019    -139.6925   -12.24451
         20  |  -99.60288   32.27268    -3.09   0.002    -162.8562   -36.34959
             |
       _cons |  -37.96412    78.1142    -0.49   0.627    -191.0651    115.1369
------------------------------------------------------------------------------

.

and then use -testparm- to test whether the categorical predictors are jointly significant.
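
A sketch of that step is given below. It assumes the output above was produced with the invest2 example dataset (the variable names match, but that is an assumption) and that the panel is declared with company and time.

```stata
* Assumed to be the dataset behind the output above (invest2)
webuse invest2, clear
xtset company time

* FGLS with panel-level heteroskedasticity, plus panel and time indicators
xtgls invest market stock i.company i.time, panels(hetero)

* Joint Wald tests of the two sets of indicators
testparm i.company
testparm i.time
```

Because -xtgls- reports z statistics, -testparm- after it reports a Wald chi2 statistic with Prob > chi2 rather than an F statistic.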

Kind regards,
Carlo
(Stata 19.0)

Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#9

06 Jun 2023, 09:55

Looking at the screenshot shown in #3, I observe that the variables x1 through x6 are dichotomous 0/1 variables that are mutually exclusive and exhaustive. That is, in each observation, there is exactly one of them equal to 1 and the others are all 0. So these are colinear because x1 + x2 + x3 + x4 + x5 + x6 = _cons. The fact that it was x5 that was omitted is happenstance. The colinearity affects all 6; one of them must be omitted, and it doesn't matter which one. Once that is done, effects of the remaining 5 must be interpreted as relative to the omitted category. This is no different from the way multi-level categorical variables are handled. If you have a variable that takes on three category levels, say, high, medium, and low, you would always have to represent that with two indicator variables, each of which then leads to an estimate of the effect of one of the categories relative to the unrepresented category.
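
The same pattern can be reproduced on the auto dataset shipped with Stata, as in the sketch below. The indicators rep1 through rep5 and the helper variable nset are hypothetical names that stand in for x1 through x6; this is an illustration under those assumptions, not a run on the data from #3.

```stata
* Illustration on the shipped auto dataset; names are stand-ins for x1-x6
sysuse auto, clear

* Build mutually exclusive 0/1 indicators rep1-rep5 from rep78
tabulate rep78, generate(rep)

* Exactly one indicator equals 1 wherever rep78 is observed
egen byte nset = rowtotal(rep1-rep5)
assert nset == 1 if !missing(rep78)

* With all five indicators plus the constant, one is omitted
regress price rep1-rep5

* Factor-variable notation drops a base level by design
regress price i.rep78

* The base (reference) category can be chosen explicitly
regress price ib5.rep78
```

Changing the base level changes which category the coefficients are measured against, but not the fit of the model.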

Please re-read my advice in #2 about how to show example data, and also read the Forum FAQ, with emphasis on #12. Screenshots are strongly discouraged here. They are often unreadable, and even when readable usually do not provide enough information to resolve problems, and, in any case, cannot be imported to Stata so that solutions can be tried out and debugged. In your case, the screenshot was readable, and the information it contained was sufficient to suggest the source of the problem. But this will seldom be the case, and in the future, posting screenshots will only diminish your chances of getting timely and helpful responses the first time you raise a question, without having to go back and repost with more information.
akhzan fasti

Join Date: Jun 2023

Posts: 24
#10

09 Jun 2023, 21:38

Dear Clyde,
Thank you very much; I really appreciate it.

Regards,
akhzan fasti

Join Date: Jun 2023

Posts: 24
#11

09 Jun 2023, 21:42

Dear Carlo,
Thank you for your suggestion; I really appreciate it.

Regards,
akhzan fasti

Join Date: Jun 2023

Posts: 24
#12

19 Jun 2023, 06:10

Originally posted by Carlo Lazzaro:

Akhzan:
you're dealing with a T>N panel dataset.
Therefore, you should switch to -xtregar- or -xtgls-.

Dear Clyde and Carlo,
I kindly need your help.

The procedure I followed before was:
xtset shares time
xtreg Y X1 X2 X3 X4 X5 X6 C1 C2 C3 C4 C5, fe
estimates store fe
xtreg Y X1 X2 X3 X4 X5 X6 C1 C2 C3 C4 C5
estimates store re
hausman fe re

The result: fixed effects.

Then I used -corr- and -xttest3- to check for multicollinearity and heteroskedasticity.
The output: multicollinearity: no; heteroskedasticity: yes.

Then I used -xtgls- to address the heteroskedasticity issue:
xtgls Y X1 X2 X3 X4 X5 X6 C1 C2 C3 C4 C5

-finish-

I used the output from -xtgls- to analyze whether the categorical predictors are individually significant, and used the coefficient values to analyze the direction of the relationship.
I used the output from -xtreg, fe- to analyze whether the categorical predictors are jointly significant.

Could you please give me your opinion on the procedure above? Was it appropriate?

Thank you very much
Best Regards,
Akhzan
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#13

19 Jun 2023, 07:44

Akhzan:
since you're dealing with a T>N panel dataset, -xtreg- is not a viable option.

Kind regards,
Carlo
(Stata 19.0)
akhzan fasti

Join Date: Jun 2023

Posts: 24
#14

19 Jun 2023, 08:06

Carlo:
OK Carlo, thank you.
I tried to follow your suggestion and used -xtgls y x i.firm i.time, panels(hetero)-,
and then I used -testparm-, but the output is as below:

. testparm turn psbb psbbt ppkm ppkmm ppkmd vaccine c1 c2 c3 c4 c5_

( 1) psbb = 0
( 2) psbbt = 0
( 3) ppkm = 0
( 4) ppkmm = 0
( 5) vaccine = 0
( 6) c1 = 0
( 7) c2 = 0
( 8) c3 = 0
( 9) c4 = 0
(10) c5_ = 0

chi2( 10) = 1975.24
Prob > chi2 = 0.0000

I thought the output should be something like: Prob > F = ....
I kindly need your assistance.

In addition, may I know the purpose of adding i.firm and i.time to the command, please? The P>|z| values of my predictors (psbb, psbbt, ppkm, ppkmm, ppkmd, vaccine, c1, c2, c3, c4, and c5) differ with and without i.firm and i.time.
I just want to analyze the significance of my predictor variables, not at the level of detail of every firm on a specific date. But I am confused, because if I take out i.firm and i.time, the P>|z| values change.

Thank you so much
Best Regards,
Fasti
akhzan fasti

Join Date: Jun 2023

Posts: 24
#15

19 Jun 2023, 09:22

Do you think something is wrong with my data? Perhaps it is because the predictors are coded as dummies.