Good morning,
I use Stata 13, I believe.
I have a gravity model, and I am estimating the effects of free trade agreements (FTAs) on trade/exports between the US states and the rest of the world (ROW). My independent variables include GDP in the exporting and importing countries, population (both partners, Exp and Imp), production (both partners, Exp and Imp), farm income by state, distance... and the FTAs (NAFTA, ASEAN...). There are two dummies for each FTA, where mm is a subscript for trade between members and mn is for trade from a state to the ROW.
I am really interested in the NAFTA signs here: the expected signs are NAFTAmm > 0 and NAFTAmn < 0.
I am using both static estimators (OLS, Fixed Effects, Two State gravity) and GMM. For the issue I am facing, I am only going to show the commands and output of the OLS regressions.
I first ran a model with only the NAFTA dummies and obtained the expected signs. When I add all the other independent variables, in every estimation from OLS to GMM, both NAFTA coefficients are negative (see output below).
Why does NAFTAmm go from positive to negative once the other variables are included? Is this a collinearity problem (continued below)?
1- OLS with NAFTA only
Code:
reg lnStateExports NAFTAmm NAFTAmn, robust

Linear regression                                      Number of obs =  232755
                                                       F(  2,232752) =  171.17
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0019
                                                       Root MSE      =   2.264

------------------------------------------------------------------------------
             |               Robust
lnStateExp~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     NAFTAmm |   .0694875   .0399313     1.74   0.082    -.0087768    .1477517
     NAFTAmn |  -.4145053    .022455   -18.46   0.000    -.4585166   -.3704941
       _cons |   13.52012   .0219329   616.43   0.000     13.47714    13.56311
------------------------------------------------------------------------------
2- OLS with NAFTA and the other independent variables
Code:
reg lnStateExports NAFTAmm NAFTAmn lnExpGDP lnImpGDP lnExpPop lnImpPop lnExpProd lnImpProd lnFarmInc lnDistance BorderDummy, robust

Linear regression                                      Number of obs =   46477
                                                       F( 11, 46465) = 1497.94
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2393
                                                       Root MSE      =  1.9851

------------------------------------------------------------------------------
             |               Robust
lnStateExp~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     NAFTAmm |   -1.46886   .1077491   -13.63   0.000     -1.68005    -1.25767
     NAFTAmn |  -.0150688   .0482547    -0.31   0.755    -.1096486    .0795111
    lnExpGDP |  -.1744397   .0581043    -3.00   0.003     -.288325   -.0605544
    lnImpGDP |   .4816379   .0085027    56.65   0.000     .4649725    .4983034
    lnExpPop |    .877131   .0642196    13.66   0.000     .7512596    1.003002
    lnImpPop |   .1129621   .0124843     9.05   0.000     .0884926    .1374315
   lnExpProd |  -.3711653   .0177531   -20.91   0.000    -.4059615    -.336369
   lnImpProd |  -.0925821   .0098372    -9.41   0.000    -.1118632   -.0733009
   lnFarmInc |   .6124517   .0176502    34.70   0.000      .577857    .6470464
  lnDistance |  -.3987482   .0200808   -19.86   0.000    -.4381069   -.3593895
 BorderDummy |   2.299835   .2741551     8.39   0.000     1.762487    2.837183
       _cons |  -10.49608   .3624592   -28.96   0.000     -11.2065   -9.785652
------------------------------------------------------------------------------
I then ran some multicollinearity diagnostics; the VIF output is below.
Code:
vif

    Variable |       VIF       1/VIF
-------------+----------------------
    lnExpPop |     37.17    0.026906    <- high multicollinearity
    lnExpGDP |     35.37    0.028275    <- high multicollinearity
   lnImpProd |      4.53    0.220672
    lnImpPop |      3.98    0.251494
   lnFarmInc |      3.72    0.268734
   lnExpProd |      3.69    0.270983
    lnImpGDP |      2.37    0.421065
  lnDistance |      1.96    0.511224
      ASEANm |      1.76    0.567868
     NAFTAmn |      1.56    0.640402
     NAFTAmm |      1.25    0.802299
  MERCOSURmn |      1.13    0.881311
-------------+----------------------
    Mean VIF |      8.21
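To make the VIF numbers easier to read, here is a short worked calculation. It assumes the usual textbook definition of the variance inflation factor, where R_j^2 is the R-squared from regressing regressor j on all the other regressors in the model; the 1/VIF column is then simply 1 - R_j^2. This is only my own arithmetic on the table above, not additional Stata output.

```latex
\begin{align*}
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2}
\quad\Longleftrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j} \\[4pt]
% lnExpPop: 1/VIF = 0.026906
R_{\mathrm{lnExpPop}}^2 &= 1 - 0.026906 = 0.973094 \\
% lnExpGDP: 1/VIF = 0.028275
R_{\mathrm{lnExpGDP}}^2 &= 1 - 0.028275 = 0.971725
\end{align*}
```

Read this way, about 97% of the variation in each of lnExpPop and lnExpGDP is accounted for by the other regressors, whereas the two NAFTA dummies have 1/VIF values of 0.64 and 0.80.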
I then removed the two variables flagged above (lnExpPop and lnExpGDP) just to see what would happen. All of these variables are really important here, so I do not want to drop them. The regression output seems OK. I also computed the correlation matrix, which gives the output below.
Is there a way to solve this issue without having to remove any of my variables?
Could you please tell me whether there are other regression diagnostics I should run instead of focusing only on high collinearity? Thanks for your help.
Code:
* running regression without lnExpPop and lnExpGDP
. reg lnStateExports NAFTAmm NAFTAmn lnImpGDP lnImpPop lnExpProd lnImpProd lnFarmInc lnDistance ASEANm MERCOSURmn, robust

Linear regression                                      Number of obs =   46477
                                                       F( 10, 46466) = 1005.83
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1950
                                                       Root MSE      =   2.042

------------------------------------------------------------------------------
             |               Robust
lnStateExp~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     NAFTAmm |    -1.5044   .1081805   -13.91   0.000    -1.716436   -1.292365
     NAFTAmn |  -.3608108    .048148    -7.49   0.000    -.4551817     -.26644
    lnImpGDP |   .4777262   .0088465    54.00   0.000     .4603869    .4950655
    lnImpPop |   .0328538    .014152     2.32   0.020     .0051156     .060592
......

vif

    Variable |       VIF       1/VIF
-------------+----------------------
   lnImpProd |      4.53    0.220731
    lnImpPop |      3.98    0.251540
   lnExpProd |      3.33    0.300253
   lnFarmInc |      3.01    0.331741
    lnImpGDP |      2.37    0.422257
  lnDistance |      1.95    0.513197
      ASEANm |      1.76    0.568594
     NAFTAmn |      1.45    0.688263
     NAFTAmm |      1.25    0.802492
  MERCOSURmn |      1.13    0.881512
-------------+----------------------
    Mean VIF |      2.48
Code:
. pwcorr lnStateExports NAFTAmm NAFTAmn lnExpGDP lnImpGDP lnExpPop lnImpPop lnExpProd lnImpProd lnFarmInc lnDistance ASEANm MERCOSURmn

             | lnStat..  NAFTAmm  NAFTAmn lnExpGDP lnImpGDP lnExpPop lnImpPop
-------------+---------------------------------------------------------------
lnStateExp~s |   1.0000
     NAFTAmm |   0.0029   1.0000
     NAFTAmn |  -0.0431   0.0497   1.0000
    lnExpGDP |   0.2308   0.0097   0.1563   1.0000
    lnImpGDP |  -0.0581  -0.0556  -0.3479  -0.0943   1.0000
    lnExpPop |   0.1712  -0.0143  -0.1683   0.4509  -0.0825   1.0000
    lnImpPop |   0.1491   0.1121  -0.0085  -0.0397   0.5758  -0.0197   1.0000
   lnExpProd |   0.1366  -0.0053   0.1189   0.5315   0.2612   0.1680  -0.0250
   lnImpProd |  -0.1147  -0.0573  -0.2563  -0.0496   0.9835  -0.0531   0.7958
   lnFarmInc |   0.2375  -0.0009  -0.0101   0.4262  -0.0502   0.2373  -0.0137
  lnDistance |   0.0508  -0.2313   0.0126  -0.0374   0.0361  -0.0254   0.3061
      ASEANm |   0.0590  -0.0601  -0.0097  -0.0306   0.1801  -0.0218   0.1661
  MERCOSURmn |  -0.0604  -0.0638   0.0002   0.0234   0.0070   0.0165  -0.0144

             | lnExpP~d lnImpP~d lnFarm~c lnDist~e   ASEANm MERCOS~n
-------------+------------------------------------------------------
   lnExpProd |   1.0000
   lnImpProd |   0.3098   1.0000
   lnFarmInc |   0.7343  -0.0186   1.0000
  lnDistance |   0.0124  -0.0258   0.0450   1.0000
      ASEANm |   0.0059   0.0586  -0.0119   0.3608   1.0000
  MERCOSURmn |   0.0094   0.0217   0.0126  -0.1854  -0.1162   1.0000