different results using stata14 and stata15

Bailu Liu

Join Date: May 2020

Posts: 2
#1

26 May 2020, 19:31

Hi all,
I'm running a pooled OLS regression with the following command:
xi: reg lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w i.indu1 i.year1, cluster(stkcd)
When I run the same do-file on the same data in Stata 14 and Stata 15, the result for the variable "post" (a dummy coded 1 for years after 2013) differs, while the results for the other variables are the same.
Moreover, _Iyear1_2014 is omitted because of collinearity in Stata 14, while _Iyear1_2015 is omitted because of collinearity in Stata 15.
I wonder why the results differ between Stata 14 and Stata 15. Which result is correct?

Below is the result using Stata 14.

i.indu1           _Iindu1_1-17        (_Iindu1_1 for indu1==A omitted)
i.year1           _Iyear1_2009-2015   (naturally coded; _Iyear1_2009 omitted)
note: _Iyear1_2014 omitted because of collinearity

Linear regression                                Number of obs     =      9,217
                                                 F(37, 2110)       =       9.43
                                                 Prob > F          =     0.0000
                                                 R-squared         =     0.0531
                                                 Root MSE          =     .29529

                               (Std. Err. adjusted for 2,111 clusters in stkcd)
-------------------------------------------------------------------------------
              |              Robust
 lnnumber_d_w |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         post |   .0668674    .013085     5.11   0.000     .0412067    .0925282
          soe |  -.0110271   .0093917    -1.17   0.240    -.0294451    .0073909
      postsoe |  -.0894927   .0123459    -7.25   0.000    -.1137042   -.0652813
    ownership |  -.0013779   .0211927    -0.07   0.948    -.0429386    .0401829
      asset_w |    .006405   .0035147     1.82   0.069    -.0004877    .0132978
   leverage_w |   .0308835   .0220983     1.40   0.162    -.0124533    .0742203
   cashhold_w |   .0648878   .0279293     2.32   0.020     .0101159    .1196596
    _Iindu1_2 |   .0836318   .0283376     2.95   0.003     .0280593    .1392043
    _Iindu1_3 |   .0384154   .0183909     2.09   0.037     .0023491    .0744817
    _Iindu1_4 |   .1246017   .0270419     4.61   0.000       .07157    .1776333
    _Iindu1_5 |   .0585011   .0269362     2.17   0.030     .0056767    .1113255
    _Iindu1_6 |    .041658   .0221213     1.88   0.060    -.0017239    .0850399
    _Iindu1_7 |   .0586802   .0252648     2.32   0.020     .0091338    .1082267
    _Iindu1_8 |   .0196253   .0339283     0.58   0.563    -.0469111    .0861616
    _Iindu1_9 |   .0792681   .0267385     2.96   0.003     .0268317    .1317046
   _Iindu1_10 |   .0433499   .0244094     1.78   0.076    -.0045192     .091219
   _Iindu1_11 |   .0640976   .0376795     1.70   0.089    -.0097953    .1379904
   _Iindu1_12 |   .0694245   .0631215     1.10   0.272    -.0543623    .1932113
   _Iindu1_13 |    .064119   .0429139     1.49   0.135     -.020039     .148277
   _Iindu1_14 |   .0542156   .1049017     0.52   0.605     -.151506    .2599372
   _Iindu1_15 |   .0503639   .0456947     1.10   0.271    -.0392474    .1399752
   _Iindu1_16 |  -.0219435   .0351372    -0.62   0.532    -.0908506    .0469636
   _Iindu1_17 |    .087508   .0390906     2.24   0.025     .0108478    .1641682
 _Iyear1_2010 |   .0163025   .0112808     1.45   0.149    -.0058202    .0384251
 _Iyear1_2011 |   .0006867   .0113649     0.06   0.952    -.0216008    .0229743
 _Iyear1_2012 |   .0056306   .0108877     0.52   0.605    -.0157211    .0269823
 _Iyear1_2014 |          0  (omitted)
 _Iyear1_2015 |   .0566508   .0103356     5.48   0.000     .0363817    .0769199
        _cons |   .0145059   .0758323     0.19   0.848    -.1342079    .1632198
-------------------------------------------------------------------------------

Here is the result using Stata 15.

i.indu1           _Iindu1_1-17        (_Iindu1_1 for indu1==A omitted)
i.year1           _Iyear1_2009-2015   (naturally coded; _Iyear1_2009 omitted)
note: _Iyear1_2015 omitted because of collinearity

Linear regression                               Number of obs     =      9,217
                                                F(37, 2110)       =       9.78
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0547
                                                Root MSE          =     .29822

                              (Std. Err. adjusted for 2,111 clusters in stkcd)
------------------------------------------------------------------------------
             |              Robust
  lnnumber_w |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |   .1268291   .0145373     8.72   0.000     .0983201    .1553382
         soe |  -.0130607    .009503    -1.37   0.169    -.0316969    .0055755
     postsoe |   -.092008   .0125001    -7.36   0.000    -.1165218   -.0674942
   ownership |   .0006562   .0214245     0.03   0.976    -.0413591    .0426715
     asset_w |   .0072593   .0035555     2.04   0.041     .0002866    .0142319
  leverage_w |   .0291227   .0222048     1.31   0.190    -.0144228    .0726682
  cashhold_w |   .0629849    .028271     2.23   0.026     .0075428    .1184269
   _Iindu1_2 |   .0877501    .029679     2.96   0.003      .029547    .1459532
   _Iindu1_3 |   .0369433   .0198711     1.86   0.063    -.0020256    .0759122
   _Iindu1_4 |    .121166   .0280556     4.32   0.000     .0661465    .1761856
   _Iindu1_5 |   .0596673   .0284525     2.10   0.036     .0038694    .1154652
   _Iindu1_6 |   .0381356   .0234042     1.63   0.103    -.0077621    .0840333
   _Iindu1_7 |   .0564075   .0265541     2.12   0.034     .0043325    .1084824
   _Iindu1_8 |   .0167578   .0344558     0.49   0.627    -.0508132    .0843287
   _Iindu1_9 |    .075899    .028062     2.70   0.007      .020867    .1309311
  _Iindu1_10 |   .0395524   .0254455     1.55   0.120    -.0103484    .0894532
  _Iindu1_11 |   .0604579   .0384397     1.57   0.116    -.0149258    .1358415
  _Iindu1_12 |   .0650782   .0635301     1.02   0.306      -.05951    .1896665
  _Iindu1_13 |   .0599678   .0435654     1.38   0.169    -.0254679    .1454035
  _Iindu1_14 |   .0511799   .1055605     0.48   0.628    -.1558336    .2581934
  _Iindu1_15 |   .0446924   .0459836     0.97   0.331    -.0454855    .1348703
  _Iindu1_16 |  -.0268586   .0359398    -0.75   0.455    -.0973397    .0436225
  _Iindu1_17 |    .083435   .0398149     2.10   0.036     .0053545    .1615155
_Iyear1_2010 |   .0162149   .0112807     1.44   0.151    -.0059076    .0383374
_Iyear1_2011 |   .0019031   .0114849     0.17   0.868    -.0206197     .024426
_Iyear1_2012 |   .0066134   .0109773     0.60   0.547     -.014914    .0281409
_Iyear1_2014 |  -.0564731    .010469    -5.39   0.000    -.0770038   -.0359424
_Iyear1_2015 |          0  (omitted)
       _cons |   .0123937   .0771134     0.16   0.872    -.1388326    .1636199
------------------------------------------------------------------------------

Last edited by Bailu Liu; 26 May 2020, 19:43.
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#2

26 May 2020, 20:42

Which result is correct?

I wouldn't use either result until the cause of the difference is identified.

_Iyear1_2014 is omitted because of collinearity in Stata 14, while _Iyear1_2015 is omitted because of collinearity in Stata 15.

You can force a particular category to be dropped in such cases. Follow the note "Summary of controlling the omitted dummy" in the help for xi.

Code:

char year1[omit] 2014
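
xi reads the omit characteristic from the variable named in the i. term, which in the regression in post #1 is year1. A minimal sketch for setting and checking it, assuming the data from post #1 are in memory, that year1 is numeric (xi reports it as "naturally coded"), and that post depends only on the year:

```stata
* Sketch only: variable names are taken from the command in post #1.

* Set the omitted (base) category on the variable used in i.year1
char year1[omit] 2014

* Confirm that the characteristic is stored on year1
char list year1[]

* post is described as 1 after 2013; a cross-tabulation shows
* which years, and hence which year dummies, it overlaps with
tabulate year1 post

* Re-run the original command; xi now treats 2014 as the omitted year
xi: reg lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w i.indu1 i.year1, cluster(stkcd)
```

If post is still an exact linear combination of the remaining year dummies and the constant, regress may drop a further term even after the base year is fixed, so the note printed above the output is worth reading in both versions.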

the result for the variable "post"... differs, while the results for the other variables are the same.

No, they are different for every variable. Stata's precision does not stop at the third decimal place; for regress at least, it goes much deeper. So something else is driving the differences.
1. Re-run a simpler model (e.g. without xi at all, just the slopes, no dummies). Make sure the results are the same.
2. Add one set of dummies. Check again.
3. If the problem is with xi, generate the dummies first with a separate xi command, then specify directly which dummies to include (you don't have an astronomical number of them).
4. Check whether the results are stable within each version (14 or 15) as well. If they wobble too, then it is likely not the version of Stata that matters but your preparatory code. Make sure you are running your code on the same sample: save the data immediately before the regression in v14 and run the regression in v15 directly on that saved file, rather than relying on 10 pages of preparatory code that you think should produce the same sample.
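
The checks above can be sketched as a pair of do-file fragments. This is only an illustration under assumptions: the variable names come from the command in post #1, the file name reg_sample.dta is a hypothetical placeholder, and leaving out _Iyear1_2015 in step 3 is an arbitrary choice. The two output tables in post #1 also show different dependent-variable names in their headers (lnnumber_d_w and lnnumber_w), so displaying e(depvar) after each run is a cheap extra check.

```stata
* ---- Part A: run once in Stata 14, immediately before the regression ----
datasignature                       // prints a checksum of the data in memory
save "reg_sample.dta", replace      // hypothetical file name

* ---- Part B: run in BOTH Stata 14 and Stata 15 ----
use "reg_sample.dta", clear
datasignature                       // should print the same signature in both versions

* Step 1: slopes only, no xi and no dummies
regress lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w, cluster(stkcd)
display "`e(depvar)'"               // which dependent variable was actually used
display e(N)                        // estimation sample size
display %20.15g _b[post]            // coefficient at full precision

* Step 2: add one set of dummies
xi: regress lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w i.year1, cluster(stkcd)

* Step 3: create the dummies separately, then list them explicitly.
* _Iyear1_2015 is left out by hand here, so the choice of the dropped
* year no longer depends on the Stata version.
xi i.indu1 i.year1
regress lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w ///
    _Iindu1_* _Iyear1_2010 _Iyear1_2011 _Iyear1_2012 _Iyear1_2014, cluster(stkcd)
display %20.15g _b[post]
```

Comparing the displayed values from Part B across the two versions, step by step, should show at which step the results start to diverge, if they diverge at all on an identical file.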

Best, Sergiy