  • different results using stata14 and stata15

    Hi all
    I'm running a pooled OLS regression with the following command.
    xi: reg lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w i.indu1 i.year1, cluster(stkcd)
    When I run the same do-file with the same data in Stata 14 and Stata 15, the result for the variable "post" (a dummy variable coded 1 after 2013) is different, while the other variables are the same.
    Moreover, _Iyear1_2014 is omitted because of collinearity in Stata 14, while _Iyear1_2015 is omitted because of collinearity in Stata 15.
    Why are the results different in Stata 14 and 15? Which result is correct?


    Below is the result using STATA14.

    i.indu1 _Iindu1_1-17 (_Iindu1_1 for indu1==A omitted)
    i.year1 _Iyear1_2009-2015 (naturally coded; _Iyear1_2009 omitted)
    note: _Iyear1_2014 omitted because of collinearity

    Linear regression                                Number of obs     =      9,217
                                                     F(37, 2110)       =       9.43
                                                     Prob > F          =     0.0000
                                                     R-squared         =     0.0531
                                                     Root MSE          =     .29529

                                   (Std. Err. adjusted for 2,111 clusters in stkcd)
    -------------------------------------------------------------------------------
                  |               Robust
     lnnumber_d_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
             post |   .0668674    .013085     5.11   0.000     .0412067    .0925282
              soe |  -.0110271   .0093917    -1.17   0.240    -.0294451    .0073909
          postsoe |  -.0894927   .0123459    -7.25   0.000    -.1137042   -.0652813
        ownership |  -.0013779   .0211927    -0.07   0.948    -.0429386    .0401829
          asset_w |    .006405   .0035147     1.82   0.069    -.0004877    .0132978
       leverage_w |   .0308835   .0220983     1.40   0.162    -.0124533    .0742203
       cashhold_w |   .0648878   .0279293     2.32   0.020     .0101159    .1196596
        _Iindu1_2 |   .0836318   .0283376     2.95   0.003     .0280593    .1392043
        _Iindu1_3 |   .0384154   .0183909     2.09   0.037     .0023491    .0744817
        _Iindu1_4 |   .1246017   .0270419     4.61   0.000       .07157    .1776333
        _Iindu1_5 |   .0585011   .0269362     2.17   0.030     .0056767    .1113255
        _Iindu1_6 |    .041658   .0221213     1.88   0.060    -.0017239    .0850399
        _Iindu1_7 |   .0586802   .0252648     2.32   0.020     .0091338    .1082267
        _Iindu1_8 |   .0196253   .0339283     0.58   0.563    -.0469111    .0861616
        _Iindu1_9 |   .0792681   .0267385     2.96   0.003     .0268317    .1317046
       _Iindu1_10 |   .0433499   .0244094     1.78   0.076    -.0045192     .091219
       _Iindu1_11 |   .0640976   .0376795     1.70   0.089    -.0097953    .1379904
       _Iindu1_12 |   .0694245   .0631215     1.10   0.272    -.0543623    .1932113
       _Iindu1_13 |    .064119   .0429139     1.49   0.135     -.020039     .148277
       _Iindu1_14 |   .0542156   .1049017     0.52   0.605     -.151506    .2599372
       _Iindu1_15 |   .0503639   .0456947     1.10   0.271    -.0392474    .1399752
       _Iindu1_16 |  -.0219435   .0351372    -0.62   0.532    -.0908506    .0469636
       _Iindu1_17 |    .087508   .0390906     2.24   0.025     .0108478    .1641682
     _Iyear1_2010 |   .0163025   .0112808     1.45   0.149    -.0058202    .0384251
     _Iyear1_2011 |   .0006867   .0113649     0.06   0.952    -.0216008    .0229743
     _Iyear1_2012 |   .0056306   .0108877     0.52   0.605    -.0157211    .0269823
     _Iyear1_2014 |          0  (omitted)
     _Iyear1_2015 |   .0566508   .0103356     5.48   0.000     .0363817    .0769199
            _cons |   .0145059   .0758323     0.19   0.848    -.1342079    .1632198
    -------------------------------------------------------------------------------

    Here is the result using STATA15.

    i.indu1 _Iindu1_1-17 (_Iindu1_1 for indu1==A omitted)
    i.year1 _Iyear1_2009-2015 (naturally coded; _Iyear1_2009 omitted)
    note: _Iyear1_2015 omitted because of collinearity

    Linear regression                               Number of obs     =      9,217
                                                    F(37, 2110)       =       9.78
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0547
                                                    Root MSE          =     .29822

                                  (Std. Err. adjusted for 2,111 clusters in stkcd)
    ------------------------------------------------------------------------------
                 |               Robust
      lnnumber_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            post |   .1268291   .0145373     8.72   0.000     .0983201    .1553382
             soe |  -.0130607    .009503    -1.37   0.169    -.0316969    .0055755
         postsoe |   -.092008   .0125001    -7.36   0.000    -.1165218   -.0674942
       ownership |   .0006562   .0214245     0.03   0.976    -.0413591    .0426715
         asset_w |   .0072593   .0035555     2.04   0.041     .0002866    .0142319
      leverage_w |   .0291227   .0222048     1.31   0.190    -.0144228    .0726682
      cashhold_w |   .0629849    .028271     2.23   0.026     .0075428    .1184269
       _Iindu1_2 |   .0877501    .029679     2.96   0.003      .029547    .1459532
       _Iindu1_3 |   .0369433   .0198711     1.86   0.063    -.0020256    .0759122
       _Iindu1_4 |    .121166   .0280556     4.32   0.000     .0661465    .1761856
       _Iindu1_5 |   .0596673   .0284525     2.10   0.036     .0038694    .1154652
       _Iindu1_6 |   .0381356   .0234042     1.63   0.103    -.0077621    .0840333
       _Iindu1_7 |   .0564075   .0265541     2.12   0.034     .0043325    .1084824
       _Iindu1_8 |   .0167578   .0344558     0.49   0.627    -.0508132    .0843287
       _Iindu1_9 |    .075899    .028062     2.70   0.007      .020867    .1309311
      _Iindu1_10 |   .0395524   .0254455     1.55   0.120    -.0103484    .0894532
      _Iindu1_11 |   .0604579   .0384397     1.57   0.116    -.0149258    .1358415
      _Iindu1_12 |   .0650782   .0635301     1.02   0.306      -.05951    .1896665
      _Iindu1_13 |   .0599678   .0435654     1.38   0.169    -.0254679    .1454035
      _Iindu1_14 |   .0511799   .1055605     0.48   0.628    -.1558336    .2581934
      _Iindu1_15 |   .0446924   .0459836     0.97   0.331    -.0454855    .1348703
      _Iindu1_16 |  -.0268586   .0359398    -0.75   0.455    -.0973397    .0436225
      _Iindu1_17 |    .083435   .0398149     2.10   0.036     .0053545    .1615155
    _Iyear1_2010 |   .0162149   .0112807     1.44   0.151    -.0059076    .0383374
    _Iyear1_2011 |   .0019031   .0114849     0.17   0.868    -.0206197     .024426
    _Iyear1_2012 |   .0066134   .0109773     0.60   0.547     -.014914    .0281409
    _Iyear1_2014 |  -.0564731    .010469    -5.39   0.000    -.0770038   -.0359424
    _Iyear1_2015 |          0  (omitted)
           _cons |   .0123937   .0771134     0.16   0.872    -.1388326    .1636199
    ------------------------------------------------------------------------------
    Last edited by Bailu Liu; 26 May 2020, 19:43.

  • #2
    Which result is correct?
    I wouldn't use either result until the cause of the difference is identified.
    _Iyear1_2014 is omitted because of collinearity in Stata 14, while _Iyear1_2015 is omitted because of collinearity in Stata 15.
    You can enforce a certain category to be dropped in such cases. Follow the note "Summary of controlling the omitted dummy" in help for xi.
    Code:
    char year1[omit] 2014
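
    A minimal sketch of how that setting might be used with the command from the first post, assuming the year variable is year1 (the name in the original regression) and that the code is run from a do-file, since it uses /// line continuation. The characteristic only changes which category xi treats as the base. If post is an exact linear combination of the remaining year dummies and the constant, regress may still drop a further term, so the collinearity note in the output is worth reading after the run.

    ```stata
    * Assumption: year1 is the year variable from the original xi: reg command
    char year1[omit] 2014          // make 2014 the base category for i.year1
    xi: reg lnnumber_d_w post soe postsoe ownership asset_w leverage_w cashhold_w ///
        i.indu1 i.year1, cluster(stkcd)
    char year1[omit]               // clear the characteristic afterwards
    ```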
    the result for variable "post"... is different while other variables are same.
    No, they differ for every variable. Stata's precision does not stop at the third digit after the decimal point; for regress, at least, it goes much deeper. So something else is driving the differences.
    1. Re-run a simpler model (e.g. without xi at all, just the slopes, no dummies). Make sure the results are the same.
    2. Add one set of dummies. Check.
    3. If the problem is with xi, try generating the dummies first with a separate xi command, then specify directly which dummies to include (you don't have an astronomical number of them).
    4. Check whether the results are stable within each version (14 or 15) as well. If they wobble too, then it is likely not the version of Stata that matters but your preparatory code. Make sure you are running your code on the same sample: save the data immediately before the regression in v14, then run the regression in v15 directly on that saved file, rather than relying on 10 pages of preparatory code that you think should produce the same sample.
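
    The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the file name reg_sample.dta and the local macro names are hypothetical, the variable names are taken from the command in the first post, and the choice of _Iyear1_2014 as the dummy to leave out is just an example. Note also that the two outputs in the first post show different dependent variables in their headers (lnnumber_d_w and lnnumber_w), so confirming that the command line itself is identical in both versions belongs to the same check.

    ```stata
    * --- In Stata 14, immediately before the regression ---
    datasignature                          // fingerprint of the data in memory
    save "reg_sample.dta", replace         // hypothetical file name

    * --- In Stata 15, on exactly that file ---
    use "reg_sample.dta", clear
    datasignature                          // should match the value reported in Stata 14

    * Step 1: slopes only, no xi and no dummies
    local slopes post soe postsoe ownership asset_w leverage_w cashhold_w
    reg lnnumber_d_w `slopes', cluster(stkcd)

    * Step 2: add one set of dummies
    xi: reg lnnumber_d_w `slopes' i.indu1, cluster(stkcd)

    * Step 3: create the dummies once, then list explicitly which ones enter
    xi i.indu1 i.year1
    unab yeardums : _Iyear1_*
    local omit _Iyear1_2014                // the year dummy left out by hand (example choice)
    local yeardums : list yeardums - omit
    reg lnnumber_d_w `slopes' _Iindu1_* `yeardums', cluster(stkcd)

    * Compare a coefficient at more digits than the table shows
    display %18.0g _b[post]
    ```

    Running the same block in each version and comparing the datasignature lines first separates a data difference from a version difference before any coefficients are compared.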

    Best, Sergiy
