Dear experts,
I have a panel dataset with 77 variables and approximately 57,000 observations for the years 2014 to 2018. I use dummy variables for the independent variables company size (klein, mittel, groß, i.e. small, medium, large) and industry sector (LuF, BB, etc.). With these, I ran a regression to determine the effect on the tax burden (ETR_un) of companies.
I am using xtreg in Stata 15.1.
My problem is that as soon as I add the company size dummies to my regression in addition to the industry dummies, two variables (sonst_DL and groß) are immediately omitted because of collinearity. As a result, the coefficients of the remaining dummies appear distorted.
I know that to avoid the dummy variable trap I can drop one dummy from the industry set and one from the company size set myself, but the coefficients still look distorted.
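For reference, a minimal sketch of the same idea using Stata's factor-variable notation instead of hand-made dummies. It assumes that the categorical variables Branche and Größe_HP from the tabstat calls are numeric with value labels (if they are strings, they would first need encode), and that ID and year identify the panel. The choice of the last industry as base is only for illustration.

```stata
* Sketch only. Assumption: Branche and Größe_HP are numeric categorical
* variables; ID and year are the panel and time identifiers.
xtset ID year

* i. creates the indicator variables on the fly and leaves out one base
* level per categorical variable, so nothing has to be dropped by hand.
* ib(last). picks the highest-coded industry as the base (illustrative choice).
xtreg ETR_un ib(last).Branche i.Größe_HP i.year, re
```

Each reported coefficient is then a difference relative to the omitted base level of its own variable, not a level.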
How can I get around this problem?
Below you can see that the average tax rates by industry and by company size (tabstat output) do not match the values in the regression output.
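To make the comparison explicit, here is a hedged sketch in plain pooled OLS. It is not the xtreg model: it ignores the panel structure and leaves out the size and year dummies. It assumes Branche is a numeric categorical variable.

```stata
* Sketch only. Assumption: Branche is a numeric categorical variable.
* Full set of industry indicators and no constant: each coefficient
* equals the raw industry mean of ETR_un, as reported by tabstat.
regress ETR_un ibn.Branche, noconstant

* With a constant and one base level: each coefficient is the difference
* between that industry's mean and the mean of the base industry.
regress ETR_un ib(last).Branche
```

In the second form, an industry mean is recovered as _cons plus that industry's coefficient. Once further regressors are added, _cons refers to the base level of every categorical variable at once, so the sums no longer have to equal the raw means.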
Lastly, I wanted to ask whether the random-effects model (REM) is the right choice here. In the fixed-effects model (FEM), all industry dummies are reported as "omitted" (see the last output below).
Many thanks.
Kind regards
Can
Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe Inform_Kommun Finanz_Versich Grundstücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel groß i.year, re
note: sonst_DL omitted because of collinearity
note: groß omitted because of collinearity

Random-effects GLS regression                   Number of obs     =     57,217
Group variable: ID                              Number of groups  =     18,389

R-sq:                                           Obs per group:
     within  = 0.0013                                         min =          1
     between = 0.0589                                         avg =        3.1
     overall = 0.0337                                         max =          5

                                                Wald chi2(24)     =    1163.20
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

----------------------------------------------------------------------------------------------
                     ETR_un |      Coef.  Std. Err.        z   P>|z|    [95% Conf.   Interval]
----------------------------+-----------------------------------------------------------------
                        LuF |  -2.806153   1.724013    -1.63   0.104     -6.185157    .5728511
                         BB |   1.223227   1.931906     0.63   0.527     -2.563238    5.009692
             Verarbeitendes |   1.156536   .7854209     1.47   0.141     -.3828609    2.695932
             Energieversorg |  -1.362574   .8817548    -1.55   0.122     -3.090782    .3656336
              Wasserversorg |   1.335969   1.015181     1.32   0.188     -.6537506    3.325688
                 Baugewerbe |   .8637391   .8727683     0.99   0.322     -.8468553    2.574333
                     Handel |     2.5564    .789308     3.24   0.001      1.009385    4.103415
                    Verkehr |     1.4275   .8937224     1.60   0.110     -.3241637    3.179164
                Gastgewerbe |   2.483374   1.282375     1.94   0.053     -.0300348    4.996784
              Inform_Kommun |    2.36475   .8876878     2.66   0.008      .6249134    4.104586
             Finanz_Versich |   3.360829    .911962     3.69   0.000      1.573416    5.148241
  Grundstücks_Wohnungswesen |  -6.140547   .9226302    -6.66   0.000     -7.948869   -4.332225
            FreiWissTech_DL |   1.915703     .80875     2.37   0.018       .330582    3.500824
                   wirts_DL |    2.66731    .880347     3.03   0.002      .9418616    4.392758
                         ÖV |   6.128692   2.154682     2.84   0.004      1.905592    10.35179
           Erziehung_Unterr |  -7.485594   1.566971    -4.78   0.000      -10.5568   -4.414388
     Gesundheit_Sozialwesen |  -11.52747   .8984018   -12.83   0.000     -13.28831   -9.766636
Kunst_Unterhaltung_Erholung |   1.190228    1.34611     0.88   0.377       -1.4481    3.828556
                   sonst_DL |          0  (omitted)
                      klein |    .107322   .3644027     0.29   0.768     -.6068941    .8215381
                     mittel |  -.4489496   .2944976    -1.52   0.127     -1.026154    .1282552
                       groß |          0  (omitted)
                            |
                       year |
                         15 |   .3697264   .1749918     2.11   0.035      .0267489     .712704
                         16 |  -.2524269   .1744287    -1.45   0.148     -.5943008    .0894471
                         17 |   .3338847   .1742833     1.92   0.055     -.0077044    .6754738
                         18 |   .9312448   .2926599     3.18   0.001      .3576419    1.504848
                            |
                      _cons |   26.40237    .775345    34.05   0.000      24.88272    27.92202
----------------------------+-----------------------------------------------------------------
                    sigma_u |    9.70098
                    sigma_e |   12.63551
                        rho |  .37085086   (fraction of variance due to u_i)
Code:
tabstat ETR_un, statistics (count mean sd max min range) by(Branche)

Summary for variables: ETR_un
     by categories of: Branche (Branche)

         Branche |         N      mean        sd       max       min     range
-----------------+------------------------------------------------------------
1. Land- und For |       177  24.83884  12.81619  76.21348  1.072381   75.1411
2. Bergbau und G |       142  28.01898  17.38571  91.96083  1.116526   90.8443
3. Verarbeitende |     16119  28.02748  14.02538   99.6544  1.005321  98.64908
4. Energieversor |      2514  25.49997  15.52516  97.77159  1.019462  96.75213
5. Wasserversorg |      1067   27.8953  15.64467  99.73144  1.119681  98.61176
6. Baugewerbe/Ba |      2725  27.80223  12.44849  98.92137  1.014662  97.90671
7. Handel; Insta |     13455  29.22813  13.67265  99.76919  1.003844  98.76534
8. Verkehr und L |      2173  28.26321  14.86916  99.54535  1.024184  98.52117
9. Gastgewerbe/B |       417  29.41986  15.65624  99.04601  1.067991  97.97802
 10. Information |      2270  29.26193   15.2746  97.74427  1.017193  96.72708
11. Erbringung v |      1842  30.01445  18.04395  99.85857  1.026219  98.83235
12. Grundstücks- |      1679  20.97251  16.91162  99.36201  1.012189  98.34982
13. Erbringung v |      6944  28.79622  17.03924  99.88694  1.017734  98.86921
14. Erbringung v |      2441   29.4939    15.703  99.85537   1.02731  98.82806
 15. Öffentliche |       108  33.55072  27.06546  98.77544  1.449751  97.32569
16. Erziehung un |       206  20.88952  22.37715  99.39492  1.019612  98.37531
17. Gesundheits- |      1822  15.34987  16.67667    99.662  1.000133  98.66187
18. Kunst, Unter |       366  27.51446  19.84288  97.59387   1.18329  96.41058
19. Erbringung v |       750  27.34199  17.72626  98.51981  1.002463  97.51734
-----------------+------------------------------------------------------------
           Total |     57217  27.82532  15.28509  99.88694  1.000133  98.88681
------------------------------------------------------------------------------
Code:
tabstat ETR_un, statistics (count mean sd max min range) by(Größe_HP)

Summary for variables: ETR_un
     by categories of: Größe_HP

     Größe_HP |         N      mean        sd       max       min     range
--------------+------------------------------------------------------------
   große KapG |     45881  27.75923   15.2651  99.88694  1.001677  98.88526
  kleine KapG |      3250  28.64398   15.4276  99.27302  1.003844  98.26917
mittlere KapG |      8086  27.87132  15.33302  99.73144  1.000133   98.7313
--------------+------------------------------------------------------------
        Total |     57217  27.82532  15.28509  99.88694  1.000133  98.88681
---------------------------------------------------------------------------
Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe Inform_Kommun Finanz_Versich Grundstücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel, fe
note: LuF omitted because of collinearity
note: BB omitted because of collinearity
note: Verarbeitendes omitted because of collinearity
note: Energieversorg omitted because of collinearity
note: Wasserversorg omitted because of collinearity
note: Baugewerbe omitted because of collinearity
note: Handel omitted because of collinearity
note: Verkehr omitted because of collinearity
note: Gastgewerbe omitted because of collinearity
note: Inform_Kommun omitted because of collinearity
note: Finanz_Versich omitted because of collinearity
note: Grundstücks_Wohnungswesen omitted because of collinearity
note: FreiWissTech_DL omitted because of collinearity
note: wirts_DL omitted because of collinearity
note: ÖV omitted because of collinearity
note: Erziehung_Unterr omitted because of collinearity
note: Gesundheit_Sozialwesen omitted because of collinearity
note: Kunst_Unterhaltung_Erholung omitted because of collinearity
note: sonst_DL omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     57,217
Group variable: ID                              Number of groups  =     18,389

R-sq:                                           Obs per group:
     within  = 0.0007                                         min =          1
     between = 0.0002                                         avg =        3.1
     overall = 0.0001                                         max =          5

                                                F(2,38826)        =      13.26
corr(u_i, Xb)  = -0.0121                        Prob > F          =     0.0000

----------------------------------------------------------------------------------------------
                     ETR_un |      Coef.  Std. Err.        t   P>|t|    [95% Conf.   Interval]
----------------------------+-----------------------------------------------------------------
                        LuF |          0  (omitted)
                         BB |          0  (omitted)
             Verarbeitendes |          0  (omitted)
             Energieversorg |          0  (omitted)
              Wasserversorg |          0  (omitted)
                 Baugewerbe |          0  (omitted)
                     Handel |          0  (omitted)
                    Verkehr |          0  (omitted)
                Gastgewerbe |          0  (omitted)
              Inform_Kommun |          0  (omitted)
             Finanz_Versich |          0  (omitted)
  Grundstücks_Wohnungswesen |          0  (omitted)
            FreiWissTech_DL |          0  (omitted)
                   wirts_DL |          0  (omitted)
                         ÖV |          0  (omitted)
           Erziehung_Unterr |          0  (omitted)
     Gesundheit_Sozialwesen |          0  (omitted)
Kunst_Unterhaltung_Erholung |          0  (omitted)
                   sonst_DL |          0  (omitted)
                      klein |   1.206493   .2799578     4.31   0.000      .6577689    1.755217
                     mittel |   .5408651   .1822532     2.97   0.003      .1836443     .898086
                      _cons |   27.68036   .0611299   452.81   0.000      27.56054    27.80017
----------------------------+-----------------------------------------------------------------
                    sigma_u |  13.079854
                    sigma_e |  12.638967
                        rho |  .51713756   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------
F test that all u_i=0: F(18388, 38826) = 2.44                    Prob > F = 0.0000