Dear experts,
I have a panel dataset with 77 variables and approximately 57,000 observations for the years 2014 to 2018. I use dummy variables for the independent variables company size (klein, mittel, groß, i.e. small, medium, large) and industry sector (LuF, BB, etc.), and I run a regression to determine their effect on the tax burden (ETR_un) of companies.
I am using xtreg in Stata 15.1.
My problem is that as soon as I add the company size dummies to the regression in addition to the industry dummies, two variables (sonst_DL and groß) are immediately omitted. As a result, the coefficients of the remaining independent variables look distorted to me.
I know that to avoid the dummy variable trap I can leave out one of the industry dummies and one of the company size dummies myself, but the coefficients still look distorted.
How can I get around this problem?
The tabstat output below shows that the average tax rates by industry and by company size are not the same as the values in the regression output.
Lastly, I wanted to ask whether the random-effects model (REM) is the right choice here. In the fixed-effects model (FEM), all industry dummies were reported as "omitted" (see the second xtreg output below).
Many thanks.
Kind regards
Can
Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe Inform_Kommun Finanz_Versich Grunds
> tücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel
> groß i.year, re
note: sonst_DL omitted because of collinearity
note: groß omitted because of collinearity
Random-effects GLS regression Number of obs = 57,217
Group variable: ID Number of groups = 18,389
R-sq: Obs per group:
within = 0.0013 min = 1
between = 0.0589 avg = 3.1
overall = 0.0337 max = 5
Wald chi2(24) = 1163.20
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
---------------------------------------------------------------------------------------------
ETR_un | Coef. Std. Err. z P>|z| [95% Conf. Interval]
----------------------------+----------------------------------------------------------------
LuF | -2.806153 1.724013 -1.63 0.104 -6.185157 .5728511
BB | 1.223227 1.931906 0.63 0.527 -2.563238 5.009692
Verarbeitendes | 1.156536 .7854209 1.47 0.141 -.3828609 2.695932
Energieversorg | -1.362574 .8817548 -1.55 0.122 -3.090782 .3656336
Wasserversorg | 1.335969 1.015181 1.32 0.188 -.6537506 3.325688
Baugewerbe | .8637391 .8727683 0.99 0.322 -.8468553 2.574333
Handel | 2.5564 .789308 3.24 0.001 1.009385 4.103415
Verkehr | 1.4275 .8937224 1.60 0.110 -.3241637 3.179164
Gastgewerbe | 2.483374 1.282375 1.94 0.053 -.0300348 4.996784
Inform_Kommun | 2.36475 .8876878 2.66 0.008 .6249134 4.104586
Finanz_Versich | 3.360829 .911962 3.69 0.000 1.573416 5.148241
Grundstücks_Wohnungswesen | -6.140547 .9226302 -6.66 0.000 -7.948869 -4.332225
FreiWissTech_DL | 1.915703 .80875 2.37 0.018 .330582 3.500824
wirts_DL | 2.66731 .880347 3.03 0.002 .9418616 4.392758
ÖV | 6.128692 2.154682 2.84 0.004 1.905592 10.35179
Erziehung_Unterr | -7.485594 1.566971 -4.78 0.000 -10.5568 -4.414388
Gesundheit_Sozialwesen | -11.52747 .8984018 -12.83 0.000 -13.28831 -9.766636
Kunst_Unterhaltung_Erholung | 1.190228 1.34611 0.88 0.377 -1.4481 3.828556
sonst_DL | 0 (omitted)
klein | .107322 .3644027 0.29 0.768 -.6068941 .8215381
mittel | -.4489496 .2944976 -1.52 0.127 -1.026154 .1282552
groß | 0 (omitted)
|
year |
15 | .3697264 .1749918 2.11 0.035 .0267489 .712704
16 | -.2524269 .1744287 -1.45 0.148 -.5943008 .0894471
17 | .3338847 .1742833 1.92 0.055 -.0077044 .6754738
18 | .9312448 .2926599 3.18 0.001 .3576419 1.504848
|
_cons | 26.40237 .775345 34.05 0.000 24.88272 27.92202
----------------------------+----------------------------------------------------------------
sigma_u | 9.70098
sigma_e | 12.63551
rho | .37085086 (fraction of variance due to u_i)
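The two "omitted" notes in this output are what one would expect when a full set of dummies is entered next to a constant: one category per group has to serve as the reference. A minimal sketch of the same idea with factor-variable notation follows. It uses simulated data rather than the data from this post, and the names `firm`, `sector`, `size`, `etr` and all numbers are made up for illustration. The `i.` prefix leaves out one base level per categorical variable automatically, and `ib#.` sets which level is the base.

```stata
* Simulated firm panel, 2014-2018; all names and values are hypothetical
clear
set seed 2018
set obs 300
gen firm = _n
gen byte sector = 1 + mod(firm, 4)      // 4 sectors, fixed per firm
gen byte size   = 1 + mod(firm, 3)      // 3 size classes, fixed per firm
gen u = rnormal(0, 2)                   // firm-level random effect
expand 5                                // 5 yearly rows per firm
bysort firm: gen year = 2013 + _n
gen etr = 27 + (sector == 2) - 3*(sector == 4) + 0.5*(size == 1) + u + rnormal(0, 3)
xtset firm year

* ib4.sector makes sector 4 the reference, ib3.size makes size 3 the reference;
* no dummy has to be created or dropped by hand
xtreg etr ib4.sector ib3.size i.year, re

* Each coefficient is a difference from the reference category,
* here sector 2 versus sector 4, holding size and year fixed
lincom 2.sector
```

With this notation the reference categories are explicit in the command, so the coefficient table no longer carries "(omitted)" rows for them.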
Code:
tabstat ETR_un, statistics (count mean sd max min range) by(Branche)
Summary for variables: ETR_un
by categories of: Branche (Branche)
Branche | N mean sd max min range
-----------------+------------------------------------------------------------
1. Land- und For | 177 24.83884 12.81619 76.21348 1.072381 75.1411
2. Bergbau und G | 142 28.01898 17.38571 91.96083 1.116526 90.8443
3. Verarbeitende | 16119 28.02748 14.02538 99.6544 1.005321 98.64908
4. Energieversor | 2514 25.49997 15.52516 97.77159 1.019462 96.75213
5. Wasserversorg | 1067 27.8953 15.64467 99.73144 1.119681 98.61176
6. Baugewerbe/Ba | 2725 27.80223 12.44849 98.92137 1.014662 97.90671
7. Handel; Insta | 13455 29.22813 13.67265 99.76919 1.003844 98.76534
8. Verkehr und L | 2173 28.26321 14.86916 99.54535 1.024184 98.52117
9. Gastgewerbe/B | 417 29.41986 15.65624 99.04601 1.067991 97.97802
10. Information | 2270 29.26193 15.2746 97.74427 1.017193 96.72708
11. Erbringung v | 1842 30.01445 18.04395 99.85857 1.026219 98.83235
12. Grundstücks- | 1679 20.97251 16.91162 99.36201 1.012189 98.34982
13. Erbringung v | 6944 28.79622 17.03924 99.88694 1.017734 98.86921
14. Erbringung v | 2441 29.4939 15.703 99.85537 1.02731 98.82806
15. Öffentliche | 108 33.55072 27.06546 98.77544 1.449751 97.32569
16. Erziehung un | 206 20.88952 22.37715 99.39492 1.019612 98.37531
17. Gesundheits- | 1822 15.34987 16.67667 99.662 1.000133 98.66187
18. Kunst, Unter | 366 27.51446 19.84288 97.59387 1.18329 96.41058
19. Erbringung v | 750 27.34199 17.72626 98.51981 1.002463 97.51734
-----------------+------------------------------------------------------------
Total | 57217 27.82532 15.28509 99.88694 1.000133 98.88681
------------------------------------------------------------------------------
Code:
tabstat ETR_un, statistics (count mean sd max min range) by(Größe_HP)
Summary for variables: ETR_un
by categories of: Größe_HP
Größe_HP | N mean sd max min range
--------------+------------------------------------------------------------
große KapG | 45881 27.75923 15.2651 99.88694 1.001677 98.88526
kleine KapG | 3250 28.64398 15.4276 99.27302 1.003844 98.26917
mittlere KapG | 8086 27.87132 15.33302 99.73144 1.000133 98.7313
--------------+------------------------------------------------------------
Total | 57217 27.82532 15.28509 99.88694 1.000133 98.88681
---------------------------------------------------------------------------
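One way to relate group means like those in the tabstat tables to regression coefficients is sketched below on the `auto` dataset that ships with Stata (not the data from this post). The assumption illustrated is the standard one: a dummy coefficient is a difference from the reference category, so it only reproduces a raw difference in means when nothing else is in the model.

```stata
sysuse auto, clear

* With a single set of dummies, OLS reproduces the raw group means:
* _cons is the mean of the base group, _cons + coefficient the other group
regress price i.foreign
summarize price if foreign == 0, meanonly
assert reldif(_b[_cons], r(mean)) < 1e-6
summarize price if foreign == 1, meanonly
assert reldif(_b[_cons] + _b[1.foreign], r(mean)) < 1e-6

* Once a second regressor enters, the coefficient is a difference holding
* that regressor fixed, so it need not match the raw difference in means
regress price i.foreign weight
display _b[1.foreign]

* Model-based (adjusted) group means can be requested with margins
margins foreign
```

In a model with industry, size and year dummies together, `_cons` refers to the combination of all reference categories, so `_cons` plus one coefficient is not expected to equal a single group's raw mean.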
Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe Inform_Kommun Finanz_Versich Grunds
> tücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel
> , fe
note: LuF omitted because of collinearity
note: BB omitted because of collinearity
note: Verarbeitendes omitted because of collinearity
note: Energieversorg omitted because of collinearity
note: Wasserversorg omitted because of collinearity
note: Baugewerbe omitted because of collinearity
note: Handel omitted because of collinearity
note: Verkehr omitted because of collinearity
note: Gastgewerbe omitted because of collinearity
note: Inform_Kommun omitted because of collinearity
note: Finanz_Versich omitted because of collinearity
note: Grundstücks_Wohnungswesen omitted because of collinearity
note: FreiWissTech_DL omitted because of collinearity
note: wirts_DL omitted because of collinearity
note: ÖV omitted because of collinearity
note: Erziehung_Unterr omitted because of collinearity
note: Gesundheit_Sozialwesen omitted because of collinearity
note: Kunst_Unterhaltung_Erholung omitted because of collinearity
note: sonst_DL omitted because of collinearity
Fixed-effects (within) regression Number of obs = 57,217
Group variable: ID Number of groups = 18,389
R-sq: Obs per group:
within = 0.0007 min = 1
between = 0.0002 avg = 3.1
overall = 0.0001 max = 5
F(2,38826) = 13.26
corr(u_i, Xb) = -0.0121 Prob > F = 0.0000
---------------------------------------------------------------------------------------------
ETR_un | Coef. Std. Err. t P>|t| [95% Conf. Interval]
----------------------------+----------------------------------------------------------------
LuF | 0 (omitted)
BB | 0 (omitted)
Verarbeitendes | 0 (omitted)
Energieversorg | 0 (omitted)
Wasserversorg | 0 (omitted)
Baugewerbe | 0 (omitted)
Handel | 0 (omitted)
Verkehr | 0 (omitted)
Gastgewerbe | 0 (omitted)
Inform_Kommun | 0 (omitted)
Finanz_Versich | 0 (omitted)
Grundstücks_Wohnungswesen | 0 (omitted)
FreiWissTech_DL | 0 (omitted)
wirts_DL | 0 (omitted)
ÖV | 0 (omitted)
Erziehung_Unterr | 0 (omitted)
Gesundheit_Sozialwesen | 0 (omitted)
Kunst_Unterhaltung_Erholung | 0 (omitted)
sonst_DL | 0 (omitted)
klein | 1.206493 .2799578 4.31 0.000 .6577689 1.755217
mittel | .5408651 .1822532 2.97 0.003 .1836443 .898086
_cons | 27.68036 .0611299 452.81 0.000 27.56054 27.80017
----------------------------+----------------------------------------------------------------
sigma_u | 13.079854
sigma_e | 12.638967
rho | .51713756 (fraction of variance due to u_i)
---------------------------------------------------------------------------------------------
F test that all u_i=0: F(18388, 38826) = 2.44 Prob > F = 0.0000
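The pattern in this output, where every industry dummy is omitted but klein and mittel are estimated, is what the within estimator produces when a regressor never changes within a panel unit. A minimal sketch on simulated data (not the data from this post; `firm`, `sector`, `small`, `etr` and all numbers are hypothetical):

```stata
* Simulated firm panel, 2014-2018; all names and values are hypothetical
clear
set seed 2014
set obs 200
gen firm = _n
gen byte sector = 1 + mod(firm, 4)      // never changes within a firm
expand 5                                // 5 yearly rows per firm
bysort firm: gen year = 2013 + _n
gen byte small = runiform() < 0.3       // can change from year to year
gen etr = 27 + sector + 0.5*small + rnormal(0, 3)
xtset firm year

* Check that sector is constant within each firm (no within variation);
* the assert stops with an error if any firm switches sector
bysort firm (year): assert sector == sector[1]

* The within transformation removes anything constant within a firm,
* so every sector level is reported as omitted; small is still estimated
xtreg etr i.sector small, fe
```

The same `bysort ... assert` check, applied to the real industry variable, would show whether any company changes sector over the sample period.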