Dear Stata list,
happy Monday.
I am interested in estimating the effect of language proficiency (measured by a "very good German command" dummy) on the wages of immigrants in Germany. To this end, I run linear regressions of log wages on immigrant status, a dummy that captures very good German proficiency, and other demographic characteristics. To enrich the specification, I add interaction terms step by step: immigrant status x education, immigrant status x very_good_ger_command, education x very_good_ger_command, and finally the triple interaction.
The problem I am facing is the following. My data contain natives and immigrants. Natives do not report their language proficiency, so I impute very_good_ger_command = 1 for all natives; that is, I assume they have very good German proficiency. When I run the regression loop below, Stata drops some of the interaction terms because of collinearity.
I believe this problem arises from my imputation: because every native is coded very_good_ger_command = 1, the cell "immigrant = 0, very_good_ger_command = 0" does not exist.
My code is as follows:
Code:
local Y ln_wages_gro
global x1 immigrant##i.educ_level sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x2 immigrant##i.educ_level immigrant##very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x3 immigrant##i.educ_level immigrant##very_good_ger_command educ_level#very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x4 immigrant##i.educ_level##very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined // triple interaction effect
local X x1 x2 x3 x4
* Loop over outcomes and specifications using reghdfe
foreach y of local Y {
    foreach x of local X {
        eststo reg_`y'_`x': reghdfe `y' ${`x'}, ///
            absorb(cluster_var syear) vce(cluster cluster_var)
    }
}
For example, when I run the fourth specification (log wages on x4), I get:
Code:
note: 0b.immigrant#0b.very_good_ger_command omitted because of collinearity
note: 0b.immigrant#1b.educ_level#0b.very_good_ger_command omitted because of collinearity
note: 1o.immigrant#3o.educ_level#0b.very_good_ger_command omitted because of collinearity
note: 889.occup_combined omitted because of collinearity
HDFE Linear regression                            Number of obs   =    367,217
Absorbing 2 HDFE groups                           F(  30,     83) =    2930.43
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.4914
                                                  Adj R-squared   =     0.4912
                                                  Within R-sq.    =     0.4422
Number of clusters (cluster_var) =         84     Root MSE        =     0.6110
                                                         (Std. err. adjusted for 84 clusters in cluster_var)
------------------------------------------------------------------------------------------------------------
                                           |               Robust
                              ln_wages_gro | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------------------------------+----------------------------------------------------------------
                               1.immigrant |   .0204051    .025504     0.80   0.426    -.0303214    .0711315
                                           |
                                educ_level |
                                         2 |   .1357691   .0266024     5.10   0.000      .082858    .1886801
                                         3 |   .3297254   .0292414    11.28   0.000     .2715655    .3878854
                                           |
                      immigrant#educ_level |
                                       1 2 |  -.0371757   .0263592    -1.41   0.162     -.089603    .0152516
                                       1 3 |  -.1328072   .0353139    -3.76   0.000    -.2030451   -.0625693
                                           |
                   1.very_good_ger_command |  -.1049976   .0209216    -5.02   0.000    -.1466099   -.0633854
                                           |
           immigrant#very_good_ger_command |
                                       0 0 |          0  (empty)
                                       1 1 |          0  (omitted)
                                           |
          educ_level#very_good_ger_command |
                                       2 1 |   .1636651   .0269149     6.08   0.000     .1101325    .2171977
                                       3 1 |   .2293729   .0270932     8.47   0.000     .1754856    .2832602
                                           |
immigrant#educ_level#very_good_ger_command |
                                     0 1 0 |          0  (empty)
                                     0 2 0 |          0  (empty)
                                     0 3 0 |          0  (empty)
                                     1 2 1 |          0  (omitted)
                                     1 3 1 |          0  (omitted)
                                           |
                                       sex |   .3147661   .0136618    23.04   0.000     .2875934    .3419388
                                       age |   .0699323    .003348    20.89   0.000     .0632732    .0765913
                                    age_sq |  -.0974664   .0034906   -27.92   0.000     -.104409   -.0905238
                                   married |  -.0496023   .0130072    -3.81   0.000    -.0754732   -.0237315
                               no_children |  -.0253526   .0028548    -8.88   0.000    -.0310306   -.0196746
                            years_work_exp |    .029932   .0006777    44.17   0.000     .0285841    .0312799
                                           |
                            occup_combined |
                                        82 |  -.1641463   .0176071    -9.32   0.000    -.1991662   -.1291265
                                        83 |  -.4081211   .0177827   -22.95   0.000    -.4434902    -.372752
                                        84 |  -.5554699   .0182121   -30.50   0.000     -.591693   -.5192467
                                        85 |  -.8531703   .0193025   -44.20   0.000    -.8915621   -.8147785
                                        86 |  -.7304313   .0402182   -18.16   0.000    -.8104237   -.6504389
                                        87 |   -.725796   .0225927   -32.13   0.000    -.7707321     -.68086
                                        88 |   -.745428   .0276917   -26.92   0.000    -.8005057   -.6903503
                                        89 |  -1.116623   .0284847   -39.20   0.000    -1.173278   -1.059968
                                       881 |   .7054542   .0220244    32.03   0.000     .6616486    .7492599
                                       882 |   .7558083   .0269996    27.99   0.000     .7021071    .8095094
                                       883 |   .5254447   .0224478    23.41   0.000      .480797    .5700924
                                       884 |   .4204903   .0270373    15.55   0.000     .3667142    .4742664
                                       885 |   .1750119   .0280122     6.25   0.000     .1192968     .230727
                                       886 |   .1494267   .0320236     4.67   0.000      .085733    .2131205
                                       887 |   .2900686   .0192637    15.06   0.000     .2517539    .3283832
                                       888 |   .3160587   .0176544    17.90   0.000     .2809449    .3511726
                                       889 |          0  (omitted)
                                           |
                                     _cons |   5.479861    .083596    65.55   0.000     5.313592     5.64613
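The rows marked (empty) in the output are interaction cells that contain no observations. A quick way to see which cells are populated is a cross-tabulation of the two dummies. The sketch below uses six made-up observations (hypothetical, not the actual sample) in which every native is coded very_good_ger_command = 1, mirroring the imputation described in this post:

```stata
* Toy data (hypothetical): natives are imputed very_good_ger_command = 1
clear
input byte immigrant byte very_good_ger_command
0 1
0 1
0 1
1 0
1 0
1 1
end

* Cross-tabulate the two dummies to see which cells are populated
tabulate immigrant very_good_ger_command

* Count observations in the native, not-very-good-German cell
count if immigrant == 0 & very_good_ger_command == 0
```

Running the same two commands on the estimation sample would show whether the immigrant = 0, very_good_ger_command = 0 cell is empty there as well.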
I would like to set the base category to natives with very good German command, that is, immigrant = 0 and very_good_ger_command = 1. However, specifying this with the ib#. operator does not work for me.
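For concreteness, a minimal sketch of the kind of ib#. specification meant here. The seven observations, the outcome y, and the use of regress instead of reghdfe are all assumptions made for illustration; this is not the exact call from my do-file.

```stata
* Toy data (hypothetical): no native has very_good_ger_command = 0
clear
input byte immigrant byte very_good_ger_command float y
0 1 2.9
0 1 3.1
0 1 3.0
1 0 2.2
1 0 2.4
1 1 2.8
1 1 2.6
end

* Request immigrant = 0 and very_good_ger_command = 1 as the base levels
regress y ib0.immigrant##ib1.very_good_ger_command

* Number of coefficients actually estimated, constant included
display e(rank)
```

In this toy example only three of the four immigrant-by-proficiency cells are populated, so one of the four terms (constant, two main effects, one interaction) is still omitted even with the requested base levels.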
Is my problem a syntax issue (in how I specify the base levels)?
Or is there something fundamentally wrong with my specification, so that I need to drop the stand-alone variable very_good_ger_command for the regression to be identified?
I am grateful for any input or tips. Thank you!
