I am currently estimating wage equations that are not identically specified (the immigrant wage equation includes immigrant-specific variables), so I use the options 'noisily' and 'relax' to circumvent this issue. I also use 'detail' because the immigrant-specific variables that appear in one equation but not the other should be counted in the endowment portion of the decomposition, not the unexplained portion.
However, when I compared specifications, the results were very sensitive to changes in the construction of the dummy variables, which I suspect is due to multicollinearity. The three immigrant-specific variables are 'years of assimilation' (where the base group is the native-born population), English proficiency (where the majority of the baseline group is the native-born population), and country of origin (again, the baseline group is the native-born population). If I change my country-of-origin groupings to include 3 versus 5 dummies, for example, the magnitudes of the explained and unexplained portions increase significantly.
oaxaca incwage normalize(age_cat1 age_cat2 age_cat3) normalize(married not_married never_married) normalize(white_nh black_nh asian_nh hispanic_cat) nchild hs_dipl some_coll bach_plus masters_plus south_cat midwest_cat west_cat npboss90 years of assimilation english_cat country_origin if male == 1 [pweight = perwt], by(foreign_born) pooled noisily relax
Am I correct in thinking that this is indeed a multicollinearity issue?
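For reference, a minimal sketch of a check that could help tell whether the immigrant-specific variables overlap, using only variable names from the command above. It assumes english_cat and country_origin are single numeric categorical variables and that foreign_born is coded 0/1; these are assumptions, not something verified against the data.

```stata
* Cross-tabulate each immigrant-specific variable against nativity.
* If (almost) all native-born fall into one category, the variable has
* little or no variation among natives.
tabulate english_cat foreign_born if male == 1, missing
tabulate country_origin foreign_born if male == 1, missing

* Pairwise correlation of the two variables among the foreign-born only.
correlate english_cat country_origin if male == 1 & foreign_born == 1
```

If the native-born column is concentrated in a single category for each variable, the three variables would all be close to proxies for foreign_born, which would be consistent with the sensitivity described above.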
