Hi, I have a question about omitted variables in regression.
I want to do an OLS regression of depression on two variables, student type (university students versus vocational college students) and relation with parents. But there are a substantial number of cases that have missing values on relation with parents.
I use dummy variable adjustment to handle the missing values. Specifically, I create a dummy (indicator) variable, prelation_miss_dummy, that equals 1 if relation with parents is observed and 0 if it is missing. In addition, I substitute the constant 999 for the missing values of relation with parents. All three variables, student type (university students versus vocational college students), relation with parents, and prelation_miss_dummy, then go into the regression as predictors.
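To make the coding concrete, here is a minimal sketch of the two constructed variables in Python. The six data values are made up purely for illustration, and the name relation_filled is hypothetical; only prelation_miss_dummy and the constant 999 come from the description above.

```python
# Hypothetical toy data: None marks a missing "relation with parents" score.
relation = [3.0, None, 4.5, 2.0, None, 5.0]

FILL_VALUE = 999  # constant substituted for missing values, as described above

# Indicator coded as in the post: 1 if observed, 0 if missing.
prelation_miss_dummy = [0 if r is None else 1 for r in relation]

# Predictor with every missing value replaced by the constant.
relation_filled = [FILL_VALUE if r is None else r for r in relation]

print(prelation_miss_dummy)  # [1, 0, 1, 1, 0, 1]
print(relation_filled)       # [3.0, 999, 4.5, 2.0, 999, 5.0]
```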
I expected prelation_miss_dummy to be omitted from the regression results because of its collinearity with relation with parents. However, it is not omitted, and I'm not sure why. Will keeping it in the model affect the estimates and p-value for relation with parents?
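One way to check whether the indicator is in fact perfectly collinear with the filled-in variable is to look at the rank of the design matrix. The sketch below is only an illustration under made-up data: matrix_rank and design_matrix are hypothetical helper functions written for this example, not part of any statistics package. It suggests that software would drop a column only when the rank is deficient, which with this coding seems to require that the observed values of relation with parents do not vary at all.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank via Gaussian elimination, using exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(student_type, relation_filled, dummy):
    """Columns: intercept, student type, filled relation, indicator."""
    return [[1, s, x, d] for s, x, d in zip(student_type, relation_filled, dummy)]


# Made-up data; the indicator is 1 if observed, 0 if missing (filled with 999).
student_type = [1, 0, 1, 0, 1, 0]
dummy = [1, 0, 1, 1, 0, 1]

# Case 1: observed values vary, so no exact linear dependence exists.
varying = [3.0, 999, 4.5, 2.0, 999, 5.0]
rank_varying = matrix_rank(design_matrix(student_type, varying, dummy))

# Case 2: observed values are all 3.0, so filled = 999 - 996 * dummy exactly.
constant = [3.0, 999, 3.0, 3.0, 999, 3.0]
rank_constant = matrix_rank(design_matrix(student_type, constant, dummy))

print(rank_varying)   # 4: full column rank, nothing to omit
print(rank_constant)  # 3: rank deficient, one column is redundant
```

In the first case all four columns carry distinct information, so there is no perfect collinearity for the software to detect, even though the two variables are strongly related.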

