  • Omitted variables in regression results due to collinearity

    Hi, I have a question about omitted variables in regression.

    I want to do an OLS regression of depression on two variables, student type (university students versus vocational college students) and relation with parents. But there are a substantial number of cases that have missing values on relation with parents.

    I use dummy variable adjustment to handle the missing values. Specifically, I create a dummy (indicator) variable, prelation_miss_dummy, that has a value of 1 if relation with parents is observed and 0 if relation with parents is missing. In addition, I substitute a constant value, 999, for the missing values of relation with parents. All three variables, student type (university students versus vocational college students), relation with parents, and prelation_miss_dummy, then go into the regression as predictors.
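The construction described above can be sketched as follows. This is only an illustration in Python rather than Stata; the data values and the name `prelation_filled` are hypothetical, while `prelation_miss_dummy` and the constant 999 come from the post.

```python
# Hypothetical data: None marks a missing value of "relation with parents".
relation = [3.0, None, 4.5, 2.0, None, 5.0]

FILL_VALUE = 999  # constant substituted for missing values, as in the post

# Indicator coded as in the post: 1 = observed, 0 = missing.
prelation_miss_dummy = [0 if v is None else 1 for v in relation]

# Copy of the predictor with missing values replaced by the constant
# (the name prelation_filled is an assumption for this sketch).
prelation_filled = [FILL_VALUE if v is None else v for v in relation]

print(prelation_miss_dummy)
print(prelation_filled)
```

Both lists would then enter the regression as predictors alongside student type.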

    I expected prelation_miss_dummy to be omitted from the regression results because of its collinearity with relation with parents. However, it is not omitted, and I'm not sure why. Will this affect the estimates and p-value for relation with parents?
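One way to probe the collinearity question is to check the rank of the design matrix. A predictor is dropped for collinearity only when it is an exact linear combination of the other columns. The sketch below uses Python rather than Stata and hypothetical toy data; it covers only the constant, the filled variable, and the dummy, not student type. In this toy data the dummy is an exact combination of the other two columns only when every observed value of relation with parents is identical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Columns: constant, relation with parents (999 where missing), dummy (1 = observed).
# Hypothetical data in which the observed values vary.
varying = [[1, 3, 1], [1, 999, 0], [1, 4, 1], [1, 2, 1], [1, 999, 0]]

# Hypothetical data in which every observed value equals 3.
constant_observed = [[1, 3, 1], [1, 999, 0], [1, 3, 1], [1, 999, 0]]

print(matrix_rank(varying))            # full column rank: nothing to omit
print(matrix_rank(constant_observed))  # rank deficient: one column is redundant
```

With varying observed values the three columns are linearly independent, which is consistent with the dummy not being omitted.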

    [Attached images: screencut 1.png, screencut 2.png]

  • #2
    Wenye:
    1) the dummy variable approach to dealing with missing values gives, most of the time, biased coefficients (see https://statisticalhorizons.com/is-d...-missing-data/)
    2) your sample is composed of 639 obs, whereas your -e(sample)- is 600. Hence, take a second look at your dataset.
    Kind regards,
    Carlo
    (Stata 19.0)
