Problem with omitted robust SE

You Zhang

Join Date: Dec 2021

Posts: 7
#1

01 Sep 2023, 15:01

Hi folks,

I ran into an issue with a model that I don't quite understand. I am running a two-level multilevel logistic regression using melogit. I noticed that the odds ratio for one world region is 1 and that its robust standard error is omitted (see the screenshot below). On further investigation, it has something to do with a variable I named "ruaspecnumber", because the issue goes away when I take it out. I am not sure why this happens. ruaspecnumber is a variable that does not vary within a world region, but does vary across regions. Can someone please explain why this happens? I am worried that there is something wrong with the models.
George Ford

Join Date: Aug 2014

Posts: 3183
#2

03 Sep 2023, 12:25

Looks like the dummy variable trap.
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#3

03 Sep 2023, 16:55

George Ford's answer in #2 is correct. To elaborate a bit more, you can never include in a linear model both a set of indicator variables and another variable that is constant within the groups those indicators define. In your specific case, the variable ruaspecnumber, because it is constant within region, is necessarily collinear with the region variables. To see that, run -regress ruaspecnumber i.region-, and Stata will show you the exact linear relationship among these variables.
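As a rough illustration of that check, here is a minimal sketch on simulated data. The data set is entirely hypothetical (5 regions, 200 observations each, an arbitrary formula for ruaspecnumber, and a made-up outcome y); only the variable names ruaspecnumber and region come from this thread.

```stata
* Hypothetical data: 5 regions with 200 observations each
clear
set seed 12345
set obs 1000
generate region = ceil(_n/200)

* A region-level variable: constant within region, different across regions
* (the formula is arbitrary and chosen only for illustration)
generate ruaspecnumber = 2*region + mod(region, 2)

* The region indicators reproduce ruaspecnumber exactly,
* so this regression should fit perfectly (R-squared of 1, up to rounding)
regress ruaspecnumber i.region
display e(r2)

* With both in the same model, Stata flags one term as omitted
* because of collinearity; which one it picks is up to Stata
generate y = rnormal()
regress y ruaspecnumber i.region
```

The same perfect fit is expected in the real data if ruaspecnumber truly never varies within a region.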

Faced with this collinearity, Stata must do something to identify the model. It must either drop ruaspecnumber, or drop one of the region indicators, or impose some other linear constraint involving ruaspecnumber or the region indicators. The user does not have control over which of these things Stata chooses to do. In this instance, it chose to omit another region indicator (in addition to the one already omitted as the reference or base category). An omitted level of a categorical variable has, implicitly, a coefficient of 0. And in a logistic regression, a coefficient of 0 corresponds to an odds ratio of 1. There is no standard error associated with this coefficient or odds ratio because it is not subject to sampling variation: it is simply stipulated.
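The argument can be written out in a few lines. The notation here is an assumption introduced only for illustration: regions are numbered $r = 1, \dots, R$ with region 1 as the base category, $D_{r,i}$ is the indicator that observation $i$ lies in region $r$, and $z_r$ is the value ruaspecnumber takes in region $r$.

```latex
% ruaspecnumber for observation i, written in terms of the region indicators
\text{ruaspecnumber}_i \;=\; \sum_{r=1}^{R} z_r D_{r,i}
  \;=\; z_1 + \sum_{r=2}^{R} (z_r - z_1)\, D_{r,i},
\qquad \text{because } D_{1,i} = 1 - \sum_{r=2}^{R} D_{r,i}.

% The constant, the R-1 non-base indicators, and ruaspecnumber are R+1 columns
% that span only an R-dimensional space, so one further term has to be dropped.
% The dropped term has its coefficient fixed at zero, hence
\beta_{\text{omitted}} = 0
  \quad\Longrightarrow\quad
\mathrm{OR}_{\text{omitted}} = e^{\beta_{\text{omitted}}} = e^{0} = 1 .
```

Since that value is fixed by construction rather than estimated, no standard error is reported for it.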

The fact of the matter is that a model that involves collinear variables is incapable of identifying effects of the variables involved in the collinearity. So you must go back and review what your research goals are. If estimating the effect of ruaspecnumber is important to achieving your research goals, then you must omit the region indicators. If ruaspecnumber is only in the model as a "control" variable, then you should omit it: its effects are automatically adjusted for by the region indicators in that case.
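The two alternatives might look roughly like the sketch below. Everything except the names ruaspecnumber and region is hypothetical: the simulated data, the outcome y, the covariate x, and the grouping variable country are stand-ins, since the actual model specification was not posted.

```stata
* Hypothetical two-level data: 40 countries nested in 5 regions, 50 respondents each
clear
set seed 2023
set obs 2000
generate country = ceil(_n/50)
generate region  = ceil(country/8)
generate ruaspecnumber = 2*region + mod(region, 2)   // constant within region
generate x = rnormal()

* Country-level random intercept, copied to every respondent in the country
bysort country: generate u = rnormal(0, 0.5) if _n == 1
bysort country: replace u = u[1]
generate y = runiform() < invlogit(-0.5 + 0.4*x + u)

* Option A: the effect of ruaspecnumber is of interest, so leave out the region indicators
melogit y x ruaspecnumber || country:, vce(robust) or

* Option B: ruaspecnumber is only a control, so the region indicators stand in for it
melogit y x i.region || country:, vce(robust) or
```

Neither specification should show an extra omitted region or an odds ratio stipulated at 1 beyond the usual base category, because the collinear pair never appears in the same model.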
You Zhang

Join Date: Dec 2021

Posts: 7
#4

05 Sep 2023, 09:05

Thanks very much for your explanation! It helps so much.