Why aren't regional covariates taken out by dummy variables at the same level?

Jeff Cole

Join Date: Sep 2023

Posts: 3
#1

10 Sep 2023, 03:35

Hi all,

I'm analyzing a dataset from a health-related cross-sectional survey conducted in 2019 across 15 administrative regions. The dataset contains variables at two levels of observation: individual and region. I'm trying to analyze how individuals' choice of healthcare provider is shaped by both individual economic conditions and regional economic characteristics. In one of the linear probability models I estimated, I inadvertently included both region dummy variables (to account for regional heterogeneity) and regional-level covariates. Based on my statistical knowledge, including regional dummies and regional covariates in such a context (there is no within-region variation in the regional covariates) should make it impossible to estimate the coefficients of the latter. But I was surprised to find that the model produced estimates for both the dummies and the regional covariates, except that a number of dummies were omitted in addition to the baseline (I later verified that the number of omitted dummies always equals the number of regional covariates specified). I'm having a hard time understanding why this is the case (see the Stata output below). Any help or thoughts on this are much appreciated. I'm relatively new to survey data analysis but am quite familiar with panel data methods. Am I missing something big here?

Last edited by Jeff Cole; 10 Sep 2023, 03:39.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2200
#2

10 Sep 2023, 07:01

That happens because when a group of perfectly collinear variables is included, Stata drops the last ones in the group. So it leaves the three variables that have only regional variation and drops three of the regional dummies. It’s a good reminder that we should do the modeling ourselves, and not rely on Stata to do it for us.
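A short sketch of why the count of dropped dummies matches the count of regional covariates. The notation is introduced here for illustration and is not from the thread: $D_{ir}$ equals 1 if individual $i$ lives in region $r$, $z_r$ is the value of a regional covariate in region $r$, $R$ is the number of regions, and region 1 is taken as the baseline. Because every individual belongs to exactly one region, $\sum_{r} D_{ir} = 1$, so

```latex
z_i \;=\; \sum_{r=1}^{R} z_r D_{ir}
    \;=\; z_1 \cdot 1 \;+\; \sum_{r=2}^{R} (z_r - z_1)\, D_{ir}
```

That is, any covariate with no within-region variation is an exact linear combination of the constant and the $R-1$ non-baseline dummies. Adding $k$ such covariates gives $R + k$ region-level columns whose rank is still $R$, so $k$ columns have to go. With 15 regions and three regional covariates, that means three columns are dropped; which three is a choice the software makes, not something the data determine.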
George Ford

Join Date: Aug 2014

Posts: 3182
#3

10 Sep 2023, 08:50

The order in which the variables appear matters. If you move the i. term (the factor-variable dummies) to the front, Stata will estimate the dummies and exclude the collinear covariates. Put it at the end, and it will do the opposite.
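The order dependence can be reproduced outside Stata with a toy left-to-right sweep. This is only a sketch of the behaviour described in this thread, not Stata's actual routine: the function `keep_independent`, the variable names, and the eight-row dataset (four regions, two regional covariates) are all made up for illustration. A column is kept only if it is not an exact linear combination of the columns kept before it.

```python
from fractions import Fraction

def keep_independent(columns):
    """Scan (name, values) pairs left to right; keep a column only if it is
    not an exact linear combination of the columns already kept."""
    basis = []            # (pivot_index, reduced_vector) for each kept column
    kept, dropped = [], []
    for name, values in columns:
        v = [Fraction(x) for x in values]
        # Remove the component of v along each previously kept column.
        for pivot, b in basis:
            if v[pivot] != 0:
                factor = v[pivot] / b[pivot]
                v = [vi - factor * bi for vi, bi in zip(v, b)]
        pivot = next((i for i, vi in enumerate(v) if vi != 0), None)
        if pivot is None:
            dropped.append(name)      # nothing left: perfectly collinear
        else:
            basis.append((pivot, v))
            kept.append(name)
    return kept, dropped

# Hypothetical data: 4 regions, 2 individuals each.
region = [1, 1, 2, 2, 3, 3, 4, 4]
gdp_by_region = {1: 10, 2: 12, 3: 9, 4: 15}    # made-up regional covariate
beds_by_region = {1: 2, 2: 5, 3: 3, 4: 4}      # made-up regional covariate

cols = {
    "const": [1] * len(region),
    "income": [3, 5, 2, 7, 4, 1, 6, 8],         # varies within region
    "gdp": [gdp_by_region[r] for r in region],
    "beds": [beds_by_region[r] for r in region],
    "region2": [int(r == 2) for r in region],   # region 1 is the baseline
    "region3": [int(r == 3) for r in region],
    "region4": [int(r == 4) for r in region],
}

def ordered(names):
    return [(n, cols[n]) for n in names]

# Regional covariates first, dummies last: the trailing dummies are dropped.
kept_a, dropped_a = keep_independent(
    ordered(["const", "income", "gdp", "beds", "region2", "region3", "region4"]))

# Dummies first, regional covariates last: the covariates are dropped.
kept_b, dropped_b = keep_independent(
    ordered(["const", "income", "region2", "region3", "region4", "gdp", "beds"]))

print(dropped_a)  # ['region3', 'region4']
print(dropped_b)  # ['gdp', 'beds']
```

In both orderings exactly two columns are dropped, one per regional covariate, and the fitted values would be identical. Only the labelling of the surviving coefficients changes, which is why the coefficients on the retained regional covariates should not be read as if region effects had been held fixed.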
Jeff Cole

Join Date: Sep 2023

Posts: 3
#4

11 Sep 2023, 23:29

Thank you, Jeff and George, for the replies. It makes total sense to me now. This may be a big caveat for those who have been unwittingly enjoying the convenience of having Stata do all the work.
George Ford

Join Date: Aug 2014

Posts: 3182
#5

13 Sep 2023, 08:37

I suspect both Jeff and I figured out how this works by looking at results similar to your own. It's really obvious to me at this point, but only due to experience.