  • Why aren't regional covariates taken out by dummy variables at the same level?

    Hi all,

    I'm analyzing a dataset from a health-related cross-sectional survey conducted in 2019 across 15 administrative regions. The dataset contains variables at two levels of observation: individual and region. I'm trying to analyze how individuals' choice of healthcare provider is shaped by both individual economic conditions and regional economic characteristics. In one of the linear probability models I estimated, I inadvertently included both region dummy variables (to account for regional heterogeneity) and regional-level covariates. Based on my statistical knowledge, including regional dummies and regional covariates in such a context (there is no within-region variation in the regional covariates) should leave the coefficients of the latter unestimated. But I was surprised to find that the model produced estimates for both the dummies and the regional covariates, except that a number of dummies were omitted in addition to the baseline (I later verified that the number of omitted dummies always equals the number of regional covariates specified). I'm having a hard time understanding why this is the case (see the Stata output below). Any help or thoughts on this would be much appreciated. I'm relatively new to survey data analysis but am quite familiar with panel data methods. Am I missing something big here?
    [Image: Screenshot 2023-08-28 231108.jpg]

    Last edited by Jeff Cole; 10 Sep 2023, 03:39.

  • #2
    That happens because when a group of perfectly collinear variables is included, Stata drops the last ones in the group. So it leaves the three variables that have only regional variation and drops three of the regional dummies. It’s a good reminder that we should do the modeling ourselves, and not rely on Stata to do it for us.
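As a short worked note on why these variables form a perfectly collinear group, here is a sketch in LaTeX. The notation is an assumption introduced for illustration and does not come from the original posts: $i$ indexes individuals, $r(i)$ is the region of individual $i$, $D_{ri}$ is the dummy for region $r$, $z_r$ is the value of a regional covariate in region $r$, and region 1 is taken as the baseline.

```latex
% A regional covariate has no within-region variation, so it can be
% written exactly in terms of the constant and the non-baseline dummies:
\[
  z_{r(i)} \;=\; z_1 \cdot 1 \;+\; \sum_{r=2}^{R} \left( z_r - z_1 \right) D_{ri},
  \qquad R = 15 .
\]
% The constant and the R - 1 dummies already span every region-constant
% variable, so K regional covariates add no new dimension.
% Exactly K columns from this group have to be omitted.
```

With the three regional covariates mentioned above, that means three columns must go. Whether they are the covariates or three of the dummies is a software choice, not something the data determine.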


    • #3
      The order in which the variables appear matters. If you move the i. term to the front, Stata will estimate the regional dummies and exclude the collinear regional covariates. Put it at the end, and it will do the opposite.
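The order dependence described here can be mimicked with a small Python sketch. It is not Stata's actual routine: it only illustrates the general idea of scanning columns left to right and omitting any column that is an exact linear combination of the ones already kept. The toy data (4 regions, 2 respondents each) and all names (`keep_independent`, `income`, `gdp`, `beds`, `d2` to `d4`) are hypothetical.

```python
from fractions import Fraction


def keep_independent(columns):
    """Scan (name, values) pairs in order and keep a column only if it is
    not an exact linear combination of the columns kept so far."""
    basis = []  # (pivot index, reduced vector) for each kept column
    kept, dropped = [], []
    for name, values in columns:
        v = [Fraction(x) for x in values]
        # Remove the part of v explained by the columns already kept.
        for pivot, row in basis:
            if v[pivot] != 0:
                factor = v[pivot] / row[pivot]
                v = [a - factor * b for a, b in zip(v, row)]
        pivot = next((i for i, a in enumerate(v) if a != 0), None)
        if pivot is None:
            dropped.append(name)  # nothing left over: perfectly collinear
        else:
            basis.append((pivot, v))
            kept.append(name)
    return kept, dropped


# Hypothetical toy data: 4 regions, 2 respondents per region.
region_of = [1, 1, 2, 2, 3, 3, 4, 4]
gdp_by_region = {1: 10, 2: 12, 3: 15, 4: 11}
beds_by_region = {1: 3, 2: 5, 3: 4, 4: 6}

data = {
    "const": [1] * len(region_of),
    "income": [20, 35, 28, 40, 22, 31, 45, 27],  # individual level
    "gdp": [gdp_by_region[g] for g in region_of],  # regional level
    "beds": [beds_by_region[g] for g in region_of],  # regional level
}
for r in (2, 3, 4):  # region 1 is the baseline
    data[f"d{r}"] = [1 if g == r else 0 for g in region_of]

dummies_first = ["const", "income", "d2", "d3", "d4", "gdp", "beds"]
covariates_first = ["const", "income", "gdp", "beds", "d2", "d3", "d4"]

kept_a, dropped_a = keep_independent([(n, data[n]) for n in dummies_first])
kept_b, dropped_b = keep_independent([(n, data[n]) for n in covariates_first])

print("dummies first:    dropped", dropped_a)
print("covariates first: dropped", dropped_b)
```

In this toy setup, listing the dummies first omits the two regional covariates, while listing the covariates first omits two of the dummies instead. Either way the number of omitted columns equals the number of regional covariates, which matches the pattern reported in the first post.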


      • #4
        Thank you, Jeff and George, for the replies. It makes total sense to me now. This may be a big caveat for those who have been unwittingly enjoying the convenience of having Stata do all the work.


        • #5
          I suspect both Jeff and I figured out how this works by looking at results similar to your own. It's real obvious to me at this point, but only due to experience.
