  • Categorical variables

    Hi

    I would like to start by apologizing, since I am posting a statistical rather than a Stata question on this forum. I believe there are members with strong data skills who can be of great help.

    I would like to get a general picture of an indicator aggregated at some level (region). The indicator is a categorical variable, and we know the average may not give a true reflection of it. What would be the best approach to aggregating this indicator? Can I still use the average?

    Thanks in advance!

    Best,
    Stephen.

  • #2
    I assume that by a categorical variable you mean something like religion: there are distinct categories (Catholic, Protestant, Muslim, etc.) and there is no order.

    In general: no, the mean of a categorical variable is usually meaningless. (What does an "average religion" of 3.887 mean?)

    The exception occurs when your categorical variable has only two categories and is coded 0/1 (e.g. non-religious versus religious). In that case the mean is the proportion of 1s (e.g. the proportion of religious people in a region).
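
    To illustrate the 0/1 case, here is a minimal sketch in plain Python rather than Stata. The data and the variable name religious are made up for the example; the point is only that the mean of a 0/1 variable and the share of 1s are the same number.

```python
# Hypothetical 0/1 indicator: 1 = religious, 0 = non-religious
religious = [1, 0, 1, 1, 0, 1, 0, 1]

# The ordinary mean of the coded values ...
mean_value = sum(religious) / len(religious)

# ... equals the share of observations coded 1
share_of_ones = religious.count(1) / len(religious)

print(mean_value, share_of_ones)
```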

    You can break up a categorical variable into a set of indicator (dummy) variables and compute the means of those. If you then use these regional proportions as explanatory variables in a regression, you will have to leave one of them out, as the last proportion contains no additional information: if you have three categories and you know two proportions, then you also know the third. Adding all three would cause perfect multicollinearity, and Stata will drop one for you. Moreover, you cannot interpret these coefficients in the ceteris paribus way: one proportion cannot go up while all other proportions stay equal. Instead, a coefficient describes the effect of a unit increase in one proportion, where the necessary decrease in the other proportions happens entirely within the reference category.
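
    A rough sketch of the dummy-variable idea, again in plain Python with invented data (the regions, the respondents and the helper regional_proportions are all hypothetical). It computes the mean of each 0/1 indicator within a region and shows that the last proportion is implied by the others.

```python
from collections import defaultdict

# Hypothetical respondents: (region, religion)
respondents = [
    ("North", "Catholic"), ("North", "Catholic"),
    ("North", "Protestant"), ("North", "Muslim"),
    ("South", "Muslim"), ("South", "Muslim"), ("South", "Protestant"),
    ("South", "Catholic"), ("South", "Muslim"),
]
categories = ["Catholic", "Protestant", "Muslim"]


def regional_proportions(rows, cats):
    """Mean of each 0/1 indicator per region, i.e. the share of each category."""
    by_region = defaultdict(list)
    for region, value in rows:
        by_region[region].append(value)
    result = {}
    for region, values in by_region.items():
        result[region] = {
            # 1 if the respondent is in category c, else 0; then average
            c: sum(1 if v == c else 0 for v in values) / len(values)
            for c in cats
        }
    return result


props = regional_proportions(respondents, categories)
for region, p in props.items():
    # Knowing two of the three proportions fixes the third
    implied_muslim = 1 - p["Catholic"] - p["Protestant"]
    print(region, p, "implied Muslim share:", round(implied_muslim, 3))
```

    Because the shares within a region always add up to one, only two of the three carry independent information, which is why one of them has to be left out.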
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    • #3
      Thanks Maarten for your thoughts. Much appreciated.

      Sorry I wasn't clear in my post, but the categorical variable is a Likert-scale variable (Very bad, Bad, Good, Very good). Still, the mean may not be the best summary statistic to use.

      The analysis I want is a spatial map of individuals' perceptions across different areas. I therefore thought of aggregating these perceptions by region and then mapping them to see whether any spatial differences exist.

      Best,
      Stephen.
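
      One possible way to aggregate an ordered (Likert) item by region, sketched in plain Python with invented data. This is not something proposed in the thread itself: it combines the per-category shares discussed in #2 with the median category, which is an assumption on my part about what a sensible ordinal summary could be. The region names, the responses and the helper summarize are all hypothetical.

```python
from collections import Counter, defaultdict

# Response levels, ordered from worst to best
LEVELS = ["Very bad", "Bad", "Good", "Very good"]

# Hypothetical responses: (region, answer)
responses = [
    ("East", "Good"), ("East", "Very good"), ("East", "Bad"),
    ("East", "Good"), ("East", "Good"),
    ("West", "Very bad"), ("West", "Bad"), ("West", "Bad"),
    ("West", "Good"), ("West", "Very bad"),
]


def summarize(rows):
    """Per region: share of each level and the (lower) median level."""
    by_region = defaultdict(list)
    for region, answer in rows:
        by_region[region].append(answer)
    summary = {}
    for region, answers in by_region.items():
        counts = Counter(answers)
        n = len(answers)
        shares = {lvl: counts[lvl] / n for lvl in LEVELS}
        # Median level: first level whose cumulative count reaches half of n
        cumulative = 0
        median_level = None
        for lvl in LEVELS:
            cumulative += counts[lvl]
            if 2 * cumulative >= n:
                median_level = lvl
                break
        summary[region] = {"shares": shares, "median": median_level}
    return summary


summary = summarize(responses)
for region, s in summary.items():
    print(region, s["median"], s["shares"])
```

      Either the median level or the share of a chosen level per region could then be the quantity shown on the map; neither treats the gaps between response levels as equal, which is what the mean would implicitly do.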
