  • Reversing scale for variables like life expectancy, enrollment rates

    Hello wonderful Stata helpers,
    I know this is less Stata-specific and more conceptual, but I would appreciate some clarification if possible.
    I am building a fragility index. In the end I want a single number that represents each country's fragility in a given year, with a high number meaning very fragile and a low number meaning fairly stable. For the most part, the indicators that make up the index follow this pattern. For indicators where a high value means the opposite (more stability), I simply reversed the scale. They were all measured on a scale of 0-4, so when necessary I use the following method:
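    sum varname
    * scale_max is the top of the indicator's nominal scale, e.g. 4 for a 0-4 scale
    local scale_max = 4
    gen varname_rev = (`scale_max' + 1) - varname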

    example code:
    sum v2caviol
    gen v2caviol_rev = 5 - v2caviol
    (Side question: even though the indicator scale was 0-4, some of the actual observations go as low as -2 and as high as 5, I think because of a normalization process. Is this method of reversing still valid, or should I go with the observed max?)

    Main question #1: For an indicator such as school enrollment, the scale is from 0 to 100 percent. Can I still use the same method?
    Main question #2: For an indicator such as life expectancy, there is no predetermined scale, so this method seems completely inappropriate. The observed maximum is 85.3 but I think there's something wrong with subtracting life expectancy from 86.3 - I don't think the indicator would convey the same information as before. So if I can't do this scale reversal, I worry that including this indicator will just add noise, since in general higher life expectancies correlate with less fragility and higher stability. When I aggregate all these indicators into one index, is there a way to include life expectancy that accommodates its meaning?

    Any insights would be very much appreciated!!!

  • #2
    The type of inversion you describe is equally applicable to all of the variables, including life expectancy, in terms of preserving the information in that variable: it is in all cases a bijective mapping. And it doesn't really matter which "maximum value" you subtract from, because all you are really trying to do here is get the variables pointing "in the right direction," and it is the negation that does that.
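
    A minimal sketch of that point, assuming Stata's bundled lifeexp dataset rather than the poster's data (the variable lexp and the constant 100 are purely illustrative): reversing against the observed maximum plus one, reversing against an arbitrary constant, and plain negation should all correlate at -1 with the original and at +1 with each other, because they differ only by an additive constant.

    ```stata
    * Illustrative only: bundled example data, not the fragility indicators
    sysuse lifeexp, clear
    quietly summarize lexp
    * Reverse against the observed maximum + 1
    generate lexp_rev1 = (r(max) + 1) - lexp
    * Reverse against an arbitrary constant
    generate lexp_rev2 = 100 - lexp
    * Plain negation
    generate lexp_neg = -lexp
    * All three reversed versions carry the same information
    correlate lexp lexp_rev1 lexp_rev2 lexp_neg
    ```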

    That said, you are still left with a heterogeneous bunch of variables measured in different units on different scales, from which you are going to try to synthesize an index. You have not indicated what approach you plan to take for that part of things. The common approaches would be principal components analysis, exploratory factor analysis, or confirmatory factor analysis (in decreasing order of my preference). I would make the case that for any of these approaches you should not tamper with the original variables. Let them go in opposite directions. The analysis will deal with that for you, or not. When I say "or not," I mean that, notwithstanding the commonsensical interpretations of these variables, the statistical relationships among them may be such that having one or more of them "pointing the wrong way" gives a better extraction of their commonality. You would see that manifested as loadings with opposite signs in the results. In that case you would then have to decide whether to accept that, or to remove that variable (or those variables) from your index.
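
    As a rough sketch of the principal components route, again assuming the bundled lifeexp dataset and an arbitrary choice of variables (lexp, popgrowth, safewater) that stand in for real fragility indicators: the variables go in unreversed, and the signs of the loadings show which direction each one points relative to the first component.

    ```stata
    * Illustrative only: variables are left in their original directions
    sysuse lifeexp, clear
    * pca works from the correlation matrix by default, so units do not matter
    pca lexp popgrowth safewater, components(1)
    * Score each country on the first component
    predict pc1, score
    ```

    The overall sign of a component is arbitrary, so what matters is whether the loadings agree or disagree in sign with one another, not whether any one of them is positive.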

    I think that the only approach to creating an index that would require you to first align all the variables in the same, common-sense based, direction is if you plan to take a weighted average using arbitrary weights based on your own (or somebody else's) intuitive judgment. Even there, it might be more transparent to leave the variables as they are and put the negation into the weights.
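
    A minimal sketch of the two equivalent ways to do this, assuming the bundled lifeexp dataset, z-scored indicators to put different units on a common scale, and made-up weights of 0.6 and 0.4 (none of these choices come from the original question):

    ```stata
    * Illustrative only: hypothetical weights and stand-in indicators
    sysuse lifeexp, clear
    * z-scores put indicators with different units on a common scale
    egen z_lexp = std(lexp)
    egen z_popg = std(popgrowth)
    * Option A: reverse the variable first, then use a positive weight
    generate z_lexp_rev = -z_lexp
    generate index_a = 0.6*z_lexp_rev + 0.4*z_popg
    * Option B: leave the variable alone and put the negation into the weight
    generate index_b = -0.6*z_lexp + 0.4*z_popg
    * The two indexes agree wherever both are defined
    assert reldif(index_a, index_b) < 1e-6 if !missing(index_a, index_b)
    ```

    Option B keeps every variable in its published direction, so a reader can see from the weights alone which indicators count against fragility.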

    • #3
      Hi Clyde,

      This is only my second Statalist posting and you have helped me on both! Great to hear from you again. I've also seen many of your other helpful comments on others' posts. Thank you for outlining different aggregation techniques. I've been looking into them for a while to find the best approach for our project, but my advisor and I are going with arbitrary weighting, based on his prior research/intuitive judgment on fragility and its components. I am worried about how I'm going to balance the different types of units, but my professor has repeatedly reminded me that we are trying to make a simple proxy and I could take it less seriously.
      Anyway, I really appreciate the clarity you provided on inversion. I was worried that I didn't fully understand the concept and would misuse it, but it really is quite simple. I also appreciate the idea of using negative weights when aggregating instead of inverting the scale. Transparency is always a top priority when creating knowledge! That's interesting and I'll give it a try.
