  • Help with the standardization process

    Dear community,

    I have 3 variables, all measuring something for an extensive set of countries.
    One of them has a scale from 0 to 10. The other 2 are percentage rates. I would like to standardize the latter two to the former scale.
    I have already generated the ratios of each (3 ratios) and generated the mean where one country has more than one observation,
    e.g. an observation on one variable and an observation on another (same year, same country).

    Now I do not know how to adjust the ratios, since I have only read that I need to subtract the mean and divide by the standard deviation.
    Is that possible with the three ratios?

    I hope that makes sense. Thanks in advance.
    Last edited by BEYHAN VURGUN; 16 Apr 2021, 01:00. Reason: standarize

  • #2
    Use -dataex- to present a sample of your data, and explain what you want with reference to these data. At the vague level of generality at which you write, I personally did not understand anything.

    Generally, standardising a variable involves subtracting the mean and dividing by the standard deviation. This gives the variable a mean of 0 and a standard deviation of 1.

    You can check the user-contributed command -center- by Ben Jann, and the -egen- function -std()-.
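
A minimal sketch of the subtract-the-mean, divide-by-the-SD recipe, assuming a made-up variable pcrate (the toy values and the names z_manual and z_egen are only for illustration, not from the original data). It computes the z-score by hand and with the -egen- function -std()-, which should agree.

```stata
* Toy data: pcrate is a hypothetical percentage rate
clear
input float pcrate
12
35
50
78
90
end

* z-score by hand: subtract the mean, divide by the standard deviation
summarize pcrate
generate double z_manual = (pcrate - r(mean)) / r(sd)

* the same calculation with the egen function std()
egen double z_egen = std(pcrate)

* both versions should show mean 0 and standard deviation 1
summarize z_manual z_egen
```

Note that z-scores are not bounded, so this kind of standardisation does not by itself put a variable on a 0 to 10 scale.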


    • #3
      "One of them has a scale from 0 to 10. The other 2 are percentage rates. I would like to standardize the latter two to the former scale."

      I will peel this one off. Suppose pcrate is a percentage rate. Then

      Code:
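      * -summarize, meanonly- shows no output but still leaves r(min) and r(max) behind
      su pcrate, meanonly

      * subtract the minimum, divide by the range, multiply by 10
      gen pcrate2 = 10 * (pcrate - r(min)) / (r(max) - r(min))

      scales its range to [0, 10].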
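
With two percentage rates, the same rescaling could be looped. This is only a sketch: rate_a and rate_b are hypothetical names with made-up values, and the _10 suffix is my own choice.

```stata
* Toy data: two hypothetical percentage rates
clear
input float(rate_a rate_b)
12 3.5
35 20
50 41
78 66
90 88.5
end

* rescale each rate so that its observed minimum maps to 0
* and its observed maximum maps to 10
foreach v of varlist rate_a rate_b {
    summarize `v', meanonly
    generate double `v'_10 = 10 * (`v' - r(min)) / (r(max) - r(min))
}

* alternative, assuming the rate has theoretical bounds 0 and 100:
* map those bounds, not the observed extremes, onto [0, 10]
generate double rate_a_alt = rate_a / 10

* check: each rescaled variable should run from 0 to 10
summarize rate_a_10 rate_b_10 rate_a_alt
```

The min-max version depends on the extremes that happen to be in the sample, so adding or dropping countries can shift every rescaled value. Dividing by 10 avoids that, but only makes sense if 0 and 100 really are the bounds of the rate.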

      Otherwise I don't really follow what you are doing. Calculating a ratio is sometimes exactly what you need for substantive reasons -- as when GDP per head is closer to what you want to measure than GDP -- and sometimes it is just a way to move further away from your problem and produce a new variable that is harder to understand and to use. Your call either way.

      This business of combining measures that are supposedly all measures of the same underlying property is fraught. Whatever you do can be a bad idea.

      If I had, say, 30 measures of the same thing, I would feel obliged to select some as better for my purposes rather than lump them together.

      But you have 3 measures. If I had 3 measures of supposedly the same thing, I wouldn't want to combine them unless a correlation matrix and a scatter plot matrix showed that they really were all proxies for each other. I would otherwise want to keep them separate.
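
A sketch of that check, assuming three hypothetical measures index10, rate_a and rate_b with made-up values. The commands are standard Stata; only the data and names are invented.

```stata
* Toy data: one 0-10 index and two percentage rates (all hypothetical)
clear
input float(index10 rate_a rate_b)
1.5 12 3.5
3 35 20
5.5 50 41
7 78 66
9 90 88.5
end

* correlation matrix, using only observations complete on all three
correlate index10 rate_a rate_b

* pairwise version with counts, useful when many country-years
* have a value on one measure but not on another
pwcorr index10 rate_a rate_b, obs

* scatter plot matrix; half shows only the lower triangle
graph matrix index10 rate_a rate_b, half
```

Correlations are unchanged by rescaling of the kind above, so this check can be run on the raw variables before deciding whether to combine anything.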
