  • Standardizing variables via "center" command

    Hey everyone,

    I came across something peculiar. Apparently, while "center" should create a variable with mean zero and standard deviation one, I had an application where the standard deviation of the standardized variable was two. I standardized the variable manually and got the right result with no problem, but was surprised that the quick shortcut with "center" didn't work. Has anyone encountered that?
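
    For reference, the manual route can be sketched as follows. This is only an illustration on Stata's bundled auto dataset, not the data from the application above; the names mpg_z and mpg_z2 are made up for the example.

    ```stata
    * Load an example dataset shipped with Stata (assumption: stands in for real data)
    sysuse auto, clear

    * Manual route: subtract the mean, then divide by the SD
    quietly summarize mpg
    generate double mpg_z = (mpg - r(mean)) / r(sd)

    * Built-in alternative: egen's std() function
    egen double mpg_z2 = std(mpg)

    * Both new variables should show mean (about) 0 and SD 1
    summarize mpg_z mpg_z2
    ```

    Storing the result as double keeps rounding error in the reported mean and SD small.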

  • #2
    I presume you are talking about center (SSC).

    The help makes clear that by default center only centres [sorry, I am British and prefer English spelling] to mean zero. Unit SD (variance) requires an explicit option choice. In the absence of any evidence, my guess is that this underlies your experience.
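
    A minimal sketch of the difference, assuming center is installed from SSC and that its generate() and standardize options work as its help file describes. The names mpg_c and mpg_z are example names, and the auto dataset stands in for real data.

    ```stata
    * ssc install center    (needed once; center is a user-written command)
    sysuse auto, clear

    * Default: subtract the mean only, so the SD stays that of mpg
    center mpg, generate(mpg_c)

    * With the standardize option: mean 0 and SD 1
    center mpg, standardize generate(mpg_z)

    * Compare the original, the centred and the standardized variable
    summarize mpg mpg_c mpg_z
    ```

    If the SD of a supposedly standardized variable is not 1, checking whether standardize was specified is a sensible first step.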



    • #3
      Oh, that explains it. It said the default is to just standardize -- which I interpreted as standard normal. Thank you for clarifying.



      • #4
        I agree that the repeated wording in the help, "center (or standardize)", with only linguistic variation, is indeed potentially confusing. I'll draw this thread to Ben Jann's attention as author of the program.

        But you're introducing here another confusion. If a variable is standardized to mean 0 and SD 1 that in itself does nothing to make the variable closer to a normal distribution. I can't see that any of the options of center do anything to distribution shape, nor is it the intention of the command to alter that.

        Unfortunately "normalise" is another weasel word: across even statistical science it has a variety of meanings. (No offence to any Stata-using weasels reading this.)



        • #5
          Hah, yes. Regarding the distribution, the more precise way for us to say this is that we can make a variable approximate a standard normal by subtracting the mean and dividing by the sd, but it may not necessarily actually follow a normal distribution -- so perhaps there is some skewness in the tails. But, mathematically we can still construct a random variable to have mean zero and sd of one.



          • #6
            I suppose that you can say that a variable with mean 0 and SD 1 is closer to a normal distribution with mean 0 and SD 1 than it was when it had arbitrary mean and SD.

            But any differences not just in skewness, but also in kurtosis, gaps, spikes, outliers or other aspects of shape remain exactly as they were before changing location and scale.

            So I don't know any sense in which that is a helpful statement, for either an audience who knows less statistics or one who knows more statistics.
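
            The point about shape can be written out for moment-based skewness and kurtosis. This is a sketch under assumed notation, none of which comes from the posts above: $m_k$ is the $k$-th central moment, and $z = a + bx$ with $b > 0$ is any change of location and scale (standardizing is the case $b = 1/s$, $a = -\bar{x}/s$).

            ```latex
            \begin{align*}
            m_k(x) &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k,
            & z_i - \bar{z} &= b\,(x_i - \bar{x})
            \;\Rightarrow\; m_k(z) = b^k\, m_k(x), \\[4pt]
            \text{skewness}(z) &= \frac{m_3(z)}{m_2(z)^{3/2}}
            = \frac{b^3\, m_3(x)}{\bigl(b^2\, m_2(x)\bigr)^{3/2}}
            = \frac{m_3(x)}{m_2(x)^{3/2}} = \text{skewness}(x), \\[4pt]
            \text{kurtosis}(z) &= \frac{m_4(z)}{m_2(z)^{2}}
            = \frac{b^4\, m_4(x)}{\bigl(b^2\, m_2(x)\bigr)^{2}}
            = \frac{m_4(x)}{m_2(x)^{2}} = \text{kurtosis}(x).
            \end{align*}
            ```

            The factors of $b$ cancel and $a$ never enters, so neither measure changes when a variable is centred or standardized.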



            • #7
              An update of the center command is now available from the SSC Archive. To install the update, type
              Code:
              ssc install center, replace
              or use the adoupdate command.

              To avoid confusion, the wording in the help file is now "... center (or, optionally, standardize) ...". Further changes are as follows.
              • iweights and pweights are now allowed in addition to aweights and fweights
              • center now displays a note about the variables that have been generated or modified
              • addlabel("") can now be used to suppress the default label suffix
              • replace did not work as advertised if inplace was specified; this is fixed

