  • "Mcenter" function yields mean other than 0

    Hi there,

    I have repeatedly tried to center a variable at its mean, using the "mcenter" function (https://econpapers.repec.org/softwar...de/s445601.htm).

    This centered variable should have a mean of 0, but somehow it does not.

    I first recoded my variable (age) so that it is categorised by decades:

    . sum age

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             age |     69,260    42.58625    16.27877         16        103

    .
    . recode age (16/19 = 1 "16-19") (20/29 = 2 "20-29") (30/39 = 3 "30-39") (40/49 = 4 "40-49") (50/59 = 5 "50-59") (60/69 = 6 "60-69") (70/79 = 7 "70-79") (80/89 = 8 "80-89") (90/99 = 9 "90-99") (100/103 = 10 "100-103"), generate(agedec)
    (69260 differences between age and agedec)


    I then wanted to center the variable "agedec" at its mean, creating a new variable "C_agedec" with a mean of 0. However, as you can see below, the mean is not exactly 0.
    .
    . mcenter agedec

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
        C_agedec |     69,260    1.60e-09    1.656345  -2.813947   6.186052



    Is this simply a rounding error or something more serious? I don't understand why this function (mcenter) is not doing exactly what it is supposed to do.


    Many thanks in advance to anyone who may know what is going on here.

    Last edited by Linda Iva; 24 Sep 2020, 03:37.

  • #2
    It is a rounding off error. 1.60e-09 is a very small number.
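
    One way to check this, as a minimal sketch: Stata's default storage type is float, which holds only about 7 significant digits, so a centered variable stored as float will typically have a mean that is tiny but not exactly 0. Assuming mcenter stored C_agedec as float (an assumption, not confirmed in this thread), centering by hand in double precision should shrink the leftover mean considerably. The name C_agedec_d is made up for illustration.

    Code:
    * center by hand, storing the result in double precision
    summarize agedec, meanonly
    generate double C_agedec_d = agedec - r(mean)
    summarize C_agedec_d

    Even then the mean need not print as exactly 0, because floating-point sums still round.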



    • #3
      Some miscellaneous footnotes to @Joro Kolev's answer to your question.

      mcenter is a command, not a function.

      Note that

      Code:
      gen decade = floor(age/10)
      would get you there a little more directly.
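
      As a quick check, under the assumption that age holds only whole numbers from 16 to 103 as in the summary in #1, this shortcut should reproduce the recoded agedec. The name decade_chk is made up for illustration:

      Code:
      gen decade_chk = floor(age/10)
      * stops with an error if any observation differs from the recode result
      assert decade_chk == agedec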

      I don't understand the rationale for centring decade on its mean. You have degraded your ages to an ordered categorical variable that does make some sense directly. Suppose the mean decade is 3.764321 or whatever. How do new values of -2.764321, -1.764321 and so on help either description or model fitting or interpretation?

      Dependence of many things on age is often highly nonlinear and people don't agree readily on how to manage it, but splines and fractional polynomials have their fans.



      • #4
        Originally posted by Nick Cox:
        You have degraded your ages to an ordered categorical variable

        A footnote to Nick's footnote:

        I often use age in decades, as getting one year older is usually too small a change to be meaningful (for adults, of course). However, I do so without categorizing age:
        Code:
        gen decade = age/10
        I also often center age, but not at the mean. The mean in a sample obviously differs from sample to sample, making it harder to compare surveys. More importantly, the mean age in a survey is often very far removed from the mean age in the population. Children are often excluded by design. The elderly are often underrepresented because they either can't participate for health reasons or are afraid to participate. These biases differ from survey to survey, e.g. because the surveys used different ways of collecting data, which makes comparing surveys even harder. So I often find that the mean age in a survey is pretty much a meaningless number. Instead I usually center at some number that is meaningful for my application, e.g. 30 or 40.

        So my code very often includes the line
        Code:
        gen agec = (age-40)/10
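
        To illustrate why that helps, a minimal sketch with a hypothetical outcome variable y (not from this thread):

        Code:
        regress y agec
        * _cons is the predicted y at age 40 (agec == 0)
        * the coefficient on agec is the change in y per 10 years of age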
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------



        • #5
          Many thanks to all of you, that was all very insightful and helpful!

          In response to Nick Cox: I am trying to replicate a piece of research with newer data and want to stick to the original methods to keep the results comparable. Maarten Buis' answer is really helpful for understanding why this was done in the first place.



          • #6
            Thanks for the story.
