  • Rounding with a macro

    Hi,

    To display a number rounded to a user-selected number of decimal places, I wrote some code that gave a surprising answer.

    Attempt 1

    local x = 0.20629048
    local dp = 4
    local x_round = round(`x', 10^-`dp')
    display "this is x `x' this is x_round `x_round'"

    The final line results in the following unexpected output - this is x .20629048 this is x_round .2063000000000001


    From the following you would expect x_round above to be calculated as .2063

    local x = 0.20629048
    local dp = 4
    display 10^-`dp'
    display round(`x', .0001)


    However, I can get around the problem by creating a new macro that evaluates the number of decimal places

    Attempt 2

    local x = 0.20629048
    local dp = 4
    local dp_value = 10^-`dp'
    local x_round = round(`x', `dp_value')
    display "this is x `x' this is x_round `x_round'"

    The final line results in the following correct output - this is x .20629048 this is x_round .2063

    Why does Attempt 1 fail?

    Thanks for taking the time to consider this,

    Don Vicendese

  • #2
    There are two separate issues here, only one of which you have explicitly talked about in your post.

    You have to distinguish between the value of a number in Stata's memory, and what you see when you -display- or -list- or -browse-, etc. They are not the same thing, and you control them in different ways.

    So if you want to display a number to four decimal places:
    Code:
    local x = 0.20629048 
    local dp = 4
    local fmt %`=`dp'+1'.`dp'f
    
    display `fmt' `x'
    will do that. If you want the number to show up that way in -list- and -browse- output you can do that with
    Code:
    format `fmt' X
    where X is the name of the variable in question, and local macro `fmt' is defined as in the first code block.

    The above is all about display formatting, and it controls how numbers look in Stata output of various kinds. Read -help format- for more information. Important: the display formatting does nothing to change the value that Stata holds in memory. If you follow -display `fmt' `x'- by -display `x'- you will get the same annoying results you got before.

    Display formatting has nothing to do with the actual precision with which Stata holds the number in memory. You can round that number any way you like and you will never get Stata to have a number in memory that equals 0.2063. Why not? Because Stata uses finite-precision binary representations and there is no finite binary number that equals 0.2063 decimal. Just as there is no finite decimal number that equals 1/3, and we make do with approximations like 0.3333333333, when Stata tries to store any decimal number in memory, it uses the closest binary number that will fit within the number of bits available in the storage type. (Doubles and longs allow more bits of precision than floats.) But unless the decimal number happens to be an integer multiple of a (negative) power of 2, such as 0.5 or 0.75, the representation will always be just an approximation. Even at float precision, the approximation is good enough for most purposes. But exact equality is unobtainable.

    See -help precision- for more.
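
    As a minimal sketch of the difference between the stored value and its default display (the scalar name r is only for illustration), you can hold the rounded result in a scalar and then ask for progressively more revealing formats:

    ```stata
    * Sketch: the value held in memory versus what -display- shows by default.
    * A scalar keeps the result as a double; a local macro would store text.
    scalar r = round(0.20629048, 0.0001)

    display r              // default format: looks like a clean decimal
    display %21.18f r      // more decimal places expose the approximation
    display %21x r         // hexadecimal format shows the binary value held
    ```

    The first line of output looks tidy only because of the default format; the other two show the double that Stata actually holds.
    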



    • #3
      You're confusing two subtly different operations.

      Rounding to so many decimal places is a string operation in which you decide on a display format. You can use display for this purpose and you can put the result in a local macro for later use, as in

      Code:
      . di %5.4f  0.20629048
      0.2063
      
      . local wanted : di %5.4f 0.20629048
      
      . di "`wanted'"
      0.2063
      Rounding to the nearest multiple of a power of 10 is a numeric operation. round() does its best, but it can't escape the fact that computers use binary, so whatever is shown in decimal terms is at best an approximation. Thus 0.2063, like almost all multiples of 0.0001, is not a number that can be held exactly in binary. It's instructive to work out how many, or more precisely how few, of the 10,000 multiples of 0.0001 from 0.0000 to 0.9999 can be held exactly in binary.
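
      For anyone who wants to check that count, here is a rough sketch using two approaches (the macro names are arbitrary). The first relies on arithmetic: i/10000 is exact in binary only if its reduced denominator is a power of 2, and since 10000 = 2^4 * 5^4 that means i must be a multiple of 625. The second is an empirical cross-check, assuming that a value which survives conversion to float unchanged is one of the exact cases.

      ```stata
      * Sketch: count multiples of 0.0001 in [0, 0.9999] that binary holds exactly.
      local by_arith = 0
      local by_float = 0
      forvalues i = 0/9999 {
          * arithmetic test: reduced denominator must be a power of 2
          if mod(`i', 625) == 0 {
              local ++by_arith
          }
          * empirical test: value is unchanged by rounding to float precision
          if float(`i'/10000) == `i'/10000 {
              local ++by_float
          }
      }
      display "by arithmetic: `by_arith'   by float() check: `by_float'"
      ```

      The arithmetic count is 16 (i = 0, 625, 1250, ..., 9375), and the float() check should agree. So only 16 of the 10,000 values are exact.
      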

      That your "solution" appears to be a solution, but is really no better, follows from the fact that display is just using a default format in your best interests. It's not an honest witness. (And, more generally, it's default formats that prevent us from seeing this fact of binary approximations much more frequently.) Even if we assume the solution and offer up 0.2063 for display, the default display appears to confirm that we have what we want, but asking for more decimal places confirms that we don't really have 0.2063 followed by arbitrarily many zeros, which is quite impossible:

      Code:
      . display 0.2063
      .2063
      
      . display %23.18f  0.2063
         0.206300000000000011
      Probably hundreds if not thousands of Stata users have been bemused by this, because rather few of us have ever really tackled what happens at machine level, and it's a tribute to computer pioneers that we usually don't have to.

      For much, much more, see resources shown by

      Code:
      search precision
      especially William Gould's blog posts.



      EDIT: If he's Clyde, I must be Bonnie.


      • #4
        Thanks Clyde for the answer.

        When I put your suggestion within another string - such as a title for a graph - I get unwanted results as shown in the last two displays

        local dp = 4
        local fmt %`=`dp'+1'.`dp'f
        local x_fmt = "`fmt' `x'"

        display `x_fmt'
        display " this is `x_fmt' "
        display " this is `fmt' `x' "

        These are the outputs

        . display `x_fmt'
        0.2063

        . display " this is `x_fmt' "
        this is %5.4f .20629048

        . display " this is `fmt' `x' "
        this is %5.4f .20629048

        How do I get the correct display within say a graph title command for example?

        title("Chi-square Distribution (df = `df'), Shaded Area = `x'")


        Also - is there a function in Stata that tells me how many numbers there are after the decimal point?

        Thanks again,

        Don



        • #5
          is there a function in Stata that tells me how many numbers there are after the decimal point?
          No, and that follows from the answers you have already, underlining that at bottom, all calculations use binary arithmetic.

          No such function can exist in useful form. The question presupposes that numbers have unequivocal decimal representations. That is true of integers that can be held in Stata, but naturally you don't care about integers, as you know the answer already: nothing follows the decimal point for integers.

          Otherwise it is true only of numbers whose fractional parts are exact multiples of negative powers of 2, and of nothing else. The parts in question are .5 .25 .75 .125 .375 .625 .875 and so on, and relatively speaking they are (very) unusual. So the idea fails even for .1 .2 .3 .4 .6 .7 .8 .9 and so on. (Naturally there are limits on what can be held within finite, meaning all, storage types, too.)
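
          A quick way to see the two groups apart, offered as a sketch rather than a general test: compare each value with itself rounded to float precision. Values built from negative powers of 2 come through unchanged; the others do not.

          ```stata
          * Sketch: which simple decimals are held exactly in binary?
          * float(v) == v is true when v needs no more bits than a float has.
          foreach v in 0.5 0.25 0.75 0.125 0.1 0.2 0.3 {
              display "`v'" _col(10) (float(`v') == `v')
          }
          ```

          This shows 1 (true) for 0.5, 0.25, 0.75 and 0.125, and 0 (false) for 0.1, 0.2 and 0.3.
          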



          • #6
            How do I get the correct display within say a graph title command for example?

            title("Chi-square Distribution (df = `df'), Shaded Area = `x'")
            Code:
            title("Chi-square Distribution (df = `df'), Shaded Area = `:display %5.4f `x''")
            // OR, MORE GENERALLY,
            
            title("Chi-square Distribution (df = `df'), Shaded Area = `:display `fmt' `x''")
            // WHERE LOCAL MACRO fmt IS DEFINED AS PREVIOUSLY
            should do the trick.



            • #7
              Great - thanks very much Clyde and Nick. That is completely clear and very useful.

              My question about the numbers after the decimal place was imprecise.
              Some of the `x' values will actually be user-entered with fewer than 4 decimal places, and I didn't want to display 4 decimal places on those occasions.

              I realised how I could create a string variable from the entered number, count the numbers after the decimal point, and, if number of decimal places <= 4, remove any trailing zeroes and then temporarily change the formatting accordingly.
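
              For the record, a rough sketch of that idea, working on the entry as text. The macro names entered, point, ndec, dp and shown are hypothetical, and I have assumed the entry is a plain decimal like 0.2060 with no exponent or thousands separator.

              ```stata
              * Sketch: count decimals in a user-entered value kept as text.
              local entered "0.2060"     // hypothetical raw input

              * drop trailing zeros after the decimal point, then a dangling point
              if strpos("`entered'", ".") > 0 {
                  while substr("`entered'", -1, 1) == "0" {
                      local entered = substr("`entered'", 1, strlen("`entered'") - 1)
                  }
                  if substr("`entered'", -1, 1) == "." {
                      local entered = substr("`entered'", 1, strlen("`entered'") - 1)
                  }
              }

              * digits after the point (0 if there is none), capped at 4
              local point = strpos("`entered'", ".")
              local ndec = cond(`point' > 0, strlen("`entered'") - `point', 0)
              local dp = min(`ndec', 4)

              * build the format; width dp+2 leaves room for the leading "0."
              local fmt %`=`dp'+2'.`dp'f
              local shown : display `fmt' `entered'
              display "decimals: `ndec'   format: `fmt'   shown: `shown'"
              ```

              With the entry 0.2060 this strips one trailing zero, counts 3 decimals and uses %5.3f. An entry with more than 4 decimals is capped at 4.
              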

              Many regards,

              Don
