
  • New program round_exact: exact decimal rounding to address limitations of round()

    Dear Statalist members,

    I’ve written a small program, round_exact (now available on SSC), for cases where decimal‑exact rounding is required and Stata’s round() produces counter‑intuitive results due to binary floating‑point arithmetic. For example,

    . display round(2.675, .01)
    2.67

    . round_exact 2.675, d(2)
    2.68

    . display round(1.005, .01)
    1

    . round_exact 1.005, d(2)
    1.01

    As has long been noted on Statalist (e.g., by Nick Cox and others), this behavior of round() is not a bug but a consequence of binary arithmetic; exact decimal rounding is fundamentally unattainable absent decimal arithmetic (or, rhetorically, a decimal‑based computer). The aim here is not to “fix” floating‑point arithmetic, but to provide stable, reporting‑oriented decimal rounding via decimal scaling.
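To make the binary-arithmetic point concrete, here is a minimal sketch in Python rather than Stata (both store these literals as IEEE 754 doubles). It only prints the exact values that are stored for 2.675 and 1.005; it says nothing about how round_exact works internally.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary double exactly, with no rounding,
# so it reveals the value the machine actually holds for each literal.
for literal in (2.675, 1.005):
    print(literal, "is stored as", Decimal(literal))

# Both stored values sit slightly below the decimal literal that was typed,
# which is why rounding to two places can go down instead of up.
below_2675 = Decimal(2.675) < Decimal("2.675")
below_1005 = Decimal(1.005) < Decimal("1.005")
print(below_2675, below_1005)
```

Since neither value is an exact tie in binary, a rounding function that goes down here is behaving correctly for the number it was given.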

    Suggestions, critiques, and improvement ideas are very welcome.

    Best regards,
    Anne Shi
    Last edited by Anne Shi; 10 Apr 2026, 11:48.

  • #2
    The program round_exact has been revised and updated to version 2.7.2. This update automatically recasts variables to double precision before processing to maximize numerical accuracy. It also correctly reports the number of observations replaced during execution. To install, type:

    ssc install round_exact

    It has long been believed that exact decimal rounding cannot be achieved on a binary computer. This small program aims to do just that. I am sure there are many improvements still to be made, and feedback, suggestions, and critiques are greatly appreciated.

    Anne Shi



    • #3
      I think there are a few misconceptions here that might be worth dispelling.

      First, “upcasting” a float to a double doesn’t improve precision. If a number is stored as a float, any information beyond this storage type's precision was lost, and conversion to double precision only allows for extra precision on that error. In other words, lost precision can’t be magically gained going from float to double. If one were to change Stata’s default type to double (-set type-), the apparent rounding error complaints that are solved by this program are mitigated to at least two decimal places.
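The first point can be illustrated with a small sketch, written in Python as a stand-in for Stata's float and double storage types. The helper name to_float32 is made up for this illustration; it rounds a double to single precision and widens it back, which is roughly what happens when a float variable is recast to double.

```python
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, then widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

as_double = 0.1                       # 0.1 held in double precision from the start
as_float_then_double = to_float32(0.1)  # 0.1 held as a float, then "upcast"

# The upcast value keeps the single-precision error; widening does not undo it.
lost = as_float_then_double != as_double
error = abs(as_float_then_double - as_double)
print(lost, error)
```

Values that are exactly representable in single precision, such as 0.5, survive the round trip unchanged; the error only shows up for values that were already rounded when stored as float.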

      Second, I'm not sure what you are referencing by stating "It has long been believed that exact decimal rounding cannot be achieved on a binary computer." As a statement of fact, this is plainly false. There are many industries that are very intensely concerned with maintaining at least some level of decimal precision for their electronically stored data (e.g., finance). Moreover, many other languages have ways to implement exact decimal representations. For one example, the decimal class in Python is built for exactly this purpose.
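As a brief sketch of what the decimal class offers: when values enter as decimal strings, arithmetic and half-up rounding are exact in base 10. The function name round_half_up is hypothetical and chosen only for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(text: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimals, with ties going away from zero."""
    quantum = Decimal(1).scaleb(-places)  # 10**-places, e.g. 0.01 for places=2
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up("2.675", 2))
print(round_half_up("1.005", 2))

# Decimal sums are exact where the binary sums are not.
exact_sum = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
binary_sum = 0.1 + 0.2 == 0.3
print(exact_sum, binary_sum)
```

The exactness depends on the input being a string (or integer); a value that has already been stored as a binary float carries its representation error into Decimal.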

      Third, many Stata routines (and Mata functions) internally work in quad-precision (2x double precision) so that these floating-point errors do not compound into meaningful errors. The problem of rounding in this specific context is mainly, I think, an issue of manipulating raw data or manipulating reported results for ease of presentation.



      • #4
        To further elaborate on Leonardo Guizzetti's first and second points: Indeed, floating-point arithmetic is very well understood, so the relevant limitations are not a matter of belief but follow directly from how numbers are represented under the widely accepted IEEE 754 standard. Concerning float vs. double recasting, Bill Gould (StataCorp) discusses the limits of what information can and cannot be recovered in The Penultimate Guide to Precision. A program that rounds floats and then recasts to double should issue a warning, and perhaps even return an error, when these limits are encountered.



        • #5
          The help file states that round_exact
          [...] ensures that the final decimal result matches the user's expectation for exactness, enabling successful assert checks
          Here's a counter-example:
          Code:
          . round_exact 0.104999994 , d(2)
          .11
          
          . assert r(val) == 0.10 // <- users' (i.e. my) expectation
          assertion is false
          r(9);
          Last edited by daniel klein; 11 Apr 2026, 15:09. Reason: add comment clarifying users' (i.e. my) expectation



          • #6
            Thank you so much, Daniel! Greatly appreciated. I will continue trying.



            • #7
              Stay tuned for a new version.



              • #8
                @Leonardo: Thank you for correcting my misconception. Greatly appreciated. Based on numerous discussions in the Stata Journal, my understanding was that exact decimal rounding is generally not possible with Stata on binary computers. Now that I see it is achieved in Python, I believe we can achieve it with Stata. Thank you for the tip.



                • #9
                  @Leonardo; @daniel:

                  A quick note first: I’ve made further revisions and submitted version 3.0 of round_exact to SSC earlier today.

                  Thanks to both of you for the thoughtful comments — much appreciated. I agree that recasting a float to double cannot recover precision that was never stored; in round_exact, recasting is intended only to prevent further loss during subsequent arithmetic, not to reconstruct information. I also agree that the limits of floating‑point arithmetic are well understood, not a matter of belief, and that exact decimal intent can be preserved when explicit rules or representations are used (for example, by performing decimal arithmetic externally via Python’s decimal class and passing results back to Stata through PyStata).
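A rough sketch of the "external decimal arithmetic" idea, under one explicit rule that is an assumption of this example and not taken from round_exact: a double is read as its shortest round-trip decimal string (Python's repr) before rounding half-up. The function name decimal_round is hypothetical. Inside Stata, such a function would presumably live in a python: block and exchange values through the sfi module; that wiring is omitted because it only runs within Stata.

```python
from decimal import Decimal, ROUND_HALF_UP

def decimal_round(value: float, places: int) -> float:
    """Round a finite double half-up, reading it as its shortest round-trip decimal.

    Assumed rule for illustration only: repr(value) is taken as the decimal
    the user meant. Non-finite inputs are not handled in this sketch.
    """
    as_text = repr(value)                 # e.g. the double nearest 2.675 -> '2.675'
    quantum = Decimal(1).scaleb(-places)  # 10**-places
    rounded = Decimal(as_text).quantize(quantum, rounding=ROUND_HALF_UP)
    return float(rounded)                 # back to a double for storage

print(decimal_round(2.675, 2))
print(decimal_round(1.005, 2))
print(decimal_round(0.104999994, 2))
```

This rule works for doubles typed as short literals. It cannot tell what was intended for a value that was stored as a float first, because several decimals share one float representation.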

                  Regarding the counter‑example (round_exact 0.104999994, d(2)), that is a genuine boundary case. Version 2.7.2 handled such values too strictly. In the newly submitted version 3.0, the numeric rounding logic has been refined with storage‑aware, conditional half‑case handling, and this example now rounds to 0.10 as expected.
                  Last edited by Anne Shi; 12 Apr 2026, 12:05.



                  • #10
                    From what I can tell, the latest version of round_exact seems quite elaborate and robust.

                    However, because multiple decimal values can map to the same binary floating-point representation, it remains fundamentally impossible to guarantee exact decimal rounding for arbitrary floating-point inputs. In that sense, the command implements a rule-based approximation to align results with typical decimal expectations. The help file should state this more explicitly, including a clear description of the underlying rule.
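The many-to-one mapping can be checked directly. The sketch below, in Python with a made-up helper to_float32, takes the value from the earlier counter-example and the literal 0.105 and compares them in single and double precision. It illustrates the general ambiguity and makes no claim about which rule round_exact applies.

```python
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, then widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a, b = 0.104999994, 0.105

# As doubles the two literals are different numbers ...
same_as_double = a == b
# ... but in single precision they collapse to one stored value.
same_as_float = to_float32(a) == to_float32(b)
print(same_as_double, same_as_float)
```

Given only the stored float, a program cannot know whether the user meant 0.104999994 (which rounds to 0.10) or 0.105 (which rounds to 0.11 under half-up), so any answer reflects a chosen rule.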

                    Relatedly, the documented formula, round(value*10^d)/10^d, should be removed from the help file since it’s not actually used in the code and thus misleading.
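For context, here is a sketch of what that formula yields when evaluated naively in double precision. It is written in Python, with floor(x + 0.5) standing in for rounding a positive value to the nearest integer; this is an assumption of the example and not a reproduction of Stata's round(). The name naive_round is hypothetical.

```python
import math

def naive_round(value: float, d: int) -> float:
    """Evaluate round(value*10^d)/10^d in double arithmetic (positive values only)."""
    scaled = value * 10 ** d
    return math.floor(scaled + 0.5) / 10 ** d

# The scaled product lands just below the tie, so the result rounds down.
print(1.005 * 100 < 100.5)
print(naive_round(1.005, 2))

# A value whose scaled product is exact in binary behaves as expected.
print(naive_round(0.125, 2))
```

Because the formula by itself does not give 1.01 for 1.005, documenting it as the method would not match the results the command advertises.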



                    • #11
                      Thank you, Daniel. This is very helpful. I will revise the help file as you suggested.
                      Yes, because Stata’s numeric system is based on binary storage, the revised command still fails to anticipate some extreme edge cases. I tried to apply some Mata logic to it and failed, too -- I have to admit I am not skilled with Mata in the first place.
                      Do you think it's possible to achieve in Stata/Mata what Python's decimal class has achieved?
                      Last edited by Anne Shi; 14 Apr 2026, 07:48.



                      • #12
                        Do I think it's possible to implement a Python-Decimal-like class in Mata? Probably yes. Should we try to do that? Probably not. Even if we managed to get all edge cases right and achieve reasonable performance, full integration would ultimately require changes at the data storage layer deep within the Stata executable. That’s something only StataCorp can provide.



                        • #13
                          Thank you, @Daniel.
                          Just to let you know, I have updated the help file and submitted a new package. The command remains unchanged. Your thoughts and help are greatly appreciated.
