
  • Rounding in stata or other issues

    Hi

    I am not sure why Stata does not seem to read the number 0.001 as such. In the dataset below, the variable myvar contains the value 0.001, but when I ask Stata to drop observations with this value, nothing is dropped.
    Is there a rounding issue that makes Stata read this as slightly more than 0.001, for example 0.001000222? The data were in CSV format and then converted to Stata format.
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float myvar
    0
    0
    0
    0
    0
    0
    -18.807
    0
    0
    0
    0
    0
    0
    0
    0
    0
    0
    0
    .002
    0
    0
    0
    0
    0
    0
    -.217
    -.042
    0
    -1098
    -25623
    0
    0
    0
    .001
    0
    0
    end

    . drop if myvar==0.001
    (0 observations deleted) // nothing is deleted


  • #2
    What you see has been much aired and explained. In essence, it is a side-effect of the fact that computers work in binary, not decimal, so numbers such as 0.001 cannot be held as such but only through binary approximations. In your case, you need to compare with the float approximation to 0.001, as here:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float myvar
     -25623
      -1098
    -18.807
      -.217
      -.042
          0
       .001
       .002
    end

    . drop if myvar == float(0.001)

    . list
    
         +---------+
         |   myvar |
         |---------|
      1. |  -25623 |
      2. |   -1098 |
      3. | -18.807 |
      4. |   -.217 |
      5. |   -.042 |
         |---------|
      6. |       0 |
      7. |    .002 |
         +---------+
    For much more detail, search this forum for mentions of rounding and precision, or consider this reading list, which fortunately is repetitive, so many people will not need to go beyond the initial blog posts:

    Blog . . . . . . . . . . . . . . . . . . The penultimate guide to precision
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    4/12 http://blog.stata.com/2012/04/02/the-penultimate-
    guide-to-precision/

    Blog . . . . . . . . . . . . . . . . . . . . Precision (yet again), part II
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    6/11 http://blog.stata.com/2011/06/23/pre...again-part-ii/

    Blog . . . . . . . . . . . . . . . . . . . . Precision (yet again), part I
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    6/11 http://blog.stata.com/2011/06/17/pre...-again-part-i/

    Blog . . . . . . . . . . . . . . . . . How to read the %21x format, part 2
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    2/11 http://blog.stata.com/2011/02/10/
    how-to-read-the-percent-21x-format-part-2/

    FAQ . . . . . . . . . . . . . . . . . . . Results of the mod(x,y) function
    . . . . . . . . . . . . . . . . . . . . . N. J. Cox and T. J. Steichen
    4/15 Why does the mod(x,y) function sometimes give
    puzzling results?
    Why is mod(0.3,0.1) not equal to 0?
    http://www.stata.com/support/faqs/data-management/
    mod-function/

    FAQ . . . . . . . . . Comparing floating-point values (the float function)
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. Wernow
    4/15 Why can't I compare two values that I know are equal?
    http://www.stata.com/support/faqs/data-management/
    comparing-floating-point-values/

    FAQ . . . . . . . . . . . . . . . . . The accuracy of the float data type
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    5/01 How many significant digits are there in a float?
    http://www.stata.com/support/faqs/data-management/
    float-data-type/

    FAQ . . . . . . . . . Why am I losing precision with large whole numbers?
    . . . . . . . . . . . . . . . . . . UCLA Academic Technology Services
    7/08 https://stats.idre.ucla.edu/stata/faq/why-am-i-losing-
    precision-with-large-whole-numbers-such-as-an-id-variable/


    SJ-8-2 pr0038 Mata Matters: Overflow, underflow & IEEE floating-point format
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. M. Linhart
    Q2/08 SJ 8(2):255--268 (no commands)
    focuses on underflow and overflow and details of how
    floating-point numbers are stored in the IEEE 754
    floating-point standard

    SJ-6-4 pr0025 . . . . . . . . . . . . . . . . . . . Mata matters: Precision
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
    Q4/06 SJ 6(4):550--560 (no commands)
    looks at programming implications of the floating-point,
    base-2 encoding that modern computers use

    SJ-6-2 dm0022 . Tip 33: Sweet sixteen: Hexadec. formats & precision problems
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
    Q2/06 SJ 6(2):282--283 (no commands)
    tip for using hexadecimal formats to understand precision
    problems in Stata
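
    To see the mismatch directly, one option is to display both approximations of 0.001 in hexadecimal and then test the comparison on a one-observation dataset. This is a minimal sketch using only built-in display formats and the float() function; the tiny dataset is made up for illustration.

    ```stata
    * 0.001 as a double (Stata evaluates literals in double precision)
    display %21x 0.001
    * 0.001 rounded to float precision, which is what a float variable holds
    display %21x float(0.001)
    * the same float value written out with many decimal places
    display %20.18f float(0.001)

    * a one-observation float variable holding .001
    clear
    input float myvar
    .001
    end
    * the stored float differs from the double literal, so == fails
    assert myvar != 0.001
    * rounding the literal to float precision makes both sides identical
    assert myvar == float(0.001)
    ```

    The two %21x lines should show different digits, which is why a bare literal never matches the stored value.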



    • #3

      The new command round_exact (available on SSC) will solve this issue. Try this:

      clear
      input double price
      0.3
      0.105
      2.675
      1.005
      0.005
      end

      round_exact price, d(2) generate(price_r)
      assert price_r[1] == 0.30
      assert price_r[2] == 0.11
      assert price_r[3] == 2.68
      assert price_r[4] == 1.01
      assert price_r[5] == 0.01


      @Mike, In your case:


      clear
      input float myvar
      0
      0
      0
      0
      0
      0
      -18.807
      0
      0
      0
      0
      0
      0
      0
      0
      0
      0
      0
      .002
      0
      0
      0
      0
      0
      0
      -.217
      -.042
      0
      -1098
      -25623
      0
      0
      0
      .001
      0
      0
      end


      . recast double myvar

      . round_exact myvar, d(3) replace
      Variable myvar updated. (36 real change(s) made)

      .
      . drop if myvar == .001
      (1 observation deleted)

      FYI: I will fix the command so that it recasts the variable to double itself, making a separate recast unnecessary. I will also fix an error in the reported number of changes. The updated command will be submitted to SSC early next week.
      Last edited by Anne Shi; 10 Apr 2026, 12:51.
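
      For readers who would rather not change the stored values, a comparison within a tolerance is another option that needs only built-in functions. The sketch below is an illustration, not part of round_exact; the tolerance of 1e-6 and the scalar name tol are assumptions, and the tolerance should be chosen well below the spacing of the values in your own data.

      ```stata
      clear
      input float myvar
      0
      .001
      .002
      -.042
      end

      * assumed tolerance: far smaller than the gap between .001 and .002,
      * far larger than the float representation error of .001
      scalar tol = 1e-6

      * drop observations that equal 0.001 up to the tolerance
      drop if abs(myvar - 0.001) < scalar(tol)

      * three observations remain and none is within tolerance of 0.001
      assert _N == 3
      count if abs(myvar - 0.001) < scalar(tol)
      assert r(N) == 0
      ```

      Missing values are not dropped by this condition, because a missing difference is treated as larger than any number.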



      • #4
        The program round_exact has been revised and updated to version 2.7.2. This update automatically recasts variables to double precision before processing to maximize numerical accuracy. It also correctly reports the number of observations replaced during execution. To install, type:

        ssc install round_exact

        It has long been believed that exact decimal rounding cannot be achieved on a binary computer. This small program aims to do just that. I am sure there are many improvements still to be made, and feedback, suggestions, and critiques are greatly appreciated.

        Anne Shi
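
        As a hedged aside on what exactness can mean here: most decimal fractions are approximated in double precision as well, so a rounded result matches a literal such as 0.30 when both sides are the same double approximation. The sketch below uses built-in functions only and makes no claim about how round_exact works internally.

        ```stata
        * decimal fractions are approximated in double precision too
        display %21x 0.3
        display %21x 0.1+0.2

        * the sum and the literal are different doubles
        assert 0.1+0.2 != 0.3

        * a relative-difference check within a small tolerance treats them as equal
        assert reldif(0.1+0.2, 0.3) < 1e-12
        ```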

