Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Stata Decimal Precision Problem

    I have a csv file containing numeric results with a mixture of decimal precision (e.g. 23 47.1, 2.35)

    When I import to Stata the integers are fine but the with the decimals the decimal precision changes thus: 5.1 becomes 5.0999999 and 0.95 becomes .94999999

    The display format is fine. i.e. when I view the Stata dataset I see 5.1 as 5.1 but the underlying stored value contains the unwanted decimals.
    This is important because when I export the data to my final destination format (SAS V9) the underlying result (5.0999999) is the one that I see in my SAS file.

    Any ideas how I can stop this from happening?

    I already tried importing the results as a string but when I destring them I get the same result! I've also tried using a do file to import both a csv and an excel file and I have tried using STAT Transfer with no luck.

  • #2
    You cannot solve this problem because it is not possible to store 5.1 exactly in a binary number system, any more than 5 1/3 can be stored exactly in a decimal number system.

    See help precision for more information. You will have to use, I think, the SAS FORMAT command to associate a display format with the variable in SAS.

    Here is an example that shows that 5.1 is represented in hexadecimal as a repeating fraction ending with an infinite string of 6's, much like 5 1/3 is represented in decimal by a repeating fraction ending with an infinite string of 3's. Ending with fewer than an infinite number of digits results in a value that is slightly smaller than 5.1, and with enough precision in your display format, this becomes apparent.
    Code:
    . clear
    
    . set obs 1
    number of observations (_N) was 0, now 1
    
    . generate double x = 5.1
    
    . display x
    5.1
    
    . display %10.8f x
    5.10000000
    
    . display %20.18f x
    5.099999999999999645
    
    . display %21x x
    +1.4666666666666X+002
    Last edited by William Lisowski; 30 Mar 2017, 19:14.

    Comment


    • #3
      I have other mixed decimal numeric variables which display correctly though so I'm still hoping there is some way I can solve this...

      I have attached a screenshot from my final SAS dataset showing my 'low and 'high' variables formatting correctly but not the result_n variable.

      Any further thoughts appreciated...

      Click image for larger version

Name:	Screen Shot 2017-03-31 at 12.17.36 pm.png
Views:	1
Size:	50.8 KB
ID:	1381218

      Comment


      • #4
        I think I have just figured this out. Storing result_n as a double seems to do the trick. I was float previously...

        Comment


        • #5
          Note that in your example in post #3, the values 3.5 and 8.5 both have exact hexadecimal representations, as the fractional portions are (negative) powers of two.

          Sending a double precision value to SAS ensures that whatever display format SAS is using will round to the format you would like to see, as would assigning a format. Understand that the difference in using double precision is just that it reduces the inaccuracy caused by the truncation of the infinite sequence of digits enough that SAS will paper over the remaining difference when it rounds the number using the display format it is applying.

          Comment


          • #6
            Thanks for the feedback, William. I'm going to trust SAS to paper over those differences

            Comment

            Working...
            X