very large integers - precision beyond double storage type

tim bradshaw

Join Date: Feb 2016

Posts: 192
#1


06 Jan 2023, 13:47

Hi there
I understand that there are limits on the precision of storage types, with 'double' offering the best precision for integers. But what's the best practice when dealing with numbers that go beyond the limits of 'double'?
For example, I wonder if anyone could suggest a way of obtaining greater precision for the following:
* Example generated by -dataex-. For more info, type help dataex

Code:

clear
input double number
4923329048
234980323
3249890234
2348902348
2349802
2349803234
end
gen double numbersq = number^2
format number %25.0g
format numbersq %25.0g

With thanks
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

06 Jan 2023, 14:35

There is no natural way of doing this in Stata. Perhaps if you could answer the following question, someone here might be able to suggest a potential workaround that will meet your specific needs.
Why do you need to know the precise result of squaring a 10-digit number?
Explain what you are trying to accomplish that involves carrying numbers accurate to 20 digits of precision.

Added in edit: This post on the Stata Blog has much to say about precision issues, including false precision.

https://blog.stata.com/2012/04/02/th...-to-precision/

That's why it's important to know what your objective is.

Last edited by William Lisowski; 06 Jan 2023, 14:39.
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#3

06 Jan 2023, 17:56

It is, of course, entirely possible to store an infinite amount of precision in finite memory. For any number that you want to calculate, simply invent a new symbol to represent that infinitely precise number. A single byte can represent 2^8 or 256 symbols, so if your symbol for (e.g.) pi is a byte long and happens to be 01011011 then that is just as legitimate and precise a representation of pi as the word "pi" itself (or for that matter, the Greek letter). You might think that I am joking, but you can do a great deal of math just using pi as a symbol. I understand this particular idea is an active area of research in computer science, and one of the foundational ideas in Wolfram Mathematica.

Of course, building such a system in Stata would be hugely complicated and impractical, probably even for the Stata developers, and anyway, you probably think of precision in this context as being able to represent a sufficiently large number of digits (whatever "sufficiently large" means). If you want to do calculations with very large numbers and you care about the digits that represent a number, but you don't necessarily care about perfect fidelity, then it can be very useful to use a specific, finite amount of memory to represent the digits you care about. You can then use another block of memory to represent an exponent. The exponent works like scientific notation: it tells the computer where the decimal point should go. This is, of course, a floating point number. If you don't have enough space in your finite amount of memory for the precision you need, easy, just double the amount of memory you use. This is, of course, a double. Some languages have even higher-precision data types, but it is rare that an application needs that much precision. If you do have a need for that much precision, Fortran might be the language for you.
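As a small illustration of where a double runs out of room for integers (a sketch added for clarity, not part of the original reply): a double has a 53-bit significand, so every integer up to 2^53 is stored exactly, but 2^53 + 1 is not. The display formats below are only a presentation choice.

```stata
* Every integer up to 2^53 = 9007199254740992 fits exactly in a double
display %21.0f (2^53)

* 2^53 + 1 has no exact double representation and is rounded back to 2^53
display %21.0f (2^53 + 1)

* The largest square in #1 is about 2.4e19, well beyond 2^53,
* so the value displayed here is a rounded one
display %25.0f (4923329048^2)
```

The first two lines should both print 9007199254740992, which is the loss of precision the original question runs into.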

But now let's say you care about all of the precision the machine can handle, and you need more precision than even a quadruple-precision real in Fortran. You could store each digit dynamically in an array (or list) of digits, changing the size of the array as needed. You might even be able to implement something like this yourself in Stata. Take whatever number you care about and represent it as a string of characters with the strL type. Now, unfortunately, it might be difficult to do math with a string of characters. However, it would be entirely possible to write a command that performs addition, and you could theoretically derive most of the other calculations you might need from this command. However, at the end of the day, you're still going to have to think very hard about how you are going to represent all of the digits for "1" divided by "3" in a finite amount of space.
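A minimal sketch of the digit-string idea, written as a Mata function rather than a full Stata command. The name bigadd() is made up for illustration; it assumes both arguments are non-negative integers written as plain strings of decimal digits and does no input validation.

```stata
mata:
// Hypothetical helper: add two non-negative integers held as digit strings,
// working right to left one digit at a time, as in schoolbook addition
string scalar bigadd(string scalar a, string scalar b)
{
    real scalar i, j, carry, s
    string scalar out

    i = strlen(a)        // position of the current digit in a
    j = strlen(b)        // position of the current digit in b
    carry = 0
    out = ""
    while (i > 0 | j > 0 | carry > 0) {
        s = carry
        if (i > 0) s = s + strtoreal(substr(a, i, 1))
        if (j > 0) s = s + strtoreal(substr(b, j, 1))
        out = strofreal(mod(s, 10)) + out    // prepend the new digit
        carry = floor(s / 10)
        i = i - 1
        j = j - 1
    }
    return(out)
}

// 2^53 + 1, which a double cannot hold exactly
bigadd("9007199254740992", "1")
end
```

The call at the end should print 9007199254740993. Each step only ever adds two single digits and a carry, so no intermediate value comes anywhere near the limits of a double; multiplication would have to be built on top of this, which is where the approach becomes laborious.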

This post is a little tongue in cheek, but I also don't want to obscure the fact that this is actually a great question with some interesting and innovative solutions - and in my mind, in all of those solutions the double stands out as the most useful way of dealing with issues related to precision.

Last edited by Daniel Schaefer; 06 Jan 2023, 17:59.
tim bradshaw

Join Date: Feb 2016

Posts: 192
#4

07 Jan 2023, 11:38

Many thanks for the helpful replies.
Calculating the square of those numbers (which are made up in the example provided) is actually an interim step for calculating confidence intervals "by hand".
So, if I undertook this step as part of a larger calculation that didn't require actually storing the large numbers in a variable, would Stata be able to carry out those calculations with perfect precision?
Thanks again.
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#5

07 Jan 2023, 18:08

So you are calculating a sample variance?

No, I'm afraid not. Stata carries out its calculations in double precision, so even if the intermediate values are never stored in a variable, they are still rounded exactly as stored doubles would be.

Here is a subtle question: do you actually need perfect precision here? It may be that you will draw the same (or very similar) conclusions, even with some amount of rounding error.
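To make that concrete, here is a hedged sketch that assumes the interval in question is an ordinary t-based 95% confidence interval for a mean, which the thread does not confirm. It reuses the made-up data from #1 and lets summarize supply the standard deviation, so the squares never have to be formed by hand; the scalar names are arbitrary.

```stata
clear
input double number
4923329048
234980323
3249890234
2348902348
2349802
2349803234
end

* Let summarize compute the mean and standard deviation
quietly summarize number
scalar xbar  = r(mean)
scalar se    = r(sd) / sqrt(r(N))
scalar tcrit = invttail(r(N) - 1, 0.025)

* 95% interval "by hand"
display %20.2f (xbar - tcrit*se)
display %20.2f (xbar + tcrit*se)

* Built-in equivalent for comparison (syntax of Stata 14.1 or later)
ci means number
```

Any rounding in the double arithmetic is on the order of one part in 10^16 of the quantities involved, which is far smaller than the width of an interval like this one.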
William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

07 Jan 2023, 20:23

In calculating confidence intervals, are all the other numbers in the formulas perfectly precise? Is the formula based on any distributional assumptions that you are assuming to be true?
