Does cascade replace round off large values?

Lucy James

Join Date: Jan 2023

Posts: 2
#1

29 Jan 2023, 17:00

Hello! Thank you for your help in advance.

I am using cascade replace to replace missing values with a function of the lagged value & another variable.

I have run into issues that only seem to come up when I am replacing a "large" value. Specifically (see reproducible example below), the cascade replace generates the correct answer when my initial value is <=10 million, but the wrong value when my initial value is >=20 million.

Code:

* When initial value set to 20 million, doesn't work
* Missing values get replaced with running_count[_n-1], whether or not third_obs=0
clear
set obs 10
gen third_obs = (_n==3)
gen running_count = 20000000 if _n==1
replace running_count = running_count[_n-1] + third_obs[_n] if missing(running_count)
tab running_count /*shows only 1 unique value*/

* When initial value set to 10 million, does work
clear
set obs 10
gen third_obs = (_n==3)
gen running_count = 10000000 if _n==1
replace running_count = running_count[_n-1] + third_obs[_n] if missing(running_count)
tab running_count /*shows 2 unique values*/

I don't know what to think, especially since 10 million and 20 million have the same number of digits. Any help is appreciated.
Tags: cascade, replace
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

29 Jan 2023, 17:30

The number of decimal digits is less important than whether there are enough bits in your chosen variable type to hold values with full accuracy. You need to use a double not the default float.
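
As a minimal sketch of that fix, the example from #1 can be rerun with the variable created as a double. Only the storage type in the gen line changes; everything else is taken from the original post.

```stata
* Sketch: same example as #1, but running_count is stored as a double
clear
set obs 10
gen third_obs = (_n==3)
gen double running_count = 20000000 if _n==1
* replace works through the observations in order, so each row sees the updated lag
replace running_count = running_count[_n-1] + third_obs if missing(running_count)
tab running_count
```

A double can hold 20,000,001 exactly, so the tabulation should now show two distinct values: 20,000,000 in the first two observations and 20,000,001 from the third onward.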

William Lisowski

Join Date: Dec 2014
Posts: 10150
#3

29 Jan 2023, 18:12

Here are the limits on storage of decimal integers with full accuracy in the various numeric storage types. The fixed-point variables lose the 27 largest positive values to missing value codes; the similar loss for floating point variables occurs only for the largest exponent, so it doesn't affect the much smaller integer values.

You will see that 10,000,000 worked for you, because a float will hold precisely a decimal integer of up to 16,777,216 but 20,000,000 exceeds that and cannot be stored with full precision.

Storage type	Minimum	Maximum
byte - 7 bits	-127	100
int - 15 bits	-32,767	32,740
long - 31 bits	-2,147,483,647	2,147,483,620
float - 24 bits	-16,777,216	16,777,216
double - 53 bits	-9,007,199,254,740,992	9,007,199,254,740,992
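
The float limit can be checked directly with Stata's float() function, which rounds its argument to float precision. The lines below are a small illustrative sketch, not part of the original posts.

```stata
* 2^24 = 16,777,216 is held exactly by a float
display float(16777216) == 16777216
* 16,777,217 is not: above 2^24 a float can only hold even integers
display float(16777217) == 16777217
* the value needed in #1: 20,000,001 is rounded when stored as a float
display %12.0f float(20000001)
```

The first comparison should display 1 and the second 0. The last line should display 20000000, which is why adding 1 to 20,000,000 in a float variable appears to have no effect.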


Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

30 Jan 2023, 02:26

It seems that a long would serve also for #1.
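
A sketch of that alternative, assuming the counts stay whole numbers within the range of a long listed above:

```stata
* Sketch: same example as #1, but running_count is stored as a long
clear
set obs 10
gen third_obs = (_n==3)
gen long running_count = 20000000 if _n==1
replace running_count = running_count[_n-1] + third_obs if missing(running_count)
tab running_count
```

A long takes 4 bytes per observation against 8 for a double, but it holds integers only, so a double remains the safer choice if fractional values could ever be added.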
Lucy James

Join Date: Jan 2023

Posts: 2
#5

30 Jan 2023, 15:03

Thanks so much to both of you!

For any future readers with similar problems, I also found this Stata Blog post helpful for understanding what I was doing wrong: https://blog.stata.com/2012/04/02/th...-to-precision/