  • Stored Values in a New Variable

    I have a numeric variable in a dataset (x1) containing integers and numbers with one decimal place.

    I'm creating a new variable (x2) that assumes the same value as x1 by running:

    gen x2 = x1

    When I run this command, some of the stored values in x2 are not exactly the same as the corresponding values of x1. For example, a value that was originally "7.6" in the original variable x1 is now "7.5999999" in the new variable x2. Another value that was originally "0.6" in x1 is now "0.60000002" in x2.

    Why is this discrepancy happening? Is there something I can do to ensure the values in the two variables are exactly the same?

  • #2
    try

    clonevar x2 = x1

    • #3
      That works great. Thank you!

      • #4
        The difference you noticed almost certainly had nothing to do with the values Stata was holding, because Stata holds numbers in binary. It was almost certainly because the new variable had a different display format. clonevar copies the display format as well as the values.
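
        The display-format point can be illustrated outside Stata as well. Here is a small sketch in Python (not Stata), using 7.5999999046325684, which is the value the thread's "7.6" example actually holds in single precision: the same stored number looks like 7.6 or not depending only on how many decimals you ask for.

        ```python
        # The stored binary value never changes; only the display format does.
        v = 7.5999999046325684  # the closest single-precision float to 7.6

        print(f"{v:.1f}")   # one decimal shown: prints 7.6
        print(f"{v:.16f}")  # sixteen decimals shown: prints 7.5999999046325684
        ```

        This mirrors what format %2.1f versus format %18.16f does in Stata.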

        • #5
          This could also be a precision issue. See the example below:

          Code:
          clear
          input double x1 // note the storage type is double
          1.5
          2.1
          7.6
          0.6
          end
          
          gen x2 = x1 // note that by default Stata generates numeric variables of storage type float
          Now you can see
          Code:
          . assert x2 == x1
          3 contradictions in 4 observations
          assertion is false
          r(9);
          
          
          . format %18.16f x1 x2
          
          . list
          
               +-----------------------------------------+
               |                 x1                   x2 |
               |-----------------------------------------|
            1. | 1.5000000000000000   1.5000000000000000 |
            2. | 2.1000000000000001   2.0999999046325684 |
            3. | 7.5999999999999996   7.5999999046325684 |
            4. | 0.6000000000000000   0.6000000238418579 |
               +-----------------------------------------+
          You can "fix" the problem by doing:
          Code:
          . assert x2 == float(x1) // passes
          
          . gen double x3 = x1 // match the storage type and thus precision of the original variable
          
          . format %18.16f x3
          
          . list
          
               +--------------------------------------------------------------+
               |                 x1                   x2                   x3 |
               |--------------------------------------------------------------|
            1. | 1.5000000000000000   1.5000000000000000   1.5000000000000000 |
            2. | 2.1000000000000001   2.0999999046325684   2.1000000000000001 |
            3. | 7.5999999999999996   7.5999999046325684   7.5999999999999996 |
            4. | 0.6000000000000000   0.6000000238418579   0.6000000000000000 |
               +--------------------------------------------------------------+
          
          
          . assert x3 == x1 // passes
          You may want to look through
          Code:
          help precision
          To Nick's point above, notice also that if we set the display format to a small number of decimals, all three sets of numbers look identical:

          Code:
          . format %2.1f x1 x2 x3
          
          . list
          
               +-----------------+
               |  x1    x2    x3 |
               |-----------------|
            1. | 1.5   1.5   1.5 |
            2. | 2.1   2.1   2.1 |
            3. | 7.6   7.6   7.6 |
            4. | 0.6   0.6   0.6 |
               +-----------------+
          Last edited by Hemanshu Kumar; 17 Jul 2023, 17:11.
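
          The float-versus-double effect above can also be reproduced outside Stata. As a sketch in Python (not Stata), the standard struct module can round-trip each value through IEEE single precision, which is what Stata's default float storage type uses; the results match the x2 column in the listing above.

          ```python
          import struct

          def to_float32(x: float) -> float:
              """Round a double-precision Python float to the nearest
              single-precision value, as storing it in a float does."""
              return struct.unpack('<f', struct.pack('<f', x))[0]

          for x in (1.5, 2.1, 7.6, 0.6):
              x32 = to_float32(x)
              print(f"{x:>4} as float32: {x32:.16f}  (exact? {x32 == x})")
          ```

          Only 1.5 survives exactly, because it is representable in binary; 2.1, 7.6, and 0.6 all pick up the same rounding error seen in x2.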
