  • Stored Values in a New Variable

    I have a numeric variable in a dataset (x1) containing integers and numbers with one decimal place.

    I'm creating a new variable (x2) that assumes the same value as x1 by running:

    gen x2 = x1

    When I run this command, some of the stored values in x2 are not exactly the same as the corresponding values of x1. For example, a value that was originally "7.6" in the original variable x1 is now "7.5999999" in the new variable x2. Another value that was originally "0.6" in x1 is now "0.60000002" in x2.

    Why is this discrepancy happening? Is there something I can do to ensure the values in the two variables are exactly the same?

  • #2
    try

    clonevar x2 = x1

    • #3
      That works great. Thank you!

      • #4
        The difference you noticed almost certainly had nothing to do with the values Stata was holding, because Stata holds numbers in binary. It was almost certainly because the new variable had a different display format. clonevar copies the display format as well as the values.
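
        The display-format point can be illustrated outside Stata as well. Here is a small sketch in Python (not Stata), using 7.5999999046325684, which is the value the thread's "7.6" example actually holds in single precision: the same stored number looks like 7.6 or not depending only on how many decimals you ask for.

        ```python
        # The stored binary value never changes; only the display format does.
        v = 7.5999999046325684  # the closest single-precision float to 7.6

        print(f"{v:.1f}")   # one decimal shown: prints 7.6
        print(f"{v:.16f}")  # sixteen decimals shown: prints 7.5999999046325684
        ```

        This mirrors what format %2.1f versus format %18.16f does in Stata.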

        • #5
          This could also be a precision issue. See the example below:

          Code:
          clear
          input double x1 // note the storage type is double
          1.5
          2.1
          7.6
          0.6
          end
          
          gen x2 = x1 // note that by default Stata generates numeric variables of storage type float
          Now you can see
          Code:
          . assert x2 == x1
          3 contradictions in 4 observations
          assertion is false
          r(9);
          
          
          . format %18.16f x1 x2
          
          . list
          
               +-----------------------------------------+
               |                 x1                   x2 |
               |-----------------------------------------|
            1. | 1.5000000000000000   1.5000000000000000 |
            2. | 2.1000000000000001   2.0999999046325684 |
            3. | 7.5999999999999996   7.5999999046325684 |
            4. | 0.6000000000000000   0.6000000238418579 |
               +-----------------------------------------+
          You can "fix" the problem by doing:
          Code:
          . assert x2 == float(x1) // passes
          
          . gen double x3 = x1 // match the storage type and thus precision of the original variable
          
          . format %18.16f x3
          
          . list
          
               +--------------------------------------------------------------+
               |                 x1                   x2                   x3 |
               |--------------------------------------------------------------|
            1. | 1.5000000000000000   1.5000000000000000   1.5000000000000000 |
            2. | 2.1000000000000001   2.0999999046325684   2.1000000000000001 |
            3. | 7.5999999999999996   7.5999999046325684   7.5999999999999996 |
            4. | 0.6000000000000000   0.6000000238418579   0.6000000000000000 |
               +--------------------------------------------------------------+
          
          
          . assert x3 == x1 // passes
          You may want to look through
          Code:
          help precision
          To Nick's point above, notice also that if we set the display format to a small number of decimals, all three sets of numbers look identical:

          Code:
          . format %2.1f x1 x2 x3
          
          . list
          
               +-----------------+
               |  x1    x2    x3 |
               |-----------------|
            1. | 1.5   1.5   1.5 |
            2. | 2.1   2.1   2.1 |
            3. | 7.6   7.6   7.6 |
            4. | 0.6   0.6   0.6 |
               +-----------------+
          Last edited by Hemanshu Kumar; 17 Jul 2023, 17:11.
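
          The float-versus-double effect above can also be reproduced outside Stata. As a sketch in Python (not Stata), the standard struct module can round-trip each value through IEEE single precision, which is what Stata's default float storage type uses; the results match the x2 column in the listing above.

          ```python
          import struct

          def to_float32(x: float) -> float:
              """Round a double-precision Python float to the nearest
              single-precision value, as storing it in a float does."""
              return struct.unpack('<f', struct.pack('<f', x))[0]

          for x in (1.5, 2.1, 7.6, 0.6):
              x32 = to_float32(x)
              print(f"{x:>4} as float32: {x32:.16f}  (exact? {x32 == x})")
          ```

          Only 1.5 survives exactly, because it is representable in binary; 2.1, 7.6, and 0.6 all pick up the same rounding error seen in x2.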
