  • Stata mistakes doing simple arithmetic operations

    I'm trying to create a new ID for a dataset by generating a linear combination of the original ID, but Stata doesn't compute the results correctly.

    The command was:
    gen new_id=Original_id*13 +3

    Code:
    Original_id      id_13      new_id        _x13          _3
       4041460    52538980    52538984    52538980    52538983
       4041351    52537564    52537568    52537563    52537566
    1002405720 13031274496 13031274496 13031274360 13031274363
    1110596439 14437753856 14437753856 14437753707 14437753710
    1110596439 14437753856 14437753856 14437753707 14437753710
       3736456    48573928    48573932    48573928    48573931
    end

    Above you can see the original ID, then the result of multiplying it by 13, and then the result of adding 3. The last two columns are the same operations done in MS Excel. Sometimes Stata adds 4 instead of 3 in "new_id", or the multiplication by 13 is wrong; I've verified this with the display command.

    Thanks to anyone who can help me with this.
    Last edited by Daniel Gamboa-Rinckoar; 08 Jul 2021, 12:45.

  • #2
    This is an appearance-of-inaccuracy issue. The calculations themselves are not inaccurate; the results look wrong because they are stored as float, Stata's default storage type, which cannot hold integers this large exactly. Try:

    Code:
    gen double new_id = Original_id*13+3
    Also see -help precision-, which states:
    Code:
    Justifications for all statements made appear in the sections below. In summary,
    
    1. It sometimes appears that Stata is inaccurate. That is not true and, in fact, the appearance of inaccuracy happens in part because
    Stata is so accurate.
    
    2. You can cover up this appearance of inaccuracy by storing all your data in double precision. This will double (or more) the size of
    your dataset, and so we do not recommend the double-precision solution unless your dataset is small relative to the amount of memory
    on your computer. In that case, there is nothing wrong with storing all your data in double precision.
    
    The easiest way to implement the double-precision solution is by typing set type double. After that, Stata will default to creating
    all new variables as doubles, at least for the remainder of the session. If all your datasets are small relative to the amount of
    memory on your computer, you can set type double, permanently; see [D] generate.
    
    3. The double-precision solution is needlessly wasteful of memory. It is difficult to imagine data that are accurate to more than float
    precision. Regardless of how your data are stored, Stata does all calculations in double precision, and sometimes in quad precision.
    
    The issue of 1.1 not being equal to 1.1 arises only with "nice" decimal numbers. You just have to remember to use Stata's float()
    function when dealing with such numbers.
    The help file also provides more useful information on the issue.
    Last edited by Ali Atia; 08 Jul 2021, 12:49.
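The "1.1 not being equal to 1.1" remark in the quoted help text can be illustrated outside Stata. What follows is a minimal Python sketch, not Stata code. `to_float` is a hypothetical helper, assumed here to mimic what storing a value in a float variable does: it rounds to IEEE 754 single precision, the same kind of rounding Stata's float() function is described as applying.

```python
import struct

def to_float(x):
    """Round x to IEEE 754 single precision (hypothetical stand-in for float storage)."""
    return struct.unpack("f", struct.pack("f", x))[0]

stored = to_float(1.1)   # what a float variable would hold
literal = 1.1            # the double-precision value of the literal 1.1

# The two roundings of 1.1 differ, so a direct comparison fails.
print(stored == literal)            # False
# Rounding the literal the same way makes the comparison succeed.
print(stored == to_float(literal))  # True
```

The second comparison is the analogue of writing `float(1.1)` on the right-hand side of a comparison against a float variable.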



    • #3
      Stata is as accurate as you tell it to be. Because you told Stata
      Code:
      gen new_id=Original_id*13 +3
      instead of one of the following
      Code:
      gen long new_id=Original_id*13 +3
      gen double new_id=Original_id*13 +3
      Stata used the default storage type of float, which is, as discussed in post #2, accurate enough for most tasks, especially since calculations are done using double precision for the intermediate values.
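The effect of storing these results in float can be reproduced outside Stata. The sketch below is Python, not Stata, and rests on one assumption: that a float variable holds an IEEE 754 single-precision value, which the hypothetical helper `to_float` emulates. The IDs are the distinct values from post #1.

```python
import struct

def to_float(x):
    """Round x to IEEE 754 single precision (hypothetical stand-in for float storage)."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Distinct Original_id values from the listing in post #1.
original_ids = [4041460, 4041351, 1002405720, 1110596439, 3736456]

stored = {}
for oid in original_ids:
    exact = oid * 13 + 3                      # exact integer arithmetic
    stored[oid] = int(to_float(float(exact))) # value after rounding to single precision
    print(oid, exact, stored[oid], stored[oid] - exact)
```

Under that assumption the third printed column agrees with the new_id column in post #1: for example, 4041460 gives the exact value 52538983 but is stored as 52538984, and 1002405720 gives 13031274363 but is stored as 13031274496.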

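      You had the particular problem of needing to store an integer value precisely, so you need to select your storage type carefully. Note that some of your results (for example 13,031,274,363) are larger than the long maximum listed below, so for your data double is the type to use.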

      Here are the limits on storage of decimal integers with full accuracy in the various numeric storage types. The fixed-point variables lose the 27 largest positive values to missing value codes; the similar loss for floating point variables occurs only for the largest exponent, so it doesn't affect the much smaller integer values.
      Storage type   Bits   Smallest integer             Largest integer
      byte              7   -127                         100
      int              15   -32,767                      32,740
      long             31   -2,147,483,647               2,147,483,620
      float            24   -16,777,216                  16,777,216
      double           53   -9,007,199,254,740,992       9,007,199,254,740,992
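The float and double limits above are 2^24 and 2^53. As a rough check, the Python sketch below (not Stata code) tests whether integers at and just past those limits survive a round trip. `to_float` is a hypothetical helper that assumes float storage behaves like IEEE 754 single precision; Python's own `float` is double precision. The long maximum is copied from the list above.

```python
import struct

def to_float(x):
    """Round x to IEEE 754 single precision (hypothetical stand-in for float storage)."""
    return struct.unpack("f", struct.pack("f", x))[0]

FLOAT_LIMIT = 2 ** 24        # 16,777,216
DOUBLE_LIMIT = 2 ** 53       # 9,007,199,254,740,992
LONG_MAX = 2_147_483_620     # largest long value, from the list above

# At the limit the integer is stored exactly; one past it, it is not.
float_ok_at_limit = int(to_float(float(FLOAT_LIMIT))) == FLOAT_LIMIT
float_ok_past_limit = int(to_float(float(FLOAT_LIMIT + 1))) == FLOAT_LIMIT + 1
double_ok_at_limit = int(float(DOUBLE_LIMIT)) == DOUBLE_LIMIT
double_ok_past_limit = int(float(DOUBLE_LIMIT + 1)) == DOUBLE_LIMIT + 1
print(float_ok_at_limit, float_ok_past_limit)    # True False
print(double_ok_at_limit, double_ok_past_limit)  # True False

# Largest exact result in the thread: 1110596439*13 + 3.
largest = 1110596439 * 13 + 3
print(largest > FLOAT_LIMIT, largest > LONG_MAX, largest <= DOUBLE_LIMIT)  # True True True
```

The last line shows why, of the types listed, only double can hold every new_id in this dataset exactly.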



      • #4
        thanks!
