  • Rowmin and rounding issue

    Hi all! I am sorry for asking this basic question, but I am using egen rowmin to select the minimum value between two variables (v1 and v2). The results from rowmin are not EXACTLY the same as the variable with the minimum value for ALL the observations, with decimal points being the main problem. For example, v1 is 3.916571 while the min value is 3.9165709. I tried the following syntax, but it does not work. I would appreciate your help with this. Thank you!!

    * To simplify, I am rounding v1 and v2 before creating my min variable.
    replace v1=round(v1, 0.000001)
    replace v2=round(v2, 0.000001)
    egen min=rowmin(v1 v2)
    * Because both v1 and v2 are double with %10.0g format, I am trying to recast and reformat the min variable.
    recast double min
    format %10.0g min
    replace min=round(min, 0.000001)

  • #2
    This is a precision issue on several levels. You need to start out generating doubles to preserve detail. A recast double will never restore detail lost in earlier calculations; that is like trying to get spilled milk back into the bottle or toothpaste back into the tube (indeed harder).

    Rounding to fractions that are powers of 10 is ultimately either doomed or irrelevant, as almost none of them can be held exactly in binary. Here is a closer look at what 1 in a million looks like inside Stata (bearing in mind that this is itself a decimal display of a number held in binary as an approximation):

    . di %27.26f 1e-6
    0.00000099999999999999995475


    It's not exactly 1 in a million and can't be.

    Backing up: if v1 and v2 were float, then the minimum could be just

    Code:
    gen min = min(v1, v2)
    but as both are double you need

    Code:
    gen double min  = min(v1, v2)
    and any other approach is wrong or round-about.
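
    To illustrate the difference, here is a minimal sketch on made-up data. The value 3.9165709 is taken from the question; v2 = 4.5 and the variable names min_f and min_d are invented for the example. It assumes the default storage type is float, that is, that set type double has not been issued.

```stata
clear
set obs 1

* Hypothetical data: v1 uses the value quoted in the question
gen double v1 = 3.9165709
gen double v2 = 4.5

* No type given, so the result is stored as float (the assumed default)
gen min_f = min(v1, v2)

* Explicit double keeps every bit of the smaller value
gen double min_d = min(v1, v2)

* The double copy matches v1 exactly; the float copy does not
assert min_d == v1
assert min_f != v1

* The float copy is just v1 rounded to float precision
assert min_f == float(v1)

* Show both copies with the same display format
format %10.0g v1 min_f min_d
list v1 min_f min_d
```

    If the asserts pass silently, the float copy has drifted from v1 while the double copy agrees with it exactly.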

    egen should do something similar, so long as you insist on double too, but is just a long-winded alternative in this case. Looking inside with


    Code:
    viewsource egen.ado 
    viewsource _growmin.ado
    will show that you are invoking a lot of code to do something that generate will do in one line.

    And naturally comparison should be based on assigning the same display format.


    • #3
      Once you have created a value as float, using recast to convert it to double cannot restore the precision that was lost when the result of Stata's calculations (which are always done in double) was shortened to fit in a float. Instead, don't do any rounding, and replace all the code above with
      Code:
      egen double min=rowmin(v1 v2) 
      format %10.0g min
      See the output of help precision for more about these issues.
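
      As a rough check of the point about recast, here is a small sketch on invented data. The value 3.9165709 comes from the original question; v2 = 4.5 and the names min_f and min_d are made up for the example, and it assumes egen creates a float when no storage type is given.

```stata
clear
set obs 1

* Hypothetical data: v1 uses the value quoted in the question
gen double v1 = 3.9165709
gen double v2 = 4.5

* egen without a storage type: result is assumed to be stored as float
egen min_f = rowmin(v1 v2)

* Widening the storage type afterwards does not bring back the lost digits
recast double min_f
assert min_f != v1

* Asking for double at creation time preserves the value exactly
egen double min_d = rowmin(v1 v2)
assert min_d == v1
```

      Both asserts passing would indicate that the storage type has to be chosen when the variable is created, not afterwards.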


      • #4
        Perfect! Thank you both so much for your response and help!


        • #5
          The new command round_exact (available on SSC) will solve this issue. Try this:

          clear
          input double price
          0.3
          0.105
          2.675
          1.005
          0.005
          end

          round_exact1 price, d(2) generate(price_r)
          assert price_r[1] == 0.30
          assert price_r[2] == 0.11
          assert price_r[3] == 2.68
          assert price_r[4] == 1.01
          assert price_r[5] == 0.01
