  • assert command

    Dear All, Suppose that I have the following data
    Code:
    clear
    input float(id ratiopas) double ratiopa
     1 .023817766 .02381777
     2 .017478175 .01747818
     3  .10745806 .10745806
     4  .23994523 .23994512
     5 .018241132 .01824113
     6    .044537   .044537
     7  .04022434 .04022434
     8  .02846349 .02846349
     9   .0898099 .08980991
    10 .013123708 .01312371
    end
    How can I change the data format so that I can use assert to check whether ratiopas and ratiopa are identical? Thanks.
    Ho-Chuan (River) Huang
    Stata 15.1, MP(4)

  • #2
    Format is an overloaded word in computing. The display format of either variable is irrelevant to comparing their values, as is the data layout. So, I am not clear what you want beyond a guess that this may help:


    Code:
    . clear
    
    . set obs 10
    number of observations (_N) was 0, now 10
    
    . gen double foo1 = 1/_n
    
    . gen foo2 = 1/_n
    
    . list
    
         +----------------------+
         |      foo1       foo2 |
         |----------------------|
      1. |         1          1 |
      2. |        .5         .5 |
      3. | .33333333   .3333333 |
      4. |       .25        .25 |
      5. |        .2         .2 |
         |----------------------|
      6. | .16666667   .1666667 |
      7. | .14285714   .1428571 |
      8. |      .125       .125 |
      9. | .11111111   .1111111 |
     10. |        .1         .1 |
         +----------------------+
    
    . assert foo1 == foo2
    6 contradictions in 10 observations
    assertion is false
    r(9);
    
    . assert float(foo1) == foo2
    With assert no news is good news.
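
    As a hedged illustration, assuming the default storage type is float (that is, set type has not been changed), the six contradictions can be inspected directly. The %21x display format shows the exact hexadecimal representation of each stored value, which makes the difference between the double and the float visible.

    ```stata
    * Sketch: rebuild the example and look at what is actually stored.
    clear
    set obs 10
    gen double foo1 = 1/_n
    * Assumption: no storage type given, so foo2 is a float by default
    gen foo2 = 1/_n

    * Show the exact binary (hexadecimal) representation of each value
    format foo1 foo2 %21x
    list foo1 foo2

    * List only the observations where the stored values differ
    list if foo1 != foo2

    * Rounding the double to float precision first makes the check pass
    assert float(foo1) == foo2
    ```

    The mismatches should be the reciprocals with no exact binary representation (1/3, 1/5, 1/6, 1/7, 1/9, 1/10); 1, 1/2, 1/4 and 1/8 are powers of two and are held exactly in both storage types.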



    • #3
      Dear Nick, Many thanks for the reply.
      1. In fact, I am trying (the question is from a friend) to generate a dummy, say `d', that equals 1 if ratiopas == ratiopa and 0 otherwise.
      2. Even when I do the following
        Code:
        clear
        	input float(id ratiopas) double ratiopa
        	 1 .023817766 .02381777
        	 2 .017478175 .01747818
        	 3  .10745806 .10745806
        	 4  .23994523 .23994512
        	 5 .018241132 .01824113
        	 6    .044537   .044537
        	 7  .04022434 .04022434
        	 8  .02846349 .02846349
        	 9   .0898099 .08980991
        	10 .013123708 .01312371
        	end
        	
        	format ratiopas ratiopa %10.8f
        	gen d = (ratiopas == float(ratiopa))
        	format *
        I find that, for example, ratiopas and ratiopa are flagged as unequal in the first observation (they are supposed to be equal).
        Code:
        . list
        	
        	     +----------------------------------+
        	     | id     ratiopas      ratiopa   d |
        	     |----------------------------------|
        	  1. |  1   0.02381777   0.02381777   0 |
        	  2. |  2   0.01747818   0.01747818   0 |
        	  3. |  3   0.10745806   0.10745806   1 |
        	  4. |  4   0.23994523   0.23994512   0 |
        	  5. |  5   0.01824113   0.01824113   0 |
        	     |----------------------------------|
        	  6. |  6   0.04453700   0.04453700   1 |
        	  7. |  7   0.04022434   0.04022434   1 |
        	  8. |  8   0.02846349   0.02846349   1 |
        	  9. |  9   0.08980990   0.08980991   0 |
        	 10. | 10   0.01312371   0.01312371   0 |
        	     +----------------------------------+
      3. I conjecture that the problem is due to the different storage types and formats of the variables. Any suggestions would be highly appreciated.
      Ho-Chuan (River) Huang
      Stata 15.1, MP(4)



      • #4
        So ratiopas has float storage type, and ratiopa has double storage type. The expression ratiopas == float(ratiopa) tests whether ratiopa and ratiopas agree when ratiopa is rounded to single precision. Single precision (float) corresponds approximately to 7 significant figures. You can see in your -dataex- in #3, which shows the numbers to 8 significant figures, that in several observations the numbers are clearly not equal, even though, when rounded to only 7 significant figures in the -list- output, they appear to be equal. These are very small differences, probably too small to be of any practical importance in most contexts. But they are just at the threshold of what can be expressed in float precision.

        If ratiopas and ratiopa were separately calculated earlier in the code, or obtained from different data sources, it is to be expected that there will be differences of this magnitude, even if they come from formulas that algebraically should be equal.
        I think the only workable solution to this situation is to choose some maximum allowable difference between the two variables (say, 10^-8) and then define this variable d in terms of that difference:
        Code:
        gen d = abs(ratiopas - ratiopa) < 1e-8
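
        The same tolerance idea can be carried over to -assert-, which was the original question. What follows is only a sketch, assuming the example data from #1 are in memory; the local macro tol and the variable name d_tol are hypothetical names chosen for illustration, and 1e-8 is just the value suggested above, not a universal choice.

        ```stata
        * Hypothetical tolerance; adjust it to the scale of your data
        local tol 1e-8

        * Flag pairs that agree to within the tolerance
        * (a missing value in either variable yields 0, not 1)
        gen byte d_tol = abs(ratiopas - ratiopa) < `tol'

        * Inspect the pairs that fail before asserting
        list id ratiopas ratiopa if !d_tol

        * Stops with r(9) if any pair differs by the tolerance or more
        assert abs(ratiopas - ratiopa) < `tol'
        ```

        Listing the failures first is a design choice: -assert- reports only how many contradictions there are, not which observations they are in.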



        • #5
          Dear Clyde, Thank you so much.
          Ho-Chuan (River) Huang
          Stata 15.1, MP(4)
