assert command

River Huang

Join Date: Mar 2016

Posts: 1908
#1

12 Aug 2018, 05:36

Dear All, Suppose that I have the following data

Code:

clear
input float(id ratiopas) double ratiopa
 1 .023817766 .02381777
 2 .017478175 .01747818
 3  .10745806 .10745806
 4  .23994523 .23994512
 5 .018241132 .01824113
 6    .044537   .044537
 7  .04022434 .04022434
 8  .02846349 .02846349
 9   .0898099 .08980991
10 .013123708 .01312371
end

How can I change the data format so that I can use assert to check whether ratiopas and ratiopa are identical? Thanks.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)

Nick Cox

Join Date: Mar 2014
Posts: 35711

12 Aug 2018, 05:51

Format is an overloaded word in computing. The display format of either variable is irrelevant to comparing their values, as is the data layout. So, I am not clear what you want beyond a guess that this may help:

Code:

. clear

. set obs 10
number of observations (_N) was 0, now 10

. gen double foo1 = 1/_n

. gen foo2 = 1/_n

. list

     +----------------------+
     |      foo1       foo2 |
     |----------------------|
  1. |         1          1 |
  2. |        .5         .5 |
  3. | .33333333   .3333333 |
  4. |       .25        .25 |
  5. |        .2         .2 |
     |----------------------|
  6. | .16666667   .1666667 |
  7. | .14285714   .1428571 |
  8. |      .125       .125 |
  9. | .11111111   .1111111 |
 10. |        .1         .1 |
     +----------------------+

. assert foo1 == foo2
6 contradictions in 10 observations
assertion is false
r(9);

. assert float(foo1) == foo2

With assert no news is good news.
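
To see why the first assert fails and the second does not, it may help to look at more digits. The following is only a sketch: it reuses foo1 and foo2 from the log above, assumes the default storage type is float (that is, set type has not been changed), and introduces a hypothetical helper variable called mismatch. Stata evaluates expressions in double precision, so foo2 is promoted to double before the comparison, and only values with an exact binary representation (1, .5, .25, .125) survive the round trip unchanged.

```stata
clear
set obs 10
gen double foo1 = 1/_n
* without an explicit type, gen stores a float (assuming the default "set type float")
gen foo2 = 1/_n

* mismatch is a hypothetical helper flagging rows where the two values differ
gen byte mismatch = foo1 != foo2

* show more digits than the default display format reveals
format foo1 foo2 %20.18f
list foo1 foo2 mismatch

* number of rows that assert foo1 == foo2 would report as contradictions
count if mismatch
```

The count should match the 6 contradictions that assert reported above.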


River Huang

Join Date: Mar 2016
Posts: 1908

12 Aug 2018, 17:59

Dear Nick, Many thanks for the reply.

In fact, I am trying (the question comes from a friend) to generate a dummy, say `d', which equals 1 if ratiopas == ratiopa and 0 otherwise.

Even when I do the following

Code:

clear
	input float(id ratiopas) double ratiopa
	 1 .023817766 .02381777
	 2 .017478175 .01747818
	 3  .10745806 .10745806
	 4  .23994523 .23994512
	 5 .018241132 .01824113
	 6    .044537   .044537
	 7  .04022434 .04022434
	 8  .02846349 .02846349
	 9   .0898099 .08980991
	10 .013123708 .01312371
	end
	
	format ratiopas ratiopa %10.8f
	gen d = (ratiopas == float(ratiopa))
	format *

I find that, for example, ratiopas and ratiopa are flagged as unequal in the first observation, although they are supposed to be equal.

Code:

. list
	
	     +----------------------------------+
	     | id     ratiopas      ratiopa   d |
	     |----------------------------------|
	  1. |  1   0.02381777   0.02381777   0 |
	  2. |  2   0.01747818   0.01747818   0 |
	  3. |  3   0.10745806   0.10745806   1 |
	  4. |  4   0.23994523   0.23994512   0 |
	  5. |  5   0.01824113   0.01824113   0 |
	     |----------------------------------|
	  6. |  6   0.04453700   0.04453700   1 |
	  7. |  7   0.04022434   0.04022434   1 |
	  8. |  8   0.02846349   0.02846349   1 |
	  9. |  9   0.08980990   0.08980991   0 |
	 10. | 10   0.01312371   0.01312371   0 |
	     +----------------------------------+

I conjecture that the problem is due to the different storage types and formats of the variables. Any suggestions would be highly appreciated.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)


Clyde Schechter

Join Date: Apr 2014

Posts: 30115
#4

12 Aug 2018, 18:31

So ratiopas has float storage type, and ratiopa has double storage type. The expression ratiopas == float(ratiopa) corresponds to whether or not ratiopa and ratiopas agree when ratiopa is rounded to single precision. Single precision (float) corresponds approximately to 7 significant figures. You can see in your -dataex- in #3, which shows the numbers to 8 significant figures, that in several observations the numbers are clearly not equal, even though, when rounded to only 7 significant figures in the -list- output, they appear to be equal. These are very small differences, probably too small to be of any practical importance in most contexts. But they are just at the threshold of what can be expressed in float precision.
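
As a rough illustration of that threshold, the two literals from the first observation can be inspected directly. This is a sketch only: the %20.18f display format is an arbitrary choice made to expose the digits beyond float precision, and the literals are copied from the -dataex- in #3.

```stata
* literals are evaluated in double precision
display %20.18f .023817766
display %20.18f .02381777

* the same literals after rounding to float precision
display %20.18f float(.023817766)
display %20.18f float(.02381777)

* 1 if both literals round to the same float, 0 otherwise
display float(.023817766) == float(.02381777)
```

Since ratiopas was read in as a float, it holds float(.023817766), while float(ratiopa) is float(.02381777). The last line makes the same comparison as the one behind d in the first observation, where the listing in #3 shows d = 0.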

If ratiopas and ratiopa were separately calculated earlier in the code, or obtained from different data sources, it is to be expected that there will be differences of this magnitude, even if they come from formulas that algebraically should be equal.
I think the only workable solution to this situation is to choose some maximum allowable difference between the two variables (say, 10^-8) and then define this variable d in terms of that difference:

Code:

gen d = abs(ratiopas - ratiopa) < 1e-8
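
The same tolerance idea can also be combined with assert, or expressed as a relative rather than absolute difference. What follows is a minimal sketch under some assumptions: it uses three observations copied from #3, the scalar tol and the variables d_abs and d_rel are hypothetical names, and 1e-8 is simply the threshold suggested above, not a recommendation for every data set. reldif(x,y) is Stata's built-in relative difference, |x - y| / (|y| + 1).

```stata
clear
input float(id ratiopas) double ratiopa
3 .10745806 .10745806
4 .23994523 .23994512
6 .044537 .044537
end

* hypothetical tolerance; choose one suited to the scale of the data
scalar tol = 1e-8

* absolute and relative versions of the indicator
gen byte d_abs = abs(ratiopas - ratiopa) < tol
gen byte d_rel = reldif(ratiopas, ratiopa) < tol

* show the observations that disagree by more than the tolerance
list id ratiopas ratiopa if !d_abs

* capture lets a do-file continue; _rc is 9 when the assertion is false
capture assert abs(ratiopas - ratiopa) < tol
display _rc
```

Here the observation with id 4 differs by roughly 1e-7, so it is flagged by both indicators and the assertion fails. A relative tolerance may be the safer choice when the variables span several orders of magnitude.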
River Huang

Join Date: Mar 2016

Posts: 1908
#5

12 Aug 2018, 18:42

Dear Clyde, Thank you so much.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)