Select numbers with decimals

Lydia Palumbo

Join Date: Jun 2016
Posts: 81

01 Jan 2017, 21:42

Dear all,

I have a numeric variable that records durations in months. Some of its values are whole numbers and others have a fractional part, which is always 0.5, as shown in the example:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(stop1_panel stop2_panel)
  634 .
  634 .
  634 .
  634 .
  634 .
  634 .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
    . .
629.5 .
629.5 .
629.5 .
629.5 .
629.5 .
629.5 .
605.5 .
605.5 .
605.5 .
end

I need to select only those values that end in .5. I cannot do this with substr() because of the format. Could you please help me?
Thank you and best,
Lydia

Red Owl

Join Date: Nov 2016

Posts: 127
#2

01 Jan 2017, 22:23

Lydia,

Assuming you want to drop all integers and keep every value that has a fractional part (including those with 0.5 remainders), you could use Stata's mod() function.

Code:

clear
input double(stop1_panel stop2_panel)
  634 .
  634 .
  634 .
  634 .
  634 .
  634 .
629.5 .
629.5 .
629.5 .
629.5 .
629.5 .
629.5 .
605.5 .
605.5 .
605.5 .
end
keep if mod(stop1_panel,1)>0
list _all

If you want to keep only those that have exactly a 0.5 remainder, you could use

Code:

keep if mod(stop1_panel,1)==0.5
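
One caveat with the `>0` test: Stata treats a missing value as larger than any number, and mod() of a missing value is itself missing, so `mod(stop1_panel,1)>0` is also true for the many missing observations in the original data. A minimal sketch of a guarded version follows; the indicator name `half_month` is hypothetical and the four-row dataset is only a cut-down illustration.

```stata
* Sketch only: half_month is a hypothetical variable name
clear
input double stop1_panel
634
.
629.5
605.5
end

* Flag values whose fractional part is exactly 0.5.
* The if qualifier leaves the flag missing where stop1_panel is missing.
generate byte half_month = mod(stop1_panel, 1) == 0.5 if !missing(stop1_panel)
list

* Keep only the flagged observations; missing values are dropped too
keep if half_month == 1
```

The exact comparison with 0.5 is safe here because 0.5 has an exact binary representation.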

Red Owl
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

02 Jan 2017, 06:06

A more general, and more robust, approach to retaining only those observations that have no fractional part would be

Code:

keep if int(stop1_panel)==stop1_panel

See the discussion in the output of help precision for more about the difficulties in dealing with exact comparisons of numbers containing fractional parts.
Red Owl

Join Date: Nov 2016

Posts: 127
#4

02 Jan 2017, 12:43

William Lisowski I understand and appreciate that we have to consider precision in trying to identify integers, but the following three approaches seem to produce the same results in my toy data set even when one of the observations has the double precision value of 3.000000000000001.

Code:

clear
input double(testvar)
3
3.5
3.000000000000001
end
format testvar %16.15f
list _all

* Approach 1
list if mod(testvar,1) == 0

* Approach 2
list if int(testvar) == testvar

* Approach 3
list if testvar == floor(testvar)

Would you offer an example value of testvar for which Approach 2 or 3 in the code above would produce a different result from Approach 1?

Thanks.

Red Owl
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

02 Jan 2017, 13:19

Your example in #4 succeeds because all your comparisons are to whole numbers without fractional parts, which was not what I warned about in #3. And your example in #2 succeeds because the fractional part (1/2) can be represented as a terminating binary fraction (0.1 base 2). Consider the following variant of the example you provided in #2, where the fractional part (3/10) is a repeating binary fraction (0.0100110011... base 2).

Code:

. clear

. input double(testvar)

       testvar
  1. 634
  2. 629.3
  3. end

. list _all, clean

       testvar
  1.       634
  2.     629.3

. list if mod(testvar,1) != .3, clean

       testvar
  1.       634
  2.     629.3

. format testvar %21x

. list if mod(testvar,1) != .3, clean

                     testvar
  1.   +1.3d00000000000X+009
  2.   +1.3aa6666666666X+009

.

I won't go further into this here; the combination of help precision and the blog and FAQ entries surfaced by search precision go into this topic in agonizing detail.
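
For readers who want a quick workaround without the full background, here is a hedged sketch of two common ways to sidestep the exact comparison shown in the listing. The tolerance of 1e-9 is an arbitrary assumption, and the scaling approach assumes the data carry at most one decimal place.

```stata
clear
input double testvar
634
629.3
end

* Exact comparison: per the listing above, no observation matches
count if mod(testvar, 1) == .3

* Option 1: compare within a small tolerance (1e-9 is an arbitrary choice)
count if abs(mod(testvar, 1) - .3) < 1e-9

* Option 2: scale to whole numbers, round, and compare integers exactly
count if mod(round(testvar*10), 10) == 3
```

Both options should count the 629.3 observation; which one is preferable depends on how many decimal places the real data carry.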
Red Owl

Join Date: Nov 2016

Posts: 127
#6

02 Jan 2017, 15:13

William Lisowski Thanks. That's very helpful.

Red Owl
Nick Cox

Join Date: Mar 2014

Posts: 35711
#7

03 Jan 2017, 01:05

Note that the original idea of using string functions is not out of court. The (display) format is irrelevant and even the variable or storage type is no barrier to string manipulations on a string version of the variable:

Code:

keep if substr(string(stop1_panel, "%2.1f"), -2, 2) == ".5"
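
To see what the string approach is doing, the intermediate string can be stored in a variable first. This is only an illustrative sketch: the names `s` and `ends_half` are hypothetical, and the data are a cut-down version of the example in #1. Note that string() rounds to the precision of the format, so with more than one decimal place in the data a value such as 629.46 would presumably also be shown as ending in .5.

```stata
clear
input double stop1_panel
634
629.5
605.5
.
end

* String version of the number with one decimal place
generate s = string(stop1_panel, "%2.1f")

* Flag values whose last two characters are ".5", skipping missing values
generate byte ends_half = substr(s, -2, 2) == ".5" if !missing(stop1_panel)
list
```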
Lydia Palumbo

Join Date: Jun 2016

Posts: 81
#8

03 Jan 2017, 06:53

Thank you for the help. I managed to sort out my problem.