gen id = _n

River Huang

Join Date: Mar 2016

Posts: 1908
#1


21 Jun 2022, 19:01

Dear All, Consider the following case:

Code:

clear
set seed 12345
set obs 20000000
gen id = _n

I find that some observations, say the 1,999,925th to the 1,999,929th, do not have the matching `id' value. Any suggestions? Thanks.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)

Ali Atia

Join Date: May 2020
Posts: 737

21 Jun 2022, 19:08

I can't replicate this problem:

Code:

clear
set seed 12345
set obs 20000000
gen id = _n


. list in 1999925/1999929

         +---------+
         |      id |
         |---------|
1999925. | 1999925 |
1999926. | 1999926 |
1999927. | 1999927 |
1999928. | 1999928 |
1999929. | 1999929 |
         +---------+

Note that if your default numeric storage type is float rather than double, you will lose precision beyond about 7 digits: a float holds integers exactly only up to 16,777,216 (2^24). To illustrate:

Code:

clear
set seed 12345
set obs 20000000
gen float f_id = _n
gen double d_id = _n

. count if f_id!=_n
  1,611,392

. count if d_id!=_n
  0

So you may be looking for:

Code:

gen double id = _n
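
A rough way to see where the float copy breaks down, assuming the same 20,000,000-observation setup as above: a float has a 24-bit significand, so integers are exact only up to 2^24 = 16,777,216. The variable names are reused from the example above, and the sketch relies only on built-in commands and functions (float(), c(type)).

```stata
clear
set obs 20000000

* storage type that -generate- uses when none is specified
display "`c(type)'"

gen float  f_id = _n
gen double d_id = _n

* float() rounds its argument to float precision
* 2^24 survives the round trip ...
display %12.0f float(16777216)
* ... but 2^24 + 1 does not
display %12.0f float(16777217)

* mismatches should occur only above 2^24
count if f_id != d_id & d_id <= 2^24
count if f_id != d_id & d_id >  2^24
```

If this behaves as expected, the first count is 0 and the second reproduces the 1,611,392 mismatches reported above, since only every other integer between 2^24 and 2^25 is representable as a float.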

Last edited by Ali Atia; 21 Jun 2022, 19:16.


River Huang

Join Date: Mar 2016

Posts: 1908
#3

21 Jun 2022, 19:20

Dear Ali, Many thanks for these helpful suggestions.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)

Leonardo Guizzetti

Join Date: Jul 2016
Posts: 2457

21 Jun 2022, 19:39

Ali Atia's advice is good. The optimal way to go about this is to use the c-class value -c(obs_t)-. From -help creturn-:

Code:

    c(obs_t) returns a string equal to the optimal data type for storing _n.  This allows you to code

            generate `c(obs_t)' index = _n

        and know that index will go from 1 to _N without roundoff errors and without wasting any space.

So all you would need is

Code:

gen `c(obs_t)' id = _n
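
As a quick check, assuming the same 20,000,000-observation dataset as in post #1, one could confirm which type -c(obs_t)- resolves to and that the resulting variable matches _n everywhere. This is only a sketch built from standard commands:

```stata
clear
set obs 20000000

* the type Stata considers sufficient for _n at the current _N
display "`c(obs_t)'"

gen `c(obs_t)' id = _n

* show the storage type actually used
describe id

* stop with an error if any id differs from its observation number
assert id == _n
```

With 20 million observations this should resolve to long, which holds these integers exactly in 4 bytes per observation, versus 8 for a double.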
