  • gen id = _n

    Dear All, consider the following case:
    Code:
    clear
    set seed 12345
    set obs 20000000
    gen id = _n
    We find that some observations, say the 1,999,925th to the 1,999,929th, do not have the corresponding `id' value. Any suggestions? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    I can't replicate this problem:

    Code:
    clear
    set seed 12345
    set obs 20000000
    gen id = _n
    
    
    . list in 1999925/1999929
    
             +---------+
             |      id |
             |---------|
    1999925. | 1999925 |
    1999926. | 1999926 |
    1999927. | 1999927 |
    1999928. | 1999928 |
    1999929. | 1999929 |
             +---------+
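    Note that unless your default numeric storage type is double, new variables are stored as float, which holds only about 7 digits of precision: integers are stored exactly only up to 16,777,216 (2^24), so larger values of _n may be rounded. To illustrate: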

    Code:
    clear
    set seed 12345
    set obs 20000000
    gen float f_id = _n
    gen double d_id = _n
    
    . count if f_id!=_n
      1,611,392
    
    . count if d_id!=_n
      0
    So you may be looking for:

    Code:
    gen double id = _n
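
    The count of mismatches above is consistent with how float storage works: a float has a 24-bit significand, so integers are exact only up to 2^24 = 16,777,216, and in the range used here odd values beyond that point get rounded to a neighbouring even number. A minimal sketch of that boundary, using the built-in float() function (the test values are chosen here purely for illustration):

    ```stata
    * float() rounds its argument to float precision
    * 2^24 itself is stored exactly
    display %12.0f float(16777216)
    * the next odd value is not representable as a float and gets rounded
    display %12.0f float(16777217)
    * the next even value is stored exactly again
    display %12.0f float(16777218)

    * with f_id created as above, no mismatches should occur at or below 2^24
    count if f_id != _n & _n <= 16777216
    ```

    If the last count is 0, all of the mismatches come from observations past 16,777,216, which is why storing _n as double (or as a sufficiently large integer type) avoids the problem.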
    Last edited by Ali Atia; 21 Jun 2022, 19:16.

    • #3
      Dear Ali, Many thanks for these helpful suggestions.
      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)

      • #4
        Ali Atia's advice is good. The optimal way to go about this is to use the c-class value -c(obs_t)-. From -help creturn-:

        Code:
        c(obs_t) returns a string equal to the optimal data type for storing _n.  This allows you to code
        
            generate `c(obs_t)' index = _n
        
        and know that index will go from 1 to _N without roundoff errors and without wasting any space.
        So all you would need is

        Code:
        gen `c(obs_t)' id = _n
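
        As a quick check of this approach, the following sketch displays the type that -c(obs_t)- selects and confirms that every id matches its observation number. The reported type depends on the number of observations in memory; for 20 million observations an integer type such as long would be expected, but that is an assumption to verify on your own installation.

        ```stata
        clear
        set obs 20000000
        * show which storage type Stata considers optimal for _n in this dataset
        display "`c(obs_t)'"
        gen `c(obs_t)' id = _n
        * should report 0 if every id equals its observation number
        count if id != _n
        ```

        Because the type is chosen from the current number of observations, it is safest to issue the generate command after the dataset has reached its final size.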
