  • Random selection between observations

    Hi everyone,
    In the dataset below, I want to randomly delete one of the observations that have the same wage in a given year. For example, I would be happy if Stata randomly dropped business a or b for year 93. Do you have any ideas?
    Thanks a lot.

    year wage business
    93 100 a
    93 100 b
    94 150 c
    95 200 d
    95 250 e
    96 300 f


  • #2
    Code:
    set seed 357993207                     // <= important
    generate double randu = runiform()
    isid randu                             // <= also important if large dataset
    bysort year wage (randu): keep if _n == 1
    Alternatively:
    Code:
    set seed 928467163
    generate double randu = runiform()
    isid year wage randu, sort
    by year wage:  keep if _n == 1
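
    A minimal sketch of the first approach applied to the example data from the original post. The `input` block and the final `drop` and `list` lines are added here for illustration only and are not part of the answer above; the variable types are assumptions.

    ```stata
    clear
    * Example data from the original post (types are assumed)
    input int year int wage str1 business
    93 100 a
    93 100 b
    94 150 c
    95 200 d
    95 250 e
    96 300 f
    end

    set seed 357993207                  // fixes the random draw so the result is reproducible
    generate double randu = runiform()  // one uniform random number per observation
    isid randu                          // stops with an error if any two draws are tied

    * Within each year-wage group, sort by the random number and keep the first observation
    bysort year wage (randu): keep if _n == 1

    drop randu
    list, noobs
    ```

    With this data, five observations should remain: either a or b for year 93, and every other observation unchanged, since no other year-wage pair is duplicated.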



    • #3
      Thank you so much. I have another question. Please look at the table below. In year 93, two observations have the same wage, so I would like to drop the observation for which the individual did not work in the previous year. That is, I want to keep the observation for business a in 93. For year 94, the choice should be random, as before.
      Many thanks.
      year wage business
      92 100 a
      93 150 a
      93 150 b
      94 200 c
      94 200 d
      94 200 h
      95 300 e
      96 350 f
      96 400 g
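
      One possible way to approach this, offered as a sketch rather than a tested answer. It assumes that "worked in the previous year" means the same business appears in the data for year - 1, and that each business appears at most once per year. The variable name `no_prev` and the `input` block are hypothetical additions for illustration.

      ```stata
      clear
      * Example data from this post (types are assumed)
      input int year int wage str1 business
      92 100 a
      93 150 a
      93 150 b
      94 200 c
      94 200 d
      94 200 h
      95 300 e
      96 350 f
      96 400 g
      end

      set seed 928467163
      generate double randu = runiform()
      isid randu

      * no_prev = 0 if the same business was observed in the previous year, 1 otherwise
      * (assumes at most one observation per business per year)
      bysort business (year): generate byte no_prev = !((_n > 1) & (year == year[_n-1] + 1))

      * Within each year-wage group, observations with no_prev == 0 sort first;
      * ties within the same no_prev value are broken by the random number
      bysort year wage (no_prev randu): keep if _n == 1

      drop randu no_prev
      list, noobs
      ```

      Under these assumptions, business a is kept for year 93 because it also appears in year 92, while the choice among c, d and h in year 94 is random because none of them appears in year 93. If several tied observations all worked in the previous year, one of them is chosen at random.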

