Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • dividing observations into groups

    dear STATA User
    Apologies if this question was already asked, I'm having trouble in dividing my data. This is what I want to do:
    1. I want to divide the variable "PERMNO" set into four equal groups, each group comprising a quarter of the data. The data is divided based on variable "total_assets" from the largest to smallest.
    2. generate new variable, firm_size, with categorical 1,2,3 and 4, based on the four equal group.


    "
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(total_asset PERMNO)
      1  69
      2  70
      3 441
      4  95
      5  90
      6 464
      7 109
      8  98
      9 224
     10 108
     11 274
     12  65
     13  67
     14 528
     15  45
     16 319
     17 424
     18 284
     19 113
     20   4
     21  93
     22  13
     23 281
     24 499
     25 315
     26 323
     27  25
     28 161
     29 199
     30 290
     31 368
     32 163
     33 188
     34 261
     35 506
     36  37
     37 482
     38 326
     39 311
     40 207
     41 478
     42 237
     43  80
     44 198
     45 485
     46 295
     47 280
     48 401
     49 279
     50 423
     51 362
     52 486
     53  96
     54 400
     55  56
     56  38
     57 136
     58 430
     59 240
     60 197
     61 531
     62 180
     63  23
     64 535
     65 239
     66 371
     67 260
     68 283
     69 480
     70  71
     71 491
     72  75
     73 135
     74 422
     75 292
     76  57
     77 357
     78 426
     79 440
     80 431
     81 417
     82 329
     83 179
     84  55
     85 350
     86 361
     87 351
     88 508
     89 182
     90  86
     91 208
     92 287
     93 174
     94 175
     95  40
     96 231
     97 205
     98 369
     99 509
    100 169
    end
    label values total_asset total_asset
    label def total_asset 1 "--", modify
    label def total_asset 2 "1003644400959490", modify
    label def total_asset 3 "101315624", modify
    label def total_asset 4 "101373000", modify
    label def total_asset 5 "102318460633088", modify
    label def total_asset 6 "1031213481984", modify
    label def total_asset 7 "1037602193408", modify
    label def total_asset 8 "1038706030936060", modify
    label def total_asset 9 "1044159856640", modify
    label def total_asset 10 "1050099200", modify
    label def total_asset 11 "1052131786752", modify
    label def total_asset 12 "105406005248000", modify
    label def total_asset 13 "10587951071232", modify
    label def total_asset 14 "105894649856", modify
    label def total_asset 15 "1063856111616", modify
    label def total_asset 16 "10683437547520", modify
    label def total_asset 17 "107160576", modify
    label def total_asset 18 "10733598277632", modify
    label def total_asset 19 "1075595264", modify
    label def total_asset 20 "1117684864", modify
    label def total_asset 21 "11359505874944", modify
    label def total_asset 22 "11377960812544", modify
    label def total_asset 23 "1142273081344", modify
    label def total_asset 24 "1142882435072", modify
    label def total_asset 25 "1150574788608", modify
    label def total_asset 26 "116375760", modify
    label def total_asset 27 "1175963435008", modify
    label def total_asset 28 "11840059867136", modify
    label def total_asset 29 "118806592", modify
    label def total_asset 30 "119437246464", modify
    label def total_asset 31 "119667792", modify
    label def total_asset 32 "1197796622336", modify
    label def total_asset 33 "1205850341376", modify
    label def total_asset 34 "1209792000", modify
    label def total_asset 35 "121215064", modify
    label def total_asset 36 "1214104469504", modify
    label def total_asset 37 "1218023129088", modify
    label def total_asset 38 "12257391017984", modify
    label def total_asset 39 "1228373622784", modify
    label def total_asset 40 "1229172506624", modify
    label def total_asset 41 "1229677568", modify
    label def total_asset 42 "12469009383424", modify
    label def total_asset 43 "12476256092160", modify
    label def total_asset 44 "1257609887744", modify
    label def total_asset 45 "12596824506368", modify
    label def total_asset 46 "1263115304960", modify
    label def total_asset 47 "1263726886912", modify
    label def total_asset 48 "1270812544", modify
    label def total_asset 49 "12779523145728", modify
    label def total_asset 50 "1288683913216", modify
    label def total_asset 51 "12922422034432", modify
    label def total_asset 52 "129799072", modify
    label def total_asset 53 "1299840106496", modify
    label def total_asset 54 "1300419969024", modify
    label def total_asset 55 "13057548877824", modify
    label def total_asset 56 "130701376", modify
    label def total_asset 57 "1314929508352", modify
    label def total_asset 58 "13185027072", modify
    label def total_asset 59 "1339032469504", modify
    label def total_asset 60 "1342118297600", modify
    label def total_asset 61 "1353634152448", modify
    label def total_asset 62 "1363641630720", modify
    label def total_asset 63 "136618852352", modify
    label def total_asset 64 "13734267322368", modify
    label def total_asset 65 "1381633359872", modify
    label def total_asset 66 "138196344111104", modify
    label def total_asset 67 "139701312", modify
    label def total_asset 68 "139809128448", modify
    label def total_asset 69 "14019294396416", modify
    label def total_asset 70 "140246496", modify
    label def total_asset 71 "1403237302272", modify
    label def total_asset 72 "14080570032128", modify
    label def total_asset 73 "1412285857792", modify
    label def total_asset 74 "1415136018432", modify
    label def total_asset 75 "14157428555776", modify
    label def total_asset 76 "14207414173696", modify
    label def total_asset 77 "1423242559488", modify
    label def total_asset 78 "14276198662144", modify
    label def total_asset 79 "1443540369408", modify
    label def total_asset 80 "1451906826240", modify
    label def total_asset 81 "145224416", modify
    label def total_asset 82 "14540108464128", modify
    label def total_asset 83 "14610863226880", modify
    label def total_asset 84 "14612273561600", modify
    label def total_asset 85 "14630080479232", modify
    label def total_asset 86 "147254256", modify
    label def total_asset 87 "14926224556032", modify
    label def total_asset 88 "149708177408", modify
    label def total_asset 89 "1504668416", modify
    label def total_asset 90 "1508215552", modify
    label def total_asset 91 "150879420416", modify
    label def total_asset 92 "15226009288704", modify
    label def total_asset 93 "1531365556224", modify
    label def total_asset 94 "1538089811968", modify
    label def total_asset 95 "1543216300032", modify
    label def total_asset 96 "1547569987584", modify
    label def total_asset 97 "1558285647872", modify
    label def total_asset 98 "1562589528064", modify
    label def total_asset 99 "1570230763520", modify
    label def total_asset 100 "1575467876352", modify
    label values PERMNO PERMNO
    label def PERMNO 4 "ABMM IJ Equity", modify
    label def PERMNO 13 "AGRO IJ Equity", modify
    label def PERMNO 23 "ALKA IJ Equity", modify
    label def PERMNO 25 "ALTO IJ Equity", modify
    label def PERMNO 37 "APOL IJ Equity", modify
    label def PERMNO 38 "ARGO IJ Equity", modify
    label def PERMNO 40 "ARNA IJ Equity", modify
    label def PERMNO 45 "ASDM IJ Equity", modify
    label def PERMNO 55 "AUTO IJ Equity", modify
    label def PERMNO 56 "BABP IJ Equity", modify
    label def PERMNO 57 "BACA IJ Equity", modify
    label def PERMNO 65 "BBKP IJ Equity", modify
    label def PERMNO 67 "BBMD IJ Equity", modify
    label def PERMNO 69 "BBNP IJ Equity", modify
    label def PERMNO 70 "BBRI IJ Equity", modify
    label def PERMNO 71 "BBRM IJ Equity", modify
    label def PERMNO 75 "BCIC IJ Equity", modify
    label def PERMNO 80 "BFIN IJ Equity", modify
    label def PERMNO 86 "BIPI IJ Equity", modify
    label def PERMNO 90 "BJBR IJ Equity", modify
    label def PERMNO 93 "BKSL IJ Equity", modify
    label def PERMNO 95 "BLTA IJ Equity", modify
    label def PERMNO 96 "BLTZ IJ Equity", modify
    label def PERMNO 98 "BMRI IJ Equity", modify
    label def PERMNO 108 "BORN IJ Equity", modify
    label def PERMNO 109 "BPFI IJ Equity", modify
    label def PERMNO 113 "BRMS IJ Equity", modify
    label def PERMNO 135 "CEKA IJ Equity", modify
    label def PERMNO 136 "CENT IJ Equity", modify
    label def PERMNO 161 "DILD IJ Equity", modify
    label def PERMNO 163 "DLTA IJ Equity", modify
    label def PERMNO 169 "DPUM IJ Equity", modify
    label def PERMNO 174 "DVLA IJ Equity", modify
    label def PERMNO 175 "DYAN IJ Equity", modify
    label def PERMNO 179 "ELTY IJ Equity", modify
    label def PERMNO 180 "EMDE IJ Equity", modify
    label def PERMNO 182 "ENRG IJ Equity", modify
    label def PERMNO 188 "ETWA IJ Equity", modify
    label def PERMNO 197 "GAMA IJ Equity", modify
    label def PERMNO 198 "GDST IJ Equity", modify
    label def PERMNO 199 "GDYR IJ Equity", modify
    label def PERMNO 205 "GLOB IJ Equity", modify
    label def PERMNO 207 "GMTD IJ Equity", modify
    label def PERMNO 208 "GOLD IJ Equity", modify
    label def PERMNO 224 "HOTL IJ Equity", modify
    label def PERMNO 231 "IDPR IJ Equity", modify
    label def PERMNO 237 "IMJS IJ Equity", modify
    label def PERMNO 239 "INAF IJ Equity", modify
    label def PERMNO 240 "INAI IJ Equity", modify
    label def PERMNO 260 "ITMA IJ Equity", modify
    label def PERMNO 261 "ITMG IJ Equity", modify
    label def PERMNO 274 "JTPE IJ Equity", modify
    label def PERMNO 279 "KBLV IJ Equity", modify
    label def PERMNO 280 "KBRI IJ Equity", modify
    label def PERMNO 281 "KDSI IJ Equity", modify
    label def PERMNO 283 "KICI IJ Equity", modify
    label def PERMNO 284 "KIJA IJ Equity", modify
    label def PERMNO 287 "KLBF IJ Equity", modify
    label def PERMNO 290 "KONI IJ Equity", modify
    label def PERMNO 292 "KPIG IJ Equity", modify
    label def PERMNO 295 "KREN IJ Equity", modify
    label def PERMNO 311 "LPPS IJ Equity", modify
    label def PERMNO 315 "MAGP IJ Equity", modify
    label def PERMNO 319 "MAPI IJ Equity", modify
    label def PERMNO 323 "MBAP IJ Equity", modify
    label def PERMNO 326 "MCOR IJ Equity", modify
    label def PERMNO 329 "MDLN IJ Equity", modify
    label def PERMNO 350 "MNCN IJ Equity", modify
    label def PERMNO 351 "MPMX IJ Equity", modify
    label def PERMNO 357 "MTFN IJ Equity", modify
    label def PERMNO 361 "MYOH IJ Equity", modify
    label def PERMNO 362 "MYOR IJ Equity", modify
    label def PERMNO 368 "NIKL IJ Equity", modify
    label def PERMNO 369 "NIPS IJ Equity", modify
    label def PERMNO 371 "NISP IJ Equity", modify
    label def PERMNO 400 "PORT IJ Equity", modify
    label def PERMNO 401 "POWR IJ Equity", modify
    label def PERMNO 417 "RAJA IJ Equity", modify
    label def PERMNO 422 "RELI IJ Equity", modify
    label def PERMNO 423 "RICY IJ Equity", modify
    label def PERMNO 424 "RIGS IJ Equity", modify
    label def PERMNO 426 "RMBA IJ Equity", modify
    label def PERMNO 430 "SAFE IJ Equity", modify
    label def PERMNO 431 "SAME IJ Equity", modify
    label def PERMNO 440 "SHID IJ Equity", modify
    label def PERMNO 441 "SHIP IJ Equity", modify
    label def PERMNO 464 "SONA IJ Equity", modify
    label def PERMNO 478 "SUGI IJ Equity", modify
    label def PERMNO 480 "SUPR IJ Equity", modify
    label def PERMNO 482 "TARA IJ Equity", modify
    label def PERMNO 485 "TBLA IJ Equity", modify
    label def PERMNO 486 "TBMS IJ Equity", modify
    label def PERMNO 491 "TIFA IJ Equity", modify
    label def PERMNO 499 "TMPI IJ Equity", modify
    label def PERMNO 506 "TPMA IJ Equity", modify
    label def PERMNO 508 "TRIL IJ Equity", modify
    label def PERMNO 509 "TRIM IJ Equity", modify
    label def PERMNO 528 "WAPO IJ Equity", modify
    label def PERMNO 531 "WIIM IJ Equity", modify
    label def PERMNO 535 "WSBP IJ Equity", modify
    "



    Thank you very much for you time,

    Anzas

  • #2
    In your example dataset there are no missing values and all values of total assets are distinct. So, this will work:


    Code:
    . sort total_asset
    
    . gen group = ceil(4 * _n/_N)
    
    . tab group
    
          group |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |         25       25.00       25.00
              2 |         25       25.00       50.00
              3 |         25       25.00       75.00
              4 |         25       25.00      100.00
    ------------+-----------------------------------
          Total |        100      100.00
    
    . scatter total_asset group
    but for a more general method check out fastxtile by Michael Stepner from SSC.

    Comment


    • #3
      Thanks Nick Cox , it's solve my problem.

      Comment


      • #4
        Originally posted by Nick Cox View Post
        In your example dataset there are no missing values and all values of total assets are distinct. So, this will work:


        Code:
        . sort total_asset
        
        . gen group = ceil(4 * _n/_N)
        
        . tab group
        
        group | Freq. Percent Cum.
        ------------+-----------------------------------
        1 | 25 25.00 25.00
        2 | 25 25.00 50.00
        3 | 25 25.00 75.00
        4 | 25 25.00 100.00
        ------------+-----------------------------------
        Total | 100 100.00
        
        . scatter total_asset group
        but for a more general method check out fastxtile by Michael Stepner from SSC.
        Hi Nick,

        I found sometimes the ceil function is not that accurate, especially for large numbers of observations (e.g., > 30000 obs).

        For example, you can take a look at the following codes:
        Code:
        clear
        use auto.dta
        
        * obs: 74
        
        expand 4
        * expand obs to 296
        
        
        expand 296
        * expand to 296 * 296 observations
        
        gen group = ceil(296 * _n/_N)
        
        tab group
        You can see that some groups have 295, and some have 297.

        Comment


        • #5
          #4 is a salutary example. But don't blame ceil(): it works with what it is given. Every now and again what is given it is a smidgen out and (the problem) ceil() rounds the wrong way.

          This is a way to explore the problem, which is our friend precision again.

          In every observation that has observation number 296(296)87616 the answer should be an exact integer, If the result is a little too low, ceil() rounds up as elsewhere intended, but if it is a little too high ceil() rounds up by almost 1 and gives the wrong answer.

          Code:
          . clear
          
          . set obs `=296^2'
          number of observations (_N) was 0, now 87,616
          
          . gen double foo = 296 * _n/_N
          
          . format foo %21x
          
          . l foo if foo != round(foo, 1) & mod(_n, 296) == 0
          
                 +-----------------------+
                 |                   foo |
                 |-----------------------|
          11544. | +1.37fffffffffffX+005 |
          12728. | +1.57fffffffffffX+005 |
          23088. | +1.37fffffffffffX+006 |
          25456. | +1.57fffffffffffX+006 |
          29304. | +1.8c00000000001X+006 |
                 |-----------------------|
          31672. | +1.ac00000000001X+006 |
          34040. | +1.cbfffffffffffX+006 |
          36408. | +1.ebfffffffffffX+006 |
          46176. | +1.37fffffffffffX+007 |
          47656. | +1.4200000000001X+007 |
                 |-----------------------|
          50912. | +1.57fffffffffffX+007 |
          52392. | +1.6200000000001X+007 |
          57128. | +1.81fffffffffffX+007 |
          58608. | +1.8c00000000001X+007 |
          61864. | +1.a1fffffffffffX+007 |
                 |-----------------------|
          63344. | +1.ac00000000001X+007 |
          68080. | +1.cbfffffffffffX+007 |
          69560. | +1.d600000000001X+007 |
          72816. | +1.ebfffffffffffX+007 |
          74296. | +1.f600000000001X+007 |
                 +-----------------------+

          Comment

          Working...
          X