  • Problem in counting

    Dear Profs and Colleagues,

    There are variables:

    1- year: 2010-2019
    2- nacio: nationalities
    3- Expgroup: experience groups
    4- sk_rat_quintile: skill ratio quintile
    5- nacioanal: 1 if nacio == "PT", 0 otherwise (I used gen nacioanal = nacio=="PT")

    I want to count how many observations have nacioanal equal to 1 and how many have 0 within each combination of "Expgroup sk_rat_quintile wieght year". To be precise, I need to know how many 1s there are in, for example, experience group 1 and skill ratio quintile 1. I cannot see where the problem is: why does this code not count the 1s?

    As you can see, about 9.5 million observations have nacioanal equal to 1, yet the code below gives the opposite picture.
    Code:
     tab nacioanal
    
      nacioanal |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |    469,441        4.69        4.69
              1 |  9,548,256       95.31      100.00
    ------------+-----------------------------------
          Total | 10,017,697      100.00
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double year str6 nacio float Expgroup byte sk_rat_quintile
    2010 "PT" 2 3
    2010 "PT" 5 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 5 3
    2010 "PT" 3 3
    2010 "PT" 6 3
    2010 "PT" 7 3
    2010 "PT" 6 3
    2010 "PT" 7 3
    2010 "PT" 4 3
    2010 "PT" 7 3
    2010 "PT" 5 3
    2010 "PT" 2 3
    2010 "PT" 6 3
    2010 "PT" 5 3
    2010 "PT" 4 3
    2010 "PT" 7 3
    2010 "PT" 4 3
    2010 "PT" 8 3
    2010 "PT" 7 3
    2010 "PT" 5 3
    2010 "PT" 5 3
    2010 "PT" 4 3
    2010 "PT" 6 3
    2010 "PT" 8 3
    2010 "PT" 6 3
    2010 "PT" 6 3
    2010 "PT" 8 3
    2010 "PT" 6 3
    2010 "PT" 4 3
    2010 "PT" 4 3
    2010 "PT" 4 3
    2010 "PT" 3 3
    2010 "PT" 5 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 2 3
    2010 "PT" 5 3
    2010 "PT" 4 3
    2010 "PT" 3 3
    2010 "PT" 7 3
    2010 "PT" 4 3
    2010 "PT" 8 3
    2010 "PT" 4 3
    2010 "PT" 6 3
    2010 "PT" 8 3
    2010 "PT" 4 3
    2010 "PT" 8 3
    2010 "PT" 4 3
    2010 "PT" 3 3
    2010 "PT" 8 3
    2010 "PT" 7 3
    2010 "PT" 6 3
    2010 "PT" 2 3
    2010 "PT" 7 3
    2010 "PT" 6 3
    2010 "PT" 7 3
    2010 "PT" 8 3
    2010 "PT" 3 3
    2010 "PT" 7 3
    2010 "PT" 6 3
    2010 "PT" 7 3
    2010 "PT" 6 3
    2010 "PT" 8 3
    2010 "PT" 4 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "AO" 5 3
    2010 "PT" 2 3
    2010 "PT" 6 3
    2010 "PT" 5 3
    2010 "ES" 6 3
    2010 "PT" 5 3
    2010 "PT" 2 3
    2010 "PT" 8 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 4 3
    2010 "PT" 7 3
    2010 "PT" 7 3
    2010 "PT" 8 3
    2010 "PT" 4 3
    2010 "PT" 5 3
    2010 "PT" 4 3
    2010 "PT" 4 3
    2010 "PT" 6 3
    2010 "PT" 3 3
    2010 "PT" 5 3
    2010 "PT" 6 3
    2010 "PT" 5 3
    2010 "PT" 2 3
    2010 "PT" 7 3
    2010 "PT" 5 3
    2010 "PT" 3 3
    2010 "PT" 5 3
    2010 "PT" 5 3
    2010 "PT" 8 3
    end
    Code:
    egen wieght= count ( nacio), by( Expgroup sk_rat_quintile)
    collapse (sum) nacioanal , by( Expgroup  sk_rat_quintile wieght  year )
    drop if Expgroup==.
    rename nacioanal wo_nat
    gen wo_for= wieght - wo_nat
    gen shr_immg=wo_for/(wieght)
    I expected shr_immg to be about 0.05, but it comes out at about 0.9, as you can see below.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float wieght double wo_nat float(wo_for shr_immg)
    218383 24278 194105 .8888283
    218383 21758 196625 .9003677
    218383 17308 201075 .9207447
    218383 16798 201585 .9230801
    218383 18680 199703 .9144622
    218383 19054 199329 .9127496
    218383 20321 198062 .9069479
    218383 22590 195793 .8965579
    218383 24350 194033 .8884987
    218383 23610 194773 .8918872
     95775  9439  86336 .9014461
     95775  8726  87049 .9088906
     95775  7403  88372 .9227043
     95775  7143  88632  .925419
     95775  8087  87688 .9155625
     95775  7969  87806 .9167946
     95775  8718  87057 .9089742
     95775  9520  86255 .9006004
     95775 10442  85333 .8909736
     95775 10750  85025 .8877578
    177745 16050 161695 .9097021
    177745 15394 162351 .9133928
    177745 13266 164479  .925365
    177745 12818 164927 .9278855
    177745 15185 162560 .9145686
    177745 15642 162103 .9119975
    177745 17058 160687  .904031
    177745 19358 158387 .8910912
    177745 20573 157172 .8842555
    177745 21235 156510 .8805311
    113501 12194 101307 .8925648
    113501 11287 102214  .900556
    113501  9433 104068 .9168906
    113501  8909 104592 .9215073
    113501  9387 104114 .9172959
    113501  9722 103779 .9143444
    113501 10509 102992 .9074105
    113501 11074 102427 .9024326
    113501 12377 101124 .8909525
    113501 13478 100023 .8812522
     95968 10023  85945  .895559
     95968  8881  87087 .9074587
     95968  7851  88117 .9181915
     95968  7575  88393 .9210674
     95968  8208  87760 .9144715
     95968  8541  87427 .9110016
     95968  9184  86784 .9043014
     95968 10178  85790 .8939438
     95968 11004  84964 .8853368
     95968 11780  84188 .8772507
    351559 42235 309324 .8798637
    351559 38382 313177 .8908234
    351559 32153 319406 .9085417
    351559 30302 321257 .9138068
    351559 30809 320750 .9123647
    351559 29409 322150 .9163469
    351559 30123 321436 .9143159
    351559 31643 319916 .9099923
    351559 33162 318397 .9056716
    351559 32713 318846 .9069487
    175916 19406 156510  .889686
    175916 17820 158096 .8987017
    175916 15699 160217 .9107586
    175916 14735 161181 .9162384
    175916 15151 160765 .9138737
    175916 14521 161395 .9174549
    175916 14799 161117 .9158746
    175916 15470 160446 .9120603
    175916 16124 159792 .9083426
    175916 16462 159454 .9064212
    288623 29108 259515 .8991487
    288623 28747 259876 .9003995
    288623 25727 262896  .910863
    288623 24087 264536 .9165451
    288623 25380 263243 .9120652
    288623 24481 264142   .91518
    288623 25336 263287 .9122177
    288623 27194 261429 .9057802
    288623 28310 260313 .9019136
    288623 28877 259746 .8999491
    207856 24487 183369 .8821925
    207856 23051 184805 .8891011
    207856 20407 187449 .9018214
    207856 19271 188585 .9072868
    207856 19083 188773 .9081913
    207856 18135 189721 .9127521
    207856 18385 189471 .9115493
    207856 18136 189720 .9127473
    207856 18978 188878 .9086964
    207856 19782 188074 .9048284
    175349 22806 152543 .8699394
    175349 20677 154672 .8820809
    175349 18169 157180 .8963838
    175349 16443 158906  .906227
    175349 15739 159610 .9102418
    175349 14659 160690  .916401
    175349 14408 160941 .9178324
    175349 15161 160188 .9135382
    175349 15751 159598 .9101734
    175349 16511 158838 .9058392
    end
    Any ideas appreciated.

    Cheers,
    Paris

  • #2
    Well, the example data you show has 98% nacio == "PT", so it does not seem very representative of the full data set where you expect it to be about 5%.

    Putting that aside, when you write -gen shr_immg=wo_for/(wieght)-, you are calculating shr_immg as the proportion that are not nacio == "PT". So, if about 5% have nacio == "PT", then the proportion that are not will be about 95%. So it seems you are in fact getting a correct calculation for the proportion that is nacio != "PT". If you want the proportion that are nacio == "PT", then it should be -gen shr_nacio_PT = wo_nat/wieght-.
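
    A further possibility, offered as a sketch rather than a diagnosis: in the posted code, wieght is counted by(Expgroup sk_rat_quintile) across all years, whereas -collapse- sums nacioanal by year as well, so the numerator and the denominator may refer to different cells. The posted output is consistent with that reading: the ten wo_nat values that share wieght = 218383 add up to 208,747, which is about 95.6% of 218,383. The toy example below builds both counts inside the same -collapse- so that they always cover the same cells. The data values and the names one and n_cell are hypothetical and are not from the thread.

    ```stata
    * Toy data with hypothetical values: two years, one cell, 3 "PT" and 1 foreign per year
    clear
    input double year str6 nacio float Expgroup byte sk_rat_quintile
    2010 "PT" 1 1
    2010 "PT" 1 1
    2010 "PT" 1 1
    2010 "ES" 1 1
    2011 "PT" 1 1
    2011 "PT" 1 1
    2011 "PT" 1 1
    2011 "AO" 1 1
    end

    * Indicator for nationals, as in the original post
    gen byte nacioanal = (nacio == "PT")

    * Helper used only to count observations per cell
    * (assumption: rows with an empty nacio, if any, count as non-PT here)
    gen byte one = 1

    * Sum nationals and count observations over the SAME cells, year included
    collapse (sum) wo_nat=nacioanal n_cell=one, by(Expgroup sk_rat_quintile year)

    gen wo_for   = n_cell - wo_nat      // foreign workers in the cell
    gen shr_immg = wo_for / n_cell      // share of foreign workers in the cell
    list
    ```

    In this toy data each year has 3 nationals out of 4 observations, so shr_immg is 0.25 in both rows. If the cell totals should instead pool all years, dropping year from by() in -collapse- would keep the two counts aligned in the same way.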



    • #3
      Thank you Prof Clyde.
