
  • Dropping observations

    Hello,

    In the below dataset:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 var1 float aus_20
    "AUS_01T02" 173.19974
    "AUS_03"     6.068735
    "AUS_05T06" 1065.3801
    "AUS_07T08" 139.02173
    "AUS_09"    12.974132
    "AUS_10T12"  218.1521
    "AUS_13T15"  45.22264
    "AUS_16"     6.384057
    "AUS_17T18" 136.45459
    "AUS_19"     353.1058
    "AUS_20"    1214.2129
    "AUS_21"     40.45226
    "AUS_22"    279.55978
    "AUS_23"    128.08464
    "AUS_24"    24.676386
    "AUS_25"    116.48367
    "AUS_26"    12.369787
    "AUS_27"    11.558052
    "AUS_28"     38.23451
    "AUS_29"     9.646472
    "AUS_30"     9.712774
    "AUS_31T33"  23.66711
    "AUS_35"     306.7819
    "AUS_36T39" 206.86343
    "AUS_41T43" 119.65153
    "AUS_45T47" 1898.5074
    "AUS_49"    461.25095
    "AUS_50"    72.309204
    "AUS_51"     51.45259
    "AUS_52"    533.84503
    "AUS_53"    31.065104
    "AUS_55T56" 175.14697
    "AUS_58T60"  78.60828
    "AUS_61"     88.03191
    "AUS_62T63" 114.50014
    "AUS_64T66"  325.7633
    "AUS_68"      40.6893
    "AUS_69T75"  603.0026
    "AUS_77T82"  293.4053
    "AUS_84"     283.8955
    "AUS_85"     24.83051
    "AUS_86T88" 137.57274
    "AUS_90T93"   7.20997
    "AUS_94T96"  27.93785
    "AUS_97T98"         0
    "AUT_01T02"   .003622
    "AUT_03"            0
    "AUT_05T06"   .004057
    "AUT_07T08"   .010179
    "AUT_09"     6.00e-06
    "AUT_10T12"   .087225
    "AUT_13T15"   .040663
    "AUT_16"      .021726
    "AUT_17T18"    .14133
    "AUT_19"      .092674
    "AUT_20"     1.062999
    "AUT_21"      .084466
    "AUT_22"      .314744
    "AUT_23"      .111977
    "AUT_24"      .101737
    "AUT_25"       .28396
    "AUT_26"      .011893
    "AUT_27"      .026018
    "AUT_28"      .293965
    "AUT_29"      .013614
    "AUT_30"      .000609
    "AUT_31T33"    .01695
    "AUT_35"      .003598
    "AUT_36T39"   .003891
    "AUT_41T43"   .010193
    "AUT_45T47"   .433097
    "AUT_49"      .289404
    "AUT_50"      .001737
    "AUT_51"       .01264
    "AUT_52"      .090495
    "AUT_53"      .003376
    "AUT_55T56"   .001819
    "AUT_58T60"   .010415
    "AUT_61"      .010002
    "AUT_62T63"   .016576
    "AUT_64T66"    .01102
    "AUT_68"      .001636
    "AUT_69T75"   .050557
    "AUT_77T82"   .013966
    "AUT_84"      .001904
    "AUT_85"      .000831
    "AUT_86T88"   .001723
    "AUT_90T93"   .000819
    "AUT_94T96"   .000823
    "AUT_97T98"         0
    end
    I want to drop the observations from AUS_24 to AUS_97T98, and then from AUT_24 to AUT_97T98, in var1. The dataset contains other countries that need the same deletion. Is there a way to drop them all at once?

    Thanks



  • #2
    I don't know whether a range can be applied to strings directly, but you can extract the 5th and 6th characters as a number and then drop the observations where that number falls between 24 and 97:

    Code:
    gen temp_index = real(substr(var1, 5, 2))
    drop if inrange(temp_index, 24, 97)
    drop temp_index
    Results:

    Code:
         +----------------------+
         |      var1     aus_20 |
         |----------------------|
      1. | AUS_01T02   173.1997 |
      2. |    AUS_03   6.068735 |
      3. | AUS_05T06    1065.38 |
      4. | AUS_07T08   139.0217 |
      5. |    AUS_09   12.97413 |
      6. | AUS_10T12   218.1521 |
      7. | AUS_13T15   45.22264 |
      8. |    AUS_16   6.384057 |
      9. | AUS_17T18   136.4546 |
     10. |    AUS_19   353.1058 |
     11. |    AUS_20   1214.213 |
     12. |    AUS_21   40.45226 |
     13. |    AUS_22   279.5598 |
     14. |    AUS_23   128.0846 |
         |----------------------|
     15. | AUT_01T02    .003622 |
     16. |    AUT_03          0 |
     17. | AUT_05T06    .004057 |
     18. | AUT_07T08    .010179 |
     19. |    AUT_09   6.00e-06 |
     20. | AUT_10T12    .087225 |
     21. | AUT_13T15    .040663 |
     22. |    AUT_16    .021726 |
     23. | AUT_17T18     .14133 |
     24. |    AUT_19    .092674 |
     25. |    AUT_20   1.062999 |
     26. |    AUT_21    .084466 |
     27. |    AUT_22    .314744 |
     28. |    AUT_23    .111977 |
         +----------------------+



    • #3
      Ken Chui, inrange() supports string arguments:

      Code:
      . count if inrange(var, "AUS_24", "AUS_97T98") | inrange(var, "AUT_24", "AUT_97T98") 
        62

      Whether a given range catches what you want to catch, and nothing else, is an empirical question, so check with count or list before dropping anything.
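
      To avoid writing one inrange() call per country, the same string comparison could be applied to the part of var1 after the country prefix. This is only a sketch: it assumes every value of var1 is a three-letter country code, an underscore, and then a sector code, and the variable name sector is made up for illustration.

      ```stata
      * Assumption: var1 looks like "AUS_24" or "AUS_69T75" for every country.
      * Take everything from the 5th character onward (the sector code).
      generate sector = substr(var1, 5, .)

      * Strings compare in sort order, so this spans "24" through "97T98".
      * Inspect what would be dropped before dropping it.
      count if inrange(sector, "24", "97T98")
      list var1 if inrange(sector, "24", "97T98")

      drop if inrange(sector, "24", "97T98")
      drop sector
      ```

      If some country uses sector codes of a different length or layout, the count and list steps are where that would show up.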



      • #4
        Thank you, Ken and Nick. I tried it with the split command and it worked.
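
        The split commands actually used are not shown in the thread. One way it might look, assuming var1 always has the form country, underscore, sector code, and using the hypothetical stub part and variable sector_start:

        ```stata
        * Sketch only, not the poster's actual code.
        * Split var1 at the underscore: part1 = country (e.g. "AUS"),
        * part2 = sector code (e.g. "69T75").
        split var1, parse("_") generate(part)

        * The first two characters of the sector code give its starting number.
        generate sector_start = real(substr(part2, 1, 2))

        * Drop sectors 24 through 97T98 for every country in one step.
        drop if inrange(sector_start, 24, 97)
        drop part1 part2 sector_start
        ```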
