  • Selection by zone and rank


    Hi,

    I want to select 10 districts out of 24 by zone and a rank variable. There are 5 zones in the data. Within each zone I want to select the 2 districts with the lowest and the highest rank value (the poorest performing and the best performing districts). A sample data set is pasted below. Could you kindly suggest some code? I would be grateful.

    With Thanks
    Harish


    district             children(6-59 months)  children(5-9 years)  adolescents(10-19 years)  pregnant_women  average_value  rank  zone                        zonecode
    Pashchimi Singhbhum  5.4                    12.5                 16.3                      86.2            30.1           13    Kolhan division             1
    Purbi Singhbhum      1.0                    3.3                  23.2                      88.0            28.9           15    Kolhan division             1
    Saraikela            3.2                    3.0                  24.1                      81.0            27.8           17    Kolhan division             1
    Dhanbad              1.4                    13.4                 95.0                      67.5            44.3           2     North Chotanagpur division  2
    Kodarma              4.5                    10.3                 46.4                      73.5            33.7           7     North Chotanagpur division  2
    Bokaro               1.4                    11.9                 44.5                      71.5            32.3           8     North Chotanagpur division  2
    Giridih              1.7                    10.3                 11.6                      90.6            28.6           16    North Chotanagpur division  2
    Ramgarh              4.0                    0.0                  18.1                      86.3            27.1           19    North Chotanagpur division  2
    Chatra               2.2                    0.0                  9.0                       95.0            26.6           20    North Chotanagpur division  2
    Hazaribagh           3.1                    0.0                  0.0                       95.0            24.5           22    North Chotanagpur division  2
    Palamu               0.4                    0.0                  39.9                      87.0            31.8           10    Palamu division             3
    Latehar              3.1                    0.6                  10.6                      95.0            27.3           18    Palamu division             3
    Garhwa               1.8                    0.2                  4.2                       95.0            25.3           21    Palamu division             3
    Dumka                8.4                    30.7                 39.2                      86.8            41.3           3     Santhal Pargana division    4
    Deoghar              1.4                    5.1                  64.6                      69.3            35.1           4     Santhal Pargana division    4
    Godda                1.5                    5.1                  46.3                      87.0            35.0           5     Santhal Pargana division    4
    Pakur                1.1                    16.6                 23.6                      79.5            30.2           12    Santhal Pargana division    4
    Jamtara              2.4                    0.0                  0.0                       77.6            20.0           23    Santhal Pargana division    4
    Sahibganj            1.9                    0.0                  0.0                       70.8            18.2           24    Santhal Pargana division    4
    Lohardaga            5.8                    35.7                 76.4                      92.6            52.6           1     South Chotanagpur division  5
    Simdega              0.9                    1.9                  38.1                      95.0            34.0           6     South Chotanagpur division  5
    Khunti               7.7                    3.3                  36.0                      81.4            32.1           9     South Chotanagpur division  5
    Ranchi               0.7                    18.4                 52.2                      55.0            31.6           11    South Chotanagpur division  5
    Gumla                3.2                    9.2                  56.8                      49.2            29.6           14    South Chotanagpur division  5

  • #2
    Hi Harish,

    If I understand you correctly, you would like to find out which districts are ranked highest and lowest within each zone.


    Try this.

    Code:
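    * Rank districts within each zone using the existing rank variable
    bysort zone: egen group_rank = rank(rank)

    * Create a numeric, labelled version of the string variable zone
    encode zone, gen(zone_coded)

    * List every district with its within-zone rank, one zone at a time
    levelsof zone_coded, local(divs)
    foreach d of local divs {
        di "Rankings of districts in Zone `d'"
        list zone district group_rank if zone_coded == `d'
    }
    This should give you a printout of the districts in every zone, with a group rank attached.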

    Best,
    Lakshman
    Last edited by Lakshman Balaji; 27 Nov 2019, 09:34.



    • #3
      Another way to do this would be:

      Code:
      
      * Rank districts within each zone, then find the lowest and highest rank per zone
      bysort zone: egen group_rank = rank(rank)
      bysort zone: egen Minimum = min(group_rank)
      bysort zone: egen Maximum = max(group_rank)

      * Flag the districts holding the minimum and maximum rank in their zone
      gen samemin = (group_rank == Minimum)
      gen samemax = (group_rank == Maximum)

      * Numeric, labelled version of zone for looping
      encode zone, gen(zone_coded)
      levelsof zone_coded, local(divs)

      foreach d of local divs {
          di "Minimum ranked district in Zone `d'"
          list zone district group_rank if zone_coded == `d' & samemin == 1
          di "Maximum ranked district in Zone `d'"
          list zone district group_rank if zone_coded == `d' & samemax == 1
      }

      Best,
      Lakshman




      • #4
        Thanks Lakshman!

        The second block of code works for my purpose. I am not generating any other rank variable, just using the one that is already in my data.
        I was using the 'sample' command in Stata, and it left only the selected districts in my Data Browser. I was wondering whether the same thing can be done here.

        Thanks
        Harish



        • #5
          Harish,

          I too used the same rank variable that was already there in your data. The group_rank variable that I created was just to help identify the first and last ranking districts within zones.

          Yes, this code should work on the entire dataset as well. The printed output with the messages might be too long, though. In that case, I would recommend subsetting the data: use the samemin column to get a dataset of all the minimum ranked districts, use the samemax column to get a dataset of all the maximum ranked districts, and then join the two datasets.

          Best,
          Lakshman
          Last edited by Lakshman Balaji; 28 Nov 2019, 09:50.
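
          The subsetting idea in #5 could be sketched roughly as follows. This is only a sketch: it assumes samemin and samemax already exist as created in #3, and it uses a single keep condition in place of building and joining two separate datasets. Because keep drops every other observation from memory, the steps are wrapped in preserve/restore so the full data come back afterwards.

          ```stata
          * Sketch only: assumes samemin and samemax exist as created in post #3.
          preserve

          * Keep only the lowest and highest ranked district of each zone
          keep if samemin == 1 | samemax == 1

          * Show the selected districts grouped by zone
          sort zone rank
          list zone district rank, sepby(zone)

          * Bring back the full dataset
          restore

          * Alternative without helper variables (not run here): sort by rank
          * within zone and keep the first and last observation of each zone.
          * bysort zone (rank): keep if _n == 1 | _n == _N
          ```

          With the sample data in #1 this should leave 10 districts, two per zone. Replacing list with browse before restore would show the same selection in the Data Browser. If rank had ties or missing values within a zone, more or different districts could be kept, so that case would need checking.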



          • #6
            Thanks Lakshman!
