
  • Dropping lowest n values of a group variable

    Hi, my question is a bit silly, but I am just not being very bright today.

    As you can see from the dataset below, I have arranged the scores in descending order. I want a shortlist that has more women than men, so let's say I want to drop the lowest n observations, but only from among the men. How do I do it? Here women are coded 1 and men are coded 0 in scgender.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(scgender score)
    1 10
    1 9.5
    1 9.5
    1 8.5
    1 8.5
    1 8.5
    1 8.5
    1 8.25
    1 8
    0 8
    0 8
    0 8
    0 8
    1 8
    1 8
    0 7.75
    1 7.75
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 7
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    1 6.75
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    1 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    1 6.5
    0 6.5
    0 6.5
    0 6.25
    1 6.25
    1 6.25
    0 6.25
    1 6.25
    1 6.25
    1 6.25
    1 6.25
    0 6.25
    0 6.25
    0 6.25
    1 6.25
    0 6.25
    end

  • #2
    Kartikeya:
    assuming that your cut-off for male score is, say, 6.5:
    Code:
    drop if score<=6.5 & scgender==0
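    Before dropping on a cut-off, it may help to check how many men it would remove, since a cut-off fixes a score rather than a number n. The lines below are a minimal sketch, assuming the -dataex- example from #1 is in memory and that 6.5 is only an illustrative cut-off:

    ```stata
    * How many men sit at or below the candidate cut-off?
    count if score <= 6.5 & scgender == 0

    * Gender balance of the shortlist that would remain after the drop
    tabulate scgender if !(score <= 6.5 & scgender == 0)
    ```

    If the count is far from the n you had in mind, the cut-off needs adjusting before the -drop- is run.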
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      You have a major problem with ties there.
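      One way to see the ties is to tabulate the men's scores and rank them from the bottom. This is only a sketch, assuming the -dataex- example from #1 is in memory; rank_m is a hypothetical variable name introduced here for illustration:

      ```stata
      * How many men share each score?
      tabulate score if scgender == 0

      * Rank men from the lowest score upward; -track- gives tied scores
      * the same rank, with no correction for ties
      egen rank_m = rank(score) if scgender == 0, track

      * How many men sit at each rank?
      tabulate rank_m
      ```

      In the example data, 6 men share the lowest score of 6.25 and 35 men share 6.5, so for most values of n the score alone cannot decide which men are the "lowest n".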


      [Attached image: scores_ranks.png]



      • #4
        If women are 1 and men are 0 (from heroes to zeroes, oh what a sad fate), and n=50, then you can try

        Code:
        . gsort -scgender -score
        
        . drop in 50/l
        (51 observations deleted)
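        In the above example with n=50, this procedure drops all the men and then some of the women with the lowest scores. Note that -in 50/l- runs from observation 50 through the last one, so it removes 51 of the 100 observations (all 46 men and the 5 lowest-scoring women); -drop in 51/l- would remove exactly 50.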
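
        If the aim is to drop only men, the same sort can be combined with a range counted from the end of the data. The following is a minimal sketch, assuming the -dataex- example from #1 is in memory; the local macro n and its value of 10 are hypothetical choices for illustration:

        ```stata
        * Hypothetical number of lowest-scoring men to drop
        local n 10

        * Women first, then men; scores descending within each gender
        gsort -scgender -score

        * Guard: with more than the available men, the range would reach into the women
        count if scgender == 0
        assert r(N) >= `n'

        * A negative number in -in- counts from the end: these are the last n rows
        drop in -`n'/l
        ```

        Because many men share the same score, which of the tied men end up in the last n rows is arbitrary unless a tie-breaking variable is added to the -gsort- list.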
