
  • Keep all the observations that have maximum value of a variable in each group

    Dear Statalist,

    Here is an example of my data; two of its variables are CaseID and Surveyyear:
    CaseID Surveyyear
    1 2004
    1 2004
    1 2004
    2 1999
    2 1999
    2 1999
    2 2000
    2 2000
    2 2000
    3 2002
    3 2002
    3 2009
    3 2010
    3 2010
    For each CaseID group that has more than one value of Surveyyear, I want to keep only the observations with the maximum value of Surveyyear. If a group has only one value of Surveyyear, I want to keep all of its observations. That is, I want to produce the following result:
    CaseID Surveyyear
    1 2004
    1 2004
    1 2004
    2 2000
    2 2000
    2 2000
    3 2010
    3 2010
    Could someone help me? I tried using 'collapse' and 'keep if _n == _N', but each keeps only one observation per group, not all the observations that share the maximum value.
    Thank you so much in advance.
    Best regards,
    Cameron.

  • #2
    Code:
    bysort CaseID: egen maxyear = max(Surveyyear)
    keep if Surveyyear == maxyear
    Best wishes

    (Stata 18.0 MP)
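
    For anyone who wants to check this against the example in #1, here is a minimal reproducible sketch. The `input` block simply retypes the posted data, and the `drop`, `list`, and `assert` lines are additions for tidying and checking, not part of the suggestion above.

    ```stata
    clear
    input CaseID Surveyyear
    1 2004
    1 2004
    1 2004
    2 1999
    2 1999
    2 1999
    2 2000
    2 2000
    2 2000
    3 2002
    3 2002
    3 2009
    3 2010
    3 2010
    end

    * Put each group's largest Surveyyear on every row of that group
    bysort CaseID: egen maxyear = max(Surveyyear)

    * Keep every row that ties for the group maximum, then drop the helper
    keep if Surveyyear == maxyear
    drop maxyear

    list, sepby(CaseID) noobs

    * 3 rows for CaseID 1, 3 for CaseID 2, 2 for CaseID 3
    assert _N == 8
    ```

    Because `egen` writes the group maximum to every row, all tied observations survive the `keep`, which is what 'keep if _n == _N' cannot do.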



    • #3
      Code:
      bysort CaseID (Surveyyear): keep if Surveyyear == Surveyyear[_N]
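
      One caveat with the sort-based approach, sketched on a small made-up dataset (these rows are hypothetical, not from #1): Stata sorts numeric missing values last, so if Surveyyear is missing anywhere in a group, `Surveyyear[_N]` is missing and only the missing rows are kept. Dropping missing years first is one possible guard, assuming rows without a survey year are not wanted.

      ```stata
      clear
      input CaseID Surveyyear
      1 2004
      1 .
      2 1999
      2 2000
      2 2000
      end

      * Assumption: rows with no survey year should be ignored
      drop if missing(Surveyyear)

      * Sort within CaseID by Surveyyear; the last row then holds the group maximum
      bysort CaseID (Surveyyear): keep if Surveyyear == Surveyyear[_N]

      list, sepby(CaseID) noobs

      * 1 row for CaseID 1 (2004), 2 rows for CaseID 2 (2000)
      assert _N == 3
      ```

      Variable names in Stata are case-sensitive, so the names in the `bysort` prefix must match the dataset exactly (CaseID and Surveyyear here).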



      • #4
        Both ways work well. Thank you guys a lot. :D
