Finding the observation number of the smallest/largest value in a data set.

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2204
#1

14 Jun 2023, 17:37

Suppose I have data on a variable x, and I want to find the observation number associated with the minimum and maximum value in the data. I know I can get the min and max easily, and then I can write a loop. Checking whether there's an easier solution.
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2406
#2

14 Jun 2023, 17:45

The trick here is to use an intermediate variable that flags the observations equal to the min/max. Be mindful, though, of what you want to happen when several observations share the min/max value. The code below stores the observation number of every tied observation, so summarizing the new variables reports the first and last of them.

Code:

sysuse auto
keep mpg
summ mpg, meanonly
gen byte which_min = cond(mpg == r(min), _n, .)
gen byte which_max = cond(mpg == r(max), _n, .)
* min and max summary statistics now indicate first and last obs number for the min and max values
summ which_min
summ which_max
Nick Cox

Join Date: Mar 2014

Posts: 35780
#3

14 Jun 2023, 17:49

Well, holding observation number in a byte will undoubtedly bite (*) you hard with a large dataset.

In general,

Code:

gen long obsno = _n
summarize foo, meanonly
levelsof obsno if foo == r(min)

and so forth.
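Spelling out the "and so forth" for both extremes, here is a minimal sketch. The auto dataset, the variable mpg, and the names obsno, xmin, xmax, at_min and at_max are illustrative assumptions, not part of the original suggestion. One detail worth noting: levelsof is itself r-class, so it replaces the results left behind by summarize, and the min and max are therefore copied into scalars first.

```stata
sysuse auto, clear
gen long obsno = _n

* summarize leaves the extremes in r(min) and r(max)
summarize mpg, meanonly

* copy them before another r-class command overwrites r()
scalar xmin = r(min)
scalar xmax = r(max)

* every observation number tied at the minimum, then at the maximum
levelsof obsno if mpg == xmin, local(at_min)
levelsof obsno if mpg == xmax, local(at_max)

display "min at obs: `at_min'"
display "max at obs: `at_max'"
```

If several observations tie, each local holds all of their observation numbers, separated by spaces.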

(*) Worst joke on Statalist since 1994.
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2406
#4

14 Jun 2023, 19:38

Originally posted by Nick Cox

Well, holding observation number in a byte will undoubtedly bite (*) you hard with a large dataset.

(*) Worst joke on Statalist since 1994.

True on both counts.
Indeed long (or c(obs_t)) is the more appropriate type.
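For what it's worth, a small sketch of the c(obs_t) variant, assuming the auto dataset and the variable name obsno purely for illustration. c(obs_t) holds the name of a storage type large enough for the current _N, so it can be substituted directly into generate.

```stata
sysuse auto, clear

* show which storage type Stata considers sufficient for _N
display "`c(obs_t)'"

* use that type for the observation-number variable
gen `c(obs_t)' obsno = _n
describe obsno
```

Since the type is chosen from the number of observations at the time generate runs, long remains the safer choice if more observations may be appended later.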
Nick Cox

Join Date: Mar 2014

Posts: 35780
#5

15 Jun 2023, 10:04

This problem was also discussed in the Stata Journal in 2006:

https://journals.sagepub.com/doi/pdf...867X0600600313

https://journals.sagepub.com/doi/pdf...867X0600600414