Group by order of event occurrence

Horia Marginean

Join Date: Jul 2025

Posts: 5
#1

08 Feb 2021, 18:32

Hi,
It may be a relatively simple problem to order dates, but I cannot find an elegant solution.
I have the dates of several events. The events can occur in any order.
For each observation, I would like to identify the order in which the events occurred, and then report n and % for each event at each position: event_1 1st n,%, event_1 2nd n,%, event_1 3rd n,%, ..., event_2 1st n,%, event_2 2nd n,%, etc.
My current solution is clumsy: I loop over each observation to identify each event's position in the sequence (1st, 2nd, 3rd, etc.).
Thank you,
Horia
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

08 Feb 2021, 20:17

I can imagine several ways your data might be organized and coded, each of which would call for different code. I don't think anyone can give you specific advice until you post some example data. Use the -dataex- command to do that.

If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Horia Marginean

Join Date: Jul 2025
Posts: 5
#3

08 Feb 2021, 23:19

Good evening Clyde,

The following is a sample of my dataset. Every date corresponds to an event.
I am using Stata version 16.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id int(dodx doctreo_s dlan_s dsx dcx)
 1 19667     . 20423 19824     .
 3 17829 18141 19905     .     .
 4 20431     . 20417 20431     .
 5 20328     . 20388     .     .
 6 20361     . 20408 20391     .
 7 13515     . 20480 14901     .
 8 19632     . 20458     . 19663
 9 20444     . 20464     .     .
10 19787     . 20492 20026     .
11 17667     . 20494 17834     .
12 20405 21686 20543     .     .
end
format %dM_d,_CY dodx
format %dM_d,_CY doctreo_s
format %dM_d,_CY dlan_s
format %dM_d,_CY dsx
format %dM_d,_CY dcx

I appreciate the help received from the Stata community,
Horia


Nick Cox

Join Date: Mar 2014

Posts: 35721
#4

09 Feb 2021, 03:38

https://www.stata-journal.com/articl...article=pr0046 may help. See the rowsort and rowranks commands.

That said, you may well be better off with a reshape long.
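For the wide-layout route, a minimal sketch using rowranks on the example variables posted earlier in the thread might look like the following. It assumes the community-contributed package is installed (search rowranks should locate it) and that generate() takes one new variable name per input variable; help rowranks is the authority on the exact syntax. The r_* names are hypothetical.

```stata
* Sketch only: rowranks is community-contributed (Stata Journal pr0046).
* Assumption: generate() takes one new variable name per input variable.
* The r_* names are hypothetical and chosen for illustration.
rowranks dodx doctreo_s dlan_s dsx dcx, generate(r_dodx r_doctreo_s r_dlan_s r_dsx r_dcx)

* How often did a given event come first, second, third, ... ?
tabulate r_dodx
tabulate r_dlan_s
```

Each tabulate call reports counts and percentages for one event at a time, so the wide layout needs one call per event.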
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#5

09 Feb 2021, 09:30

Building on Nick's suggestions, I would recommend going to long layout. While -rowsort- and -rowranks- can find the ordering you need, the reporting you want to do will be easier in long layout, as will almost anything else you want to do afterwards. Wide layout is only useful in limited circumstances in Stata, and unless you know that you are doing something that is best done wide, your default data organization should be long.

One issue in converting to long is identifying all your date variables. There are a couple of easy ways that work in your example but might not work in your real data: they are all the variables except id, and they are all the variables that start with d. A perhaps more robust way is to exploit the fact that they are dates and have a date display format. I will rely on the last of these:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id int(dodx doctreo_s dlan_s dsx dcx)
 1 19667     . 20423 19824     .
 3 17829 18141 19905     .     .
 4 20431     . 20417 20431     .
 5 20328     . 20388     .     .
 6 20361     . 20408 20391     .
 7 13515     . 20480 14901     .
 8 19632     . 20458     . 19663
 9 20444     . 20464     .     .
10 19787     . 20492 20026     .
11 17667     . 20494 17834     .
12 20405 21686 20543     .     .
end
format %dM_d,_CY dodx
format %dM_d,_CY doctreo_s
format %dM_d,_CY dlan_s
format %dM_d,_CY dsx
format %dM_d,_CY dcx

ds, has(format %d*)
local date_vars `r(varlist)'
rename (`date_vars') when=
reshape long when, i(id) j(event) string
drop if missing(when)
by id (when), sort: gen chron_order = _n

by chron_order, sort: tab event
by event, sort: tab chron_order
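The three ways of collecting the date variables described before the code could be sketched as follows. This is only an illustration against the example data: any one of the three would be run before the -rename- step, and the final display line is there just to show what was picked up.

```stata
* Sketch: three ways to collect the date variables in the example data.
* Use any ONE of them before the -rename- step.

* (a) every variable except id
ds id, not
local date_vars `r(varlist)'

* (b) every variable whose name starts with d
ds d*
local date_vars `r(varlist)'

* (c) every variable whose display format starts with %d
ds, has(format %d*)
local date_vars `r(varlist)'

* Note: the pattern %d* matches the %d... formats in this example;
* variables formatted as %td... would need a pattern such as %td*.
display "`date_vars'"
```

Options (a) and (b) depend on the naming and contents of this particular dataset, which is why they may not carry over to the real data.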

Now, I am not sure what you have in mind when you say "report event_1 1st n,%, event_1 2nd n,%, event_1 3rd n,%, ..., event_2 1st n,%, event_2 2nd n,%, etc." But I imagine it is the output of one of the two -tab- commands at the end of this code. If that's not true, please post back illustrating what you want the output to look like.
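If the goal is n and % for every event at every position in a single table, a two-way tabulation with row percentages may come closer than the two one-way tables. The following is a minimal sketch, assuming the long-layout data produced by the code above (id, event, when, chron_order). The tie-handling step is an optional addition, and chron_rank is a hypothetical variable name.

```stata
* Assumes the long-layout data created above: id, event (string), when, chron_order.

* Counts and row percentages: for each event, how often it came 1st, 2nd, 3rd, ...
tabulate event chron_order, row

* Optional: when two events share a date within an id (as for id 4 in the example),
* _n orders them arbitrarily. rank() with the track option gives tied dates the
* same rank (1, 2, 2 rather than 1, 2, 3).
egen chron_rank = rank(when), by(id) track
tabulate event chron_rank, row
```

Whether tied dates should share a position or be broken in some fixed order is a substantive choice that depends on what the events mean.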
Horia Marginean

Join Date: Jul 2025

Posts: 5
#6

09 Feb 2021, 12:43

Hi Nick, rowranks works great. If you don't mind, I would also be interested to see an example using reshape long.

Clyde, I wonder if you have a different approach?

Thank you
Horia
Nick Cox

Join Date: Mar 2014

Posts: 35721
#7

09 Feb 2021, 13:03

Clyde Schechter gave both exhortation and example on reshape long and I am not clear what else you seek.
Horia Marginean

Join Date: Jul 2025

Posts: 5
#8

09 Feb 2021, 14:17

Sorry, I missed the post; the forum page was slow to refresh.
Thank you,
Horia
