Maximum value across rows

Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#1

09 Sep 2023, 16:17

Hello all,

I have the data below:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long ID double(time miles cars stops1 stops2 stops3)
1 162 5 40 27 6
1 163 7 42 32
1 164 7 43 41
1 165 2 47 48
2 162 10 71 39 7 4
2 163 11 73 42
2 164 9 78 58
2 165 6 82 61
end
format %tq time

I want to find the maximum value across the stops variables in each row. How can I do this if each ID has a different number of stops? This is just sample data; the real data has many more stops.

I would appreciate any help.

A

Last edited by Anoush Khachatryan; 09 Sep 2023, 16:26.
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#2

09 Sep 2023, 17:12

Code:

egen wanted = rowmax(stops*)
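
A minimal sketch of how this behaves when IDs have different numbers of stops, using made-up toy data (the variable names id and wanted and all values here are illustrative assumptions, not the poster's data). rowmax() ignores missing values within a row and returns missing only when every listed variable is missing:

```stata
* Toy data, invented for illustration
clear
input byte(id stops1 stops2 stops3)
1 27 6 .
2 39 7 4
3 .  . .
end

* Row-wise maximum over every variable whose name starts with "stops"
egen wanted = rowmax(stops*)
list, noobs
```

Because missing entries are skipped, rows with fewer recorded stops need no special handling.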

Anoush Khachatryan

Join Date: Sep 2021
Posts: 56
#3

09 Sep 2023, 19:04

Clyde Schechter Thank you! This code worked very well.

I have another question. Using the same data, how would I calculate the largest value by row for only stops 4, 5, and 6?

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long ID double(time miles cars stops1 stops2 stops3 stops4 stops5 stops6)
1 162 5 40 27 6  6 5 3
1 163 7 42 32
1 164 7 43 41
1 165 2 47 48
2 162 10 71 39 7 4 9 10 22
2 163 11 73 42
2 164 9 78 58
2 165 6 82 61
end
format %tq time

I would appreciate any assistance.

A


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#4

09 Sep 2023, 19:09

Code:

egen wanted2 = rowmax(stops4 stops5 stops6)

Added:
Or, on the assumption that, as in your example, these variables appear as a consecutive block in your data set:

Code:

egen wanted2 = rowmax(stops4-stops6)
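
The caveat about a consecutive block matters because a hyphenated varlist follows the order in which variables are stored, not their names. A small sketch with made-up data, in which stops5 is deliberately stored after stops6 (the names by_range and by_name are illustrative assumptions):

```stata
* Toy data, invented for illustration; note the storage order
clear
input byte(id stops4 stops6 stops5)
1 5 3 9
end

* stops4-stops6 covers only the variables stored from stops4 through stops6,
* which here leaves out stops5
egen by_range = rowmax(stops4-stops6)

* Naming each variable does not depend on storage order
egen by_name = rowmax(stops4 stops5 stops6)
list, noobs
```

Running describe beforehand shows the storage order, and order can rearrange the variables if the range form is preferred.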
Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#5

09 Sep 2023, 19:17

Clyde Schechter Thank you for your response!

I apologize, I should have explained a bit better. Since this is only a sample of my data, there are many ID values with many different stops. Due to the size of my data, I want to find the row maximum of the stops after excluding the first X stops, where X comes from an equation.

For example, my equation is X=10-7 for ID==1 and X=10-5 for ID==2. Therefore, I want to exclude the first 3 stops for ID==1 and the first 5 stops for ID==2. Is there some way I can automate this for the many ID values I have, each with a different number of stops?

Anoush

Last edited by Anoush Khachatryan; 09 Sep 2023, 19:31.
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#6

10 Sep 2023, 10:08

In principle, yes. But it cannot be done with the example data you show, because while you may have in your head some information about which stops to include for which id, there is nothing in the example data that provides this information. Please post back with a new data example (using -dataex-, of course) that contains the additional variable(s) needed to see which stops to count for each id.

Anoush Khachatryan

Join Date: Sep 2021
Posts: 56
#7

10 Sep 2023, 12:13

Sorry about that. Here is an example of the data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long ID double(time miles cars X stops1 stops2 stops3 stops4 stops5 stops6)
1 162 5 40 3 27 6 6 5 3
1 163 7 42 3
1 164 7 43 3
1 165 2 47 3
2 162 10 71 5 39 7 4 9 10 22
2 163 11 73 5
2 164 9 78 5
2 165 6 82 5
3 162 5 7 1 4 2 10 9 
3 163 6 7 1
3 164 8 3 1
3 165 6 2 1
end
format %tq time

For ID==1, X==3 so I would exclude the first three stops (27, 6, and 6) and only find the row max between 5 and 3. For ID==2, X==5 so I would exclude the first five stops, etc.
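
The rule described above can be sketched with a loop over the stops variables. This is only an illustration under stated assumptions: the data are typed with explicit missing values, miles and cars are dropped for brevity, there are exactly six stops variables (the 6 in the loop is hard-coded), and wanted is a made-up variable name.

```stata
* First observation of each ID, with missing values written explicitly
clear
input byte ID int time byte(X stops1 stops2 stops3 stops4 stops5 stops6)
1 162 3 27 6 6 5 3 .
2 162 5 39 7 4 9 10 22
3 162 1 4 2 10 9 . .
end

gen wanted = .
forvalues j = 1/6 {
    * Update the running maximum only where stop j lies beyond the first X stops;
    * max() ignores missing arguments unless all of them are missing
    replace wanted = max(wanted, stops`j') if `j' > X
}
list, noobs
```

With many stops variables, the upper bound of the loop would have to match the highest stop number in the data.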

Anoush


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#8

10 Sep 2023, 12:53

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte ID int time byte(miles cars X stops1 stops2 stops3 stops4 stops5 stops6)
1 162  5 40 3 27 6  6 5  3  .
1 163  7 42 3  . .  . .  .  .
1 164  7 43 3  . .  . .  .  .
1 165  2 47 3  . .  . .  .  .
2 162 10 71 5 39 7  4 9 10 22
2 163 11 73 5  . .  . .  .  .
2 164  9 78 5  . .  . .  .  .
2 165  6 82 5  . .  . .  .  .
3 162  5  7 1  4 2 10 9  .  .
3 163  6  7 1  . .  . .  .  .
3 164  8  3 1  . .  . .  .  .
3 165  6  2 1  . .  . .  .  .
end

isid ID time, sort
reshape long stops, i(ID time)
by ID time (_j), sort: egen wanted = total(cond(_j > X, stops, .))
reshape wide

The example data you show is not genuine -dataex- output. It does not run when used. -dataex- does not elide missing values; it explicitly represents them as . or "". In the future, use genuine -dataex- output to show example data: do not try to mock up your own and dress it up as -dataex-. In this instance, I found another way to import your data and then ran -dataex-: you can see how it looks in the code above. But it is not reasonable to expect that this sort of extra work will be undertaken in the general case.

Added: When the value of X leads to the exclusion of all non-missing values of the stops variables, this code produces 0 as the total. This is consistent with the mathematical definition of an empty sum. However, if you prefer to have that result as a missing value, add -, missing- to the end of the -egen- command.
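
Earlier in the thread the request was for a maximum, whereas the code above uses egen's total() function, which sums the retained stops (for ID 1 at time 162 that is 5 + 3 = 8). If a maximum is what is wanted, one possible variant, offered only as a sketch, keeps the same reshape structure and swaps in egen's max() function. The name wanted_max is an illustrative assumption, and the data are trimmed to two rows of the example.

```stata
* Two rows taken from the example data above
clear
input byte ID int time byte(miles cars X stops1 stops2 stops3 stops4 stops5 stops6)
1 162  5 40 3 27 6 6 5  3  .
2 162 10 71 5 39 7 4 9 10 22
end

reshape long stops, i(ID time)
* Blank out the first X stops, then take the maximum of what remains in each row
by ID time (_j), sort: egen wanted_max = max(cond(_j > X, stops, .))
reshape wide
list ID time X wanted_max, noobs
```

egen's max() ignores missing values and returns missing when nothing is left after the exclusion, so the missing option discussed for total() does not apply to it.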

Last edited by Clyde Schechter; 10 Sep 2023, 12:56.
Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#9

10 Sep 2023, 15:31

Clyde Schechter Thank you very much!! It works perfectly!