Obtaining all data entries in one most recent row per ID

Maxence Morlet

Join Date: Mar 2021

Posts: 653
#1

Obtaining all data entries in one most recent row per ID

19 Nov 2021, 05:39

Hi all,

First of all, here is my data:

Code:

input long id float(gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK) 599518 . . 5.5 . 599518 . 5 . . 599518 . . . 5

I have several IDs, I chose to show this one for conciseness and because it represents a good example of what I would like to do.

Right now, the dataset is as follows (it is not xtset: there is no time variable):

id gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK

599518 . . 5.5 .

599518 . 5 . .

599518 . . . 5

I would like it to be like this:

id gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK

599518 . . 5.5 .

599518 . 5 . .

599518 . 5 5.5 5

So that all the information by ID is in the most recent observation (row). After that, I should enable me to run

Code:

keep if _n==_N

with all the data being contained in the last row.

I naively tried running

Code:

bysort id: replace grade*[_N]==grade*[_n-1] if grade[_N]==. & grade*[_n-1]!=.

but got the error "weights not allowed". In addition, I am not even sure that this is the right command to execute to obtain what I want.

Any ideas on how to proceed?

Many thanks!
Maxence Morlet
Tags: None
Øyvind Snilsberg

Join Date: Oct 2021

Posts: 591
#2

19 Nov 2021, 05:48

if nonmissing values don't vary by id,

Code:

foreach v of var grade* { bys id (`v'): replace `v' = `v'[1] }

Last edited by Øyvind Snilsberg; 19 Nov 2021, 05:52.
Comment

Maarten Buis

Join Date: Mar 2014
Posts: 3459

19 Nov 2021, 05:57

Code:

clear
input long id float(gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK)
599518 . . 5.5 .
599518 . 5   . .
599518 . .   . 5
end

bys id: gen order = _n
gen valid_index = .
foreach var of varlist grade* {
    replace valid_index = !missing(`var')*order
    sum valid_index, meanonly
    if r(max) > 0 {
        bys id (order): replace `var' = `var'[r(max)] if _n == _N
    }
}
drop valid_index
list

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Comment

Maxence Morlet

Join Date: Mar 2021

Posts: 653
#4

19 Nov 2021, 06:48

Thank you very much for your prompt reply!

How would your code change if nonmissing values were to vary by ID?

Many thanks!
Maxence
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35725

19 Nov 2021, 06:55

Code:

 
 sort id, stable   foreach v of var grade* {     by id: replace `v' = `v'[_n-1] if missing(`v')  }  by id : keep if _n == _N

Comment

Maxence Morlet

Join Date: Mar 2021

Posts: 653
#6

19 Nov 2021, 07:44

Many thanks to all three of you!
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35725

19 Nov 2021, 08:18

#5 should have been

Code:

sort id, stable

foreach v of var grade* {
      by id: replace `v' = `v'[_n-1] if missing(`v')
}

by id : keep if _n == _N

But note also

Code:

collapse (lastnm)

id	gradeCH_SF_HK1	gradeGP_EF_PP	gradeGP_EF_ZK1	gradeEN_ZK
599518	.	.	5.5	.
599518	.	5	.	.
599518	.	.	.	5

Announcement

Obtaining all data entries in one most recent row per ID

Comment

Comment

Comment

Comment

Comment

Comment