How to keep observations according to several variables?

Zihan Dong

Join Date: Feb 2021

Posts: 44
#1

23 Feb 2021, 13:26

Hi,

I have a dataset that looks like the following. I want to identify observations with exactly the same id, diagnose and t (here, the two rows with id 2, ssc, t 0 and the two rows with id 3, ssc, t 0), and among them keep only the observation that appears first in the dataset.

id  diagnose  t  age
1   ssc       0  22
2   ssc       0  67
2   ibd       1  55
2   ssc       0  24
2   tb        0  78
2   tb        1  35
3   ssc       0  64
3   ssc       0  42
3   ibd       0  53

This is what I expect to get (i.e., the later duplicates, 2 ssc 0 24 and 3 ssc 0 42, should be removed):

id  diagnose  t  age
1   ssc       0  22
2   ssc       0  67
2   ibd       1  55
2   tb        0  78
2   tb        1  35
3   ssc       0  64
3   ibd       0  53

Could anyone help with this? Thank you very much in advance!

Best,
Z
Clyde Schechter

Join Date: Apr 2014

Posts: 30358
#2

23 Feb 2021, 13:44

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id str4 diagnose byte(t age)
1 "ssc " 0 22
2 "ssc " 0 67
2 "ibd " 1 55
2 "ssc " 0 24
2 "tb "  0 78
2 "tb "  1 35
3 "ssc " 0 64
3 "ssc " 0 42
3 "ibd " 0 53
end

gen long obs_no = _n
by id diagnose t (obs_no), sort: keep if _n == 1
sort obs_no

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
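For comparison, here is a minimal sketch of an alternative using the official -duplicates- command, which drops all but the first occurrence within each group of duplicates. It is not the approach in the code above, only another way to reach the same set of observations. The variable names id, diagnose, t and age are taken from the example; the data are retyped by hand here without trailing blanks in the strings, which is an assumption about the real dataset.

```stata
* Sketch only: the example data from this thread, retyped by hand
clear
input byte id str4 diagnose byte(t age)
1 "ssc" 0 22
2 "ssc" 0 67
2 "ibd" 1 55
2 "ssc" 0 24
2 "tb"  0 78
2 "tb"  1 35
3 "ssc" 0 64
3 "ssc" 0 42
3 "ibd" 0 53
end

* Optional check: report how many surplus copies exist
* for each combination of id, diagnose and t
duplicates report id diagnose t

* Keep only the first occurrence of each (id, diagnose, t) combination.
* -force- is required whenever a varlist is specified, because the
* dropped observations may differ on other variables (here, age).
duplicates drop id diagnose t, force

list, noobs
```

One design difference: the -by:- approach makes "first" explicit through obs_no, so it still works if you later want a different rule (say, keeping the youngest age in each group by replacing obs_no with age in the parentheses). -duplicates drop- is shorter but always keeps the first occurrence in the current order of the data.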
Zihan Dong

Join Date: Feb 2021

Posts: 44
#3

23 Feb 2021, 15:31

Thank you very much, Clyde! My problem is perfectly solved, and the tip about using -dataex- is also super helpful!