panel data xtabs first and last occurrence

Samira Soleimanpour

Join Date: Feb 2019

Posts: 4
#1

panel data xtabs first and last occurrence

25 Feb 2019, 12:58

Hello,

I am new to the Stata community and was hoping for some help with panel data analysis. I have health care data with 'clientid' as the panel variable. However, a client may have multiple visits on the same day with different or missing data recorded on the outcome variable (surveyid provides a unique identifier for observations on the same day). Here is a sample of my dataset:

clientid date medfreq severity surveyid
1 01/02/18 1 4 12345
1 02/21/18 4 12789
1 03/08/18 12890
1 06/15/18 3 2 12990
2 01/02/18 4 12681
2 01/02/18 2 3 12688
2 03/08/18 1 3 13450
2 06/15/18 3 3 13560

I have declared my dataset as panel data without indicating the time variable since there are repeated dates, which I want to preserve. Is this the correct approach?

I want to identify the first and last occurrences that are ≥3 months apart with data recorded on "medfreq." I would then like to run a xtab to see how clients changed over time from their first to last occurrence. I didn't find any literature on this specific issue but would be grateful for any suggested code or if someone can kindly refer me to an existing reference.

Thank you,
Samira

Last edited by Samira Soleimanpour; 25 Feb 2019, 13:10.
Tags: panel data
Nick Cox

Join Date: Mar 2014

Posts: 35782
#2

25 Feb 2019, 14:13

Your example implies that your aim is not well-defined. For example, individual 2 has two observations for 1 Feb 2018. Which is to be regarded as "first"?
Comment
Samira Soleimanpour

Join Date: Feb 2019

Posts: 4
#3

25 Feb 2019, 14:23

Thank you, Nick. The "first" would have to be identified by either the first observation where the data appear in that field and/or the smaller surveyid. Does that help clarify?
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35782
#4

25 Feb 2019, 18:22

OK, but your data example doesn't make full sense as you have different numbers of values in each observation. Nevertheless this may help.

After sorting, the first and last measurements are directly available, but the trick is to put values only in one observation for each client to avoid multiple counting. That must be the same observation to allow tabulation to take place at all. Note that lag operators are out of the question here unless you have an artificial time variable, which you don't really need.

Code:

clear input clientid str8 date medfreq severity surveyid 1 "01/02/18" 1 4 12345 1 "02/21/18" 4 . 12789 1 "03/08/18" . . 12890 1 "06/15/18" 3 2 12990 2 "01/02/18" 4 . 12681 2 "01/02/18" 2 3 12688 2 "03/08/18" 1 3 13450 2 "06/15/18" 3 3 13560 end bysort clientid (surveyid) : gen first = medfreq[1] if _n == 1 by clientid : gen last = medfreq[_N] if _n == 1 tab first last | last first | 3 | Total -----------+-----------+---------- 1 | 1 | 1 2 | 1 | 1 -----------+-----------+---------- Total | 2 | 2

There is at least one extra twist. Suppose a patient is recorded only once but you don't want the last observation (of one) to be regarded as the last observation because it wasn't at a different time. Then one command should be different:

Code:

by clientid : gen last = medfreq[_N] if _n == 1 & _N > 1
Comment
Samira Soleimanpour

Join Date: Feb 2019

Posts: 4
#5

26 Feb 2019, 11:02

Hi Nick,

The coding worked, thank you! Do you know how I can get Stata to ignore missing values and only use the first (and last) occurrence where data appear in the "medfreq" field? The sample data I included is part of a larger dataset (n=2,000+) where clients often have missing values in their first and/or last visits but will have values recorded during their visits in between.

Thanks again,
Samira
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35782

26 Feb 2019, 11:20

Something like (not tested)

Code:

gen ismissing = missing(medfreq)
bysort ismissing clientid (surveyid) : gen first = medfreq[1] if _n == 1 & !ismissing  
 by ismissing clientid : gen last = medfreq[_N] if _n == 1 & !ismissing

Announcement

panel data xtabs first and last occurrence

Comment

Comment

Comment

Comment

Comment