using variable name to calculate time interval

LarsFolkestad

Join Date: Sep 2014
Posts: 165

using variable name to calculate time interval

23 May 2016, 07:50

Dear listers
I have a dataset of many dates all named
date_acl*, where the * can be between 1-N

and

H_scan_date*, where the * can be between 1-N

Now, i what to figure out if pr ID the larges scan_date is after the largest var_date

for exemple for id 7 er H_Scan_date4 later in time than date_acl5, keeping in mind that in this data set i have omitted almost all numbers for the *
and i want to compare the latest (e.g. the highest number) date for both H_Scan_date and date_acl and the numbers may vary from 1 to 15 and are not necessarily equal.
So it could be H_scan_date4 and date_acl10.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float id double(date_acl1 H_Scan_date1 date_acl5 H_Scan_date4)
 1 19698                  .     .                  .
 2 20066                  .     .                  .
 3 17749                  . 18890                  .
 4 18568                  . 19429                  .
 5 18974                  .     .                  .
 6 17664                  . 19142                  .
 7 17805 18169.400648148097 18932 19673.500787037003
 8 19481 18345.712442129603     .                  .
 9 17658 17605.401840277802     .                  .
10 17798                  . 19264                  .
11 20199                  .     .                  .
12 19542                  .     .                  .
13 19401                  .     .                  .
14 17784                  .     .                  .
15 18977                  .     .                  .
end
format %td date_acl1
format %tdnn/dd/CCYY H_Scan_date1
format %td date_acl5
format %tdnn/dd/CCYY H_Scan_date4

hope that made sense.

Lars

Tags: None

Carole J. Wilson

Join Date: Jan 2015
Posts: 932

23 May 2016, 08:47

This should give you two variables--last_scan & last_acl. The variables contain the dates of the variable with the largest * that is non-missing for H_Scan_date* and date_acl* (where * may be different for scan and acl):

Code:

gen last_acl=.
gen last_scan=.

local n=   //put the largest possible number of *
forvalues i=`n'(-1)1 {    //or replace the last 1 with the smallest value of *
    cap replace last_acl=date_acl`i' if !mi(date_acl`i') & !mi(last_acl)
    cap replace last_scan=H_Scan_date`i' if !mi(H_Scan_date`i') & !mi(last_scan)
    }

format last_acl %td
format last_scan %td

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1

Comment

LarsFolkestad

Join Date: Sep 2014

Posts: 165
#3

23 May 2016, 11:01

Dear Carole

Thank you for that. I try it out asap.

Can you help me with how to calculate the time between any two values for the * as well? For instance Maximum for * and Maximum-1 for *?

Lars
Comment
Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#4

23 May 2016, 13:02

I really think that you need to reshaping your data to long form (see help reshape). This would make life much easier.

In general, you can get the number of days between two %td date variables by just subtracting the two:

Code:

gen difference= date2-date1

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1
Comment
LarsFolkestad

Join Date: Sep 2014

Posts: 165
#5

23 May 2016, 22:55

Yes i would like to reshape to long, but have reshaped to wide to avoid M:M merging.

I know that i can substract dates directly. My problem is, tough that on id can have 15 date_acl and another can have 3, and the number og H_scan_dates varies too.

I was thinking something along the lines of:

Generating a sequence_number for each variable. Identifying the maximum number of this sequence number, and then subtracing the dates with maximum-1 sequence number from the date with the maximum.

The code seems blurry too me.
Comment
Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#6

24 May 2016, 08:17

I would suggest reshaping to long, performing calculations, then reshape to wide if you need it in that format to merge. I don't know you situation, but there is almost never a reason to perform a m:m merge.

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1
Comment

Announcement

using variable name to calculate time interval

Comment

Comment

Comment

Comment

Comment