I really need help! With the expert guidance of a statistician, I combined a baseline data set with three registry-based data sets using the append command. My goal now is to keep only the registry-based HbA1c tests (dated from 01jan2008 to 31dec2020) that were performed more than 1 year (364.25 days) after each participant's original HbA1c test, which was done on a date between 01oct2008 and 01apr2010.
Before appending, the original study's data set was labeled as baseline = 1 and had the following variables as columns in long format: ID_number, the HbA1c test date (status_date), the HbA1c value (HbA1c_mmolmol), the year the HbA1c test was done (y_status_date), the absolute number of days between each participant's birthday and the date of their HbA1c test (new_diff_days), baseline (coded 1 for the original study), and some string variables. The registry-based data was labeled as baseline = 0 and has all of the same variable names and labels as the original study, but without any string variables. My question: what code should I use to keep only the registry-based (baseline == 0) HbA1c values and test dates that were done more than 1 year (364.25 days) after each participant's baseline HbA1c test date?
[CODE]
* Example generated by -dataex-. For more info, type help dataex
clear
input double(ID_number status_date HbA1c_mmolmol) float(y_status_date new_diff_days baseline)
000000000 17875 61 2008 343 0
111111111 17979 61 2009 81 1
222222222 18071 66 2009 173 0
333333333 18281 64 2017 18 0
444444444 18788 62 2019 160 0
555555555 19025 59 2009 32 1
end
[/CODE]
The above dataex example is a dummy data set. It is not my real data.
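To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes each participant has exactly one baseline == 1 row and that ID_number identifies the same person across the appended data sets. The toy data, the local macro min_days, and the new variables baseline_date and days_since_baseline are all hypothetical names introduced for illustration. The threshold is kept at 364.25 days as stated above; note that an average calendar year is 365.25 days, so the local may need changing.

[CODE]
* Hypothetical toy data: two participants, each with one baseline row
* (baseline == 1) and two registry rows (baseline == 0)
clear
input double(ID_number status_date HbA1c_mmolmol) float(baseline)
1 17979 61 1
1 18100 63 0
1 18400 60 0
2 18200 59 1
2 17900 62 0
2 18566 58 0
end
format %td status_date

* Threshold in days, as stated in the question (assumption: change if needed)
local min_days = 364.25

* Copy each participant's baseline test date onto all of that participant's rows;
* cond() returns missing for registry rows, and egen's max() ignores missings
bysort ID_number: egen double baseline_date = max(cond(baseline == 1, status_date, .))
format %td baseline_date

* Days elapsed between the baseline test and each test
gen double days_since_baseline = status_date - baseline_date

* Keep only registry rows more than min_days after baseline.
* The !missing() check matters: Stata treats missing as larger than any number,
* so participants without a baseline row would otherwise pass the > comparison.
keep if baseline == 0 & !missing(baseline_date) & days_since_baseline > `min_days'

list ID_number status_date baseline_date days_since_baseline, sepby(ID_number)
[/CODE]

To retain the baseline rows alongside the qualifying registry rows, the condition could instead begin with baseline == 1 | followed by the registry condition in parentheses.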