Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging panel data and analysis

    Hello Statalisters

    I intend to analyse panel data (long format) on effects of individuals taking a particular food item "FO" over time on blood sugar. as part
    of my dissertation.

    Information on the start date and end dates are given for each particular food item for each individual. However,
    the a series of lab tests (FBS-fasting blood sugar) were taken between each start and end of each food item.
    My aim is to conduct survival analysis and also calculate the incidence rates of hyperglycemia (High blood sugar) .
    How do I merge this data so that I capture all the lab tests?


    Dataset 1: food items taken

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 id_patient str36 regimen long(food_start_date food_end_date)
    "022-5555" "     FO" 17342 18216
    "022-5555" "     FO" 18216 18596
    "022-5555" "     TO" 18596 18764
    "022-6256" "     TO" 19970 20495
    "022-6256" "     TO" 20495 20507
    "022-6256" "DEFAULT" 20507 21473
    "022-6256" "     TO" 17609 17784
    "022-6256" "     FO" 17784 18226
    "022-6256" "     FO" 18226 18301
    "022-6256" "     FO" 18301 18486
    "022-6256" "     TO" 18486 21300
    end
    format %d food_start_date
    format %d food_end_date
    Dataset 2: lab test

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 id str10 fbs_test long test_date
    "022-5555" "FBS"  8.1 17336
    "022-5555" "FBS" 10.0 17584
    "022-5555" "FBS" 13.2 17765
    "022-5555" "FBS"  4.6 18010
    "022-6256" "FBS"  3.3 16459
    "022-6256" "FBS"  7.8 17668
    "022-6256" "FBS" 15.6 18053
    "022-6256" "FBS"  6.6 18334
    "022-6256" "FBS"  7.8 18817
    "022-6256" "FBS"  8.8 19172
    "022-6256" "FBS" 10.2 19423
    "022-6256" "FBS" 13.3 19662
    "022-6256" "FBS" 15.3 19909
    "022-6256" "FBS"  9.4 19909
    end
    format %d test_date

    Thanks so much in advance

    kind regards

    Adrian Shaba
    Last edited by Adrian Shaba; 17 Jan 2019, 12:44.

  • #2
    First, somehow the -dataex- output you show for the lab data got mangled. I have revised it so it produces something useable.

    You don't say exactly what you want the results of combining these data sets to look like, but I am assuming that you would like to pair up each food observation for a given patient with any lab tests that were done between the dates the patient was consuming that food.

    This is not a -merge- in the technical sense of the word. It is a join, with range restriction. Robert Picard's -rangejoin- command is designed for just that purpose. It is available from SSC. The only real obstacle to using it is that the patient id variable needs to have the same name in both data sets.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 id_patient str36 regimen long(food_start_date food_end_date)
    "022-5555" "     FO" 17342 18216
    "022-5555" "     FO" 18216 18596
    "022-5555" "     TO" 18596 18764
    "022-6256" "     TO" 19970 20495
    "022-6256" "     TO" 20495 20507
    "022-6256" "DEFAULT" 20507 21473
    "022-6256" "     TO" 17609 17784
    "022-6256" "     FO" 17784 18226
    "022-6256" "     FO" 18226 18301
    "022-6256" "     FO" 18301 18486
    "022-6256" "     TO" 18486 21300
    end
    format %d food_start_date
    format %d food_end_date
    tempfile food_intake
    save `food_intake'
    
    
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 id str10 test_name test_result long test_date
    "022-5555" "FBS"  8.1 17336
    "022-5555" "FBS" 10.0 17584
    "022-5555" "FBS" 13.2 17765
    "022-5555" "FBS"  4.6 18010
    "022-6256" "FBS"  3.3 16459
    "022-6256" "FBS"  7.8 17668
    "022-6256" "FBS" 15.6 18053
    "022-6256" "FBS"  6.6 18334
    "022-6256" "FBS"  7.8 18817
    "022-6256" "FBS"  8.8 19172
    "022-6256" "FBS" 10.2 19423
    "022-6256" "FBS" 13.3 19662
    "022-6256" "FBS" 15.3 19909
    "022-6256" "FBS"  9.4 19909
    end
    format %d test_date
    tempfile lab_data
    rename id id_patient
    save `lab_data'
    
    use `food_intake', clear
    rangejoin test_date food_start_date food_end_date using `lab_data', by(id_patient)

    Comment

    Working...
    X