Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Restrict analysis to second occurrence of an event (Panel data)

    Dear Statalisters,


    I intend to analyse panel data (long format) on the effects of individuals taking a particular food item containing "FO" over time on blood sugar as part
    of my dissertation.

    I am interested in computing the incidence rates of the second occurrence of a high fasting blood sugar [FBS_cat =3 (>126mg/dl)]. Of note is that a series of FBS tests were
    done in clients at different times. However I would like to restrict my analysis to individuals with first high FBS followed by a normal FBS [ FBS_cat ==1(<100mg/dl)] .

    My aim and to conduct survival analysis for clients with a second high FBS (rebound) from the time it was recorded as normal [ FBS_cat ==1(<100mg/dl)] .


    Step 1:
    WITHIN A PANEL:
    1. How do I only keep individuals who had a first high FBS [FBS_cat =3 (>126mg/dl)] followed by a normal FBS [ FBS_cat ==1(<100mg/dl)] ?
    2 How do I exclude observations with subsequent high FBS [FBS_cat =3 (>126mg/dl)] in those with high FBS after the second?

    Step 2:
    WITHIN A PANEL:
    3. How do I also keep the highest FBS lab test date if normal FBS [ FBS_cat ==1(<100mg/dl)] is maintained after first high FBS?
    4. How do I also keep the FBS lab test date for the second high FBS for those with rebound?


    My goal is to clean the dataset to achieve censoring dates for each client and this would help me calculate the pyrs of follow-up. also i intend to generate a
    dichotomous outcome variable of high FBS rebound (yes/no). With this I can conduct Kaplan meir survival analysis )including cox proportional hazard regression modelling)

    Information on the start date and end dates are given for food item "FO" for each individual. However,
    the a series of lab tests (FBS_cat) were taken between each start and end of each food item.And the FBS test date


    Code:
    * Example generated by -dataex-. To install: ssc install dataex clear input str12 id_patient str6 Food_item long(Food_start_date Food_end_date) float FBS_cat long FBS_test_date "05-602075" "FO" 20108 21202 1 20248 "05-602075" "FO" 20108 21202 1 20615 "05-602075" "FO" 20108 21202 2 21062 "05-602075" "FO" 20108 21202 2 21090 "05-602075" "FO" 20108 21202 1 21146 "05-603889" "FO" 20370 21434 1 20496 "05-603889" "FO" 20370 21434 1 21077 "05-603889" "FO" 20370 21434 1 21406 "05-61271" "FO" 20418 20905 2 20613 "05-61271" "FO" 20418 20905 1 20761 "05-61271" "FO" 20418 20905 1 20873 "05-61788" "FO" 20360 21419 3 20360 "05-61788" "FO" 20360 21419 1 20472 "05-61788" "FO" 20360 21419 3 20668 "05-61788" "FO" 20360 21419 1 20836 "05-61788" "FO" 20360 21419 1 20948 "05-61788" "FO" 20360 21419 1 21335 "05-619584" "FO" 20214 . 1 20289 "05-619584" "FO" 20214 . 3 20660 "05-619584" "FO" 20214 . 1 20772 "05-619584" "FO" 20214 . 1 21094 "05-619584" "FO" 20214 . 1 21447 "05-620831" "FO" 20426 21452 3 20426 "05-620831" "FO" 20426 21452 1 20499 "05-620831" "FO" 20426 21452 1 20779 "05-620831" "FO" 20426 21452 1 21143 "05-620831" "FO" 20426 21452 1 21423 "05-626996" "FO" 20270 20382 1 20326 "05-628103" "FO" 20380 20492 3 20380 "05-629098" "FO" 20303 . 1 20471 "05-629098" "FO" 20303 . 1 20835 "05-629098" "FO" 20303 . 1 21199 "05-629675" "FO" 20340 20741 1 20482 "05-629675" "FO" 20340 20741 1 20692 "05-633221" "FO" 20116 20118 . . "05-633221" "FO" 20118 20285 1 20250 "05-633221" "FO" 20285 20369 . . "05-633649" "FO" 20243 20250 . . "05-633649" "FO" 20271 21453 1 20404 "05-633649" "FO" 20271 21453 3 20768 "05-633649" "FO" 20271 21453 1 20907 "05-633649" "FO" 20271 21453 1 21132 "05-637006" "FO" 20209 21441 1 20209 "05-637006" "FO" 20209 21441 1 20485 "05-637006" "FO" 20209 21441 1 20629 "05-637006" "FO" 20209 21441 1 20776 "05-637006" "FO" 20209 21441 1 21048 "05-637006" "FO" 20209 21441 1 21378 "05-638065" "FO" 20151 20172 1 20151 "05-638065" "FO" 20172 . 1 20261 "05-638065" "FO" 20172 . 1 20500 "05-638065" "FO" 20172 . 1 20744 "05-638065" "FO" 20172 . 3 20856 "05-638065" "FO" 20172 . 1 20983 "05-638065" "FO" 20172 . 1 21222 "05-643474" "FO" 20368 . 1 20396 "05-643474" "FO" 20368 . 3 20737 "05-643474" "FO" 20368 . 1 20782 "05-643474" "FO" 20368 . 1 21136 "05-643702" "FO" 20234 21353 1 20395 "05-643702" "FO" 20234 21353 2 20779 "05-643702" "FO" 20234 21353 1 20975 "05-643702" "FO" 20234 21353 1 21129 "05-644578" "FO" 20220 . 1 20388 "05-644578" "FO" 20220 . 1 20724 "05-644578" "FO" 20220 . 1 21088 "05-644578" "FO" 20220 . 3 21446 "05-645074" "FO" 20171 20178 . . "05-650681" "FO" 20194 20195 . . "05-650681" "FO" 20195 20223 3 20223 "05-650681" "FO" 20223 20635 3 20223 "05-650681" "FO" 20223 20635 3 20384 "05-650681" "FO" 20223 20635 3 20461 "05-650681" "FO" 20223 20635 3 20489 "05-650681" "FO" 20223 20635 3 20517 "05-652122" "FO" 20312 20395 . . "05-65390" "FO" 20163 20311 1 20254 "05-65390" "FO" 20311 . 2 20339 "05-65390" "FO" 20311 . 1 20473 "05-65390" "FO" 20311 . 1 20718 "05-65390" "FO" 20311 . 2 20886 "05-65390" "FO" 20311 . 1 20998 "05-65390" "FO" 20311 . 2 21306 "05-65390" "FO" 20311 . 2 21425 "05-654580" "FO" 20142 20870 1 20282 "05-654580" "FO" 20142 20870 1 20646 "05-654580" "FO" 20870 . 1 21013 "05-654580" "FO" 20870 . 1 21374 "05-654657" "FO" 20404 20516 1 20488 "05-654657" "FO" 20516 . 1 20712 "05-654657" "FO" 20516 . 1 21076 "05-654657" "FO" 20516 . 1 21195 "05-654657" "FO" 20516 . 1 21468 "05-65706" "FO" 20135 20285 3 20163 "05-65706" "FO" 20135 20285 1 20256 "05-65706" "FO" 20285 20849 3 20677 "05-65706" "FO" 20285 20849 1 20761 "05-663008" "FO" 20333 20494 . . "05-666606" "FO" 20117 . 1 20179 "05-666606" "FO" 20117 . 1 20179 end format %d Food_start_date format %d Food_end_date format %d test_date label values FBS_cat FBS_cat label def FBS_cat 1 "<100 mg/dl", modify label def FBS_cat 2 "100-125 mg/dl", modify label def FBS_cat 3 ">126 mg/dl", modify [/CODE]

  • #2
    First, somehow your -dataex- output got garbled. Please repost with a clean data example.

    Also, I'm not sure I understand what you want. Please provide worked examples for all four of your requests, explaining how you arrived at those results.



    Comment


    • #3
      I believe the dataex was copied from a previous topic and reposted here lacking the CODE tags. Here I've recopied it and included CODE tags.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str12 id_patient str6 Food_item long(Food_start_date Food_end_date) float FBS_cat long  FBS_test_date
      "05-602075" "FO"  20108 21202 1 20248
      "05-602075" "FO"  20108 21202 1 20615
      "05-602075" "FO"  20108 21202 2 21062
      "05-602075" "FO"  20108 21202 2 21090
      "05-602075" "FO"  20108 21202 1 21146
      "05-603889" "FO"  20370 21434 1 20496
      "05-603889" "FO"  20370 21434 1 21077
      "05-603889" "FO"  20370 21434 1 21406
      "05-61271"  "FO"  20418 20905 2 20613
      "05-61271"  "FO"  20418 20905 1 20761
      "05-61271"  "FO"  20418 20905 1 20873
      "05-61788"  "FO"  20360 21419 3 20360
      "05-61788"  "FO"  20360 21419 1 20472
      "05-61788"  "FO"  20360 21419 3 20668
      "05-61788"  "FO"  20360 21419 1 20836
      "05-61788"  "FO"  20360 21419 1 20948
      "05-61788"  "FO"  20360 21419 1 21335
      "05-619584" "FO"  20214     . 1 20289
      "05-619584" "FO"  20214     . 3 20660
      "05-619584" "FO"  20214     . 1 20772
      "05-619584" "FO"  20214     . 1 21094
      "05-619584" "FO"  20214     . 1 21447
      "05-620831" "FO"  20426 21452 3 20426
      "05-620831" "FO"  20426 21452 1 20499
      "05-620831" "FO"  20426 21452 1 20779
      "05-620831" "FO"  20426 21452 1 21143
      "05-620831" "FO"  20426 21452 1 21423
      "05-626996" "FO"  20270 20382 1 20326
      "05-628103" "FO"  20380 20492 3 20380
      "05-629098" "FO"  20303     . 1 20471
      "05-629098" "FO"  20303     . 1 20835
      "05-629098" "FO"  20303     . 1 21199
      "05-629675" "FO"  20340 20741 1 20482
      "05-629675" "FO"  20340 20741 1 20692
      "05-633221" "FO"  20116 20118 .     .
      "05-633221" "FO"  20118 20285 1 20250
      "05-633221" "FO"  20285 20369 .     .
      "05-633649" "FO"  20243 20250 .     .
      "05-633649" "FO"  20271 21453 1 20404
      "05-633649" "FO"  20271 21453 3 20768
      "05-633649" "FO"  20271 21453 1 20907
      "05-633649" "FO"  20271 21453 1 21132
      "05-637006" "FO"  20209 21441 1 20209
      "05-637006" "FO"  20209 21441 1 20485
      "05-637006" "FO"  20209 21441 1 20629
      "05-637006" "FO"  20209 21441 1 20776
      "05-637006" "FO"  20209 21441 1 21048
      "05-637006" "FO"  20209 21441 1 21378
      "05-638065" "FO"  20151 20172 1 20151
      "05-638065" "FO"  20172     . 1 20261
      "05-638065" "FO"  20172     . 1 20500
      "05-638065" "FO"  20172     . 1 20744
      "05-638065" "FO"  20172     . 3 20856
      "05-638065" "FO"  20172     . 1 20983
      "05-638065" "FO"  20172     . 1 21222
      "05-643474" "FO"  20368     . 1 20396
      "05-643474" "FO"  20368     . 3 20737
      "05-643474" "FO"  20368     . 1 20782
      "05-643474" "FO"  20368     . 1 21136
      "05-643702" "FO"  20234 21353 1 20395
      "05-643702" "FO"  20234 21353 2 20779
      "05-643702" "FO"  20234 21353 1 20975
      "05-643702" "FO"  20234 21353 1 21129
      "05-644578" "FO"  20220     . 1 20388
      "05-644578" "FO"  20220     . 1 20724
      "05-644578" "FO"  20220     . 1 21088
      "05-644578" "FO"  20220     . 3 21446
      "05-645074" "FO"  20171 20178 .     .
      "05-650681" "FO"  20194 20195 .     .
      "05-650681" "FO"  20195 20223 3 20223
      "05-650681" "FO"  20223 20635 3 20223
      "05-650681" "FO"  20223 20635 3 20384
      "05-650681" "FO"  20223 20635 3 20461
      "05-650681" "FO"  20223 20635 3 20489
      "05-650681" "FO"  20223 20635 3 20517
      "05-652122" "FO"  20312 20395 .     .
      "05-65390"  "FO"  20163 20311 1 20254
      "05-65390"  "FO"  20311     . 2 20339
      "05-65390"  "FO"  20311     . 1 20473
      "05-65390"  "FO"  20311     . 1 20718
      "05-65390"  "FO"  20311     . 2 20886
      "05-65390"  "FO"  20311     . 1 20998
      "05-65390"  "FO"  20311     . 2 21306
      "05-65390"  "FO"  20311     . 2 21425
      "05-654580" "FO"  20142 20870 1 20282
      "05-654580" "FO"  20142 20870 1 20646
      "05-654580" "FO"  20870     . 1 21013
      "05-654580" "FO"  20870     . 1 21374
      "05-654657" "FO"  20404 20516 1 20488
      "05-654657" "FO"  20516     . 1 20712
      "05-654657" "FO"  20516     . 1 21076
      "05-654657" "FO"  20516     . 1 21195
      "05-654657" "FO"  20516     . 1 21468
      "05-65706"  "FO"  20135 20285 3 20163
      "05-65706"  "FO"  20135 20285 1 20256
      "05-65706"  "FO"  20285 20849 3 20677
      "05-65706"  "FO"  20285 20849 1 20761
      "05-663008" "FO"  20333 20494 .     .
      "05-666606" "FO"  20117     . 1 20179
      "05-666606" "FO"  20117     . 1 20179
      end
      format %d Food_start_date
      format %d Food_end_date
      format %d test_date
      label values FBS_cat FBS_cat
      label def FBS_cat 1 "<100 mg/dl", modify
      label def FBS_cat 2 "100-125 mg/dl", modify
      label def FBS_cat 3 ">126 mg/dl", modify
      I will add to Clyde's question the following question. You write

      individuals taking a particular food item containing "FO" over time
      and in your sample data, every observation includes Food_item "FO" over a range of dates that span all the test dates for that individual.

      Now, it appears that for the purposes of the four items you want to calculate, the data on the food item and its start and end dates makes no difference. If I were to write code to try to accomplish what you ask for, I would ignore those three variables.

      But perhaps your example data does not represent the actual data to which you will apply the code suggested to you.

      In your actual data, are there individuals who have some other food item, or have more than one food item? And if there are more than one food item, with different start and end dates, are we to ignore all the observations for the other food items?

      Comment


      • #4
        Thanks so much Clyde and William for your contributions this far. Your questions have also provided me with insights of what i need to reconsider.

        Responding to the questions raised by Wiliam: let me agree with you and state that all individuals in the study were on different food combinations but I am only interested in those that were taking "FO"-containing food. I managed to isolate these individuals from the time they started to the time they stopped or if they are still on the food item (captured as missing in the food_end_date variable). My research question focuses on individuals taking FO irrespective of the other food combination in the food items they were taking.


        To provide more clarity I have manipulated the data above to provide some scenarios that may occur which do not appear in the earlier dataex sampled data. I have also exluded the food-item variable which in this case is constant for all individuals in this subset.

        In this study "Rebound is defined as having a high FBS after a normalised FBS with prior high FBS (or 2 consecutive medium high FBS {category 2} within 6months)"

        Considering the data below:

        **ALL individuals with the same start date and lab date need to be dropped since this indicates that the test was done at baseline**

        DATASET A:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str12 id_patient str6 Food_item long(Food_start_date Food_end_date) float FBS_cat long  FBS_test_date
        "05-602075" 20108 21202 1 20248
        "05-602075" 20108 21202 1 20615
        "05-602075" 20108 21202 2 21062
        "05-602075" 20108 21202 2 21090
        "05-602075" 20108 21202 1 21146
        "05-603889" 20370 21434 1 20496
        "05-603889" 20370 21434 1 21077
        "05-603889" 20370 21434 1 21406
        "05-61271"  20418 20905 3 20613
        "05-61271"  20418 20905 1 20761
        "05-61271"  20418 20905 3 20873
        "05-61788"  20360 21419 3 20360
        "05-61788"  20360 21419 1 20472
        "05-61788"  20360 21419 3 20668
        "05-61788"  20360 21419 1 20836
        "05-61788"  20360 21419 3 20948
        "05-61788"  20360 21419 1 21335
        "05-619584" 20214     . 1 20289
        "05-619584" 20214     . 3 20660
        "05-619584" 20214     . 1 20772
        "05-619584" 20214     . 1 21094
        "05-619584" 20214     . 1 21447
        "05-620831" 20426 21452 1 20426
        "05-620831" 20426 21452 2 20499
        "05-620831" 20426 21452 1 20779
        "05-620831" 20426 21452 1 21143
        "05-620831" 20426 21452 1 21423
        end
        format %d Food_start_date
        format %d Food_end_date
        format %d test_date
        label values FBS_cat FBS_cat
        label def FBS_cat 1 "<100 mg/dl", modify
        label def FBS_cat 2 "100-125 mg/dl", modify
        label def FBS_cat 3 ">126 mg/dl", modify
        After dropping these individuals "drop if Food_start_date==FBS_test_date" below is the our working dataset example:

        DATASET B:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str12 id_patient str6 Food_item long(Food_start_date Food_end_date) float FBS_cat long  FBS_test_date
        "05-602075" 20108 21202 1 20248
        "05-602075" 20108 21202 1 20615
        "05-602075" 20108 21202 2 21062
        "05-602075" 20108 21202 2 21090
        "05-602075" 20108 21202 1 21146
        "05-603889" 20370 21434 1 20496
        "05-603889" 20370 21434 1 21077
        "05-603889" 20370 21434 1 21406
        "05-61271"  20418 20905 3 20613
        "05-61271"  20418 20905 1 20761
        "05-61271"  20418 20905 3 20873
        "05-61788"  20360 21419 1 20472
        "05-61788"  20360 21419 3 20668
        "05-61788"  20360 21419 1 20836
        "05-61788"  20360 21419 3 20948
        "05-61788"  20360 21419 1 21335
        "05-619584" 20214     . 1 20289
        "05-619584" 20214     . 3 20660
        "05-619584" 20214     . 1 20772
        "05-619584" 20214     . 1 21094
        "05-619584" 20214     . 1 21447
        "05-620831" 20426 21452 2 20499
        "05-620831" 20426 21452 1 20779
        "05-620831" 20426 21452 1 21143
        "05-620831" 20426 21452 1 21423
        end
        format %d Food_start_date
        format %d Food_end_date
        format %d test_date
        label values FBS_cat FBS_cat
        label def FBS_cat 1 "<100 mg/dl", modify
        label def FBS_cat 2 "100-125 mg/dl", modify
        label def FBS_cat 3 ">126 mg/dl", modify
        With the data above my final subset is to have only those that had a subsequent high FBS reading (i. category 3) or 2 consecutive medium high FBS {ii.category 2} within 6months
        after a normalised FBS (Catergory 1)

        The "baseline date" is to be the first FBS_test_date when the FBS normalised (category 1) after the above.

        The "censoring date" is to be the data closure date (31/12/2018) if the Food_end_date== missing or the last recorded Food_end_date or the date they had (subsequent) rebound high FBS (category 3 again or 2 consecutive medium high FBS {category 2} within 6months)

        From the second data above DATA B this is what I am look forward to achieve:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str12 id_patient str6 Food_item long(baseline_date) float hiFBS_rebound long  censor_date
        
        "05-602075" 21090 0 21146
        "05-61271"  20761 0 20905
        "05-61788"  20836 1 20948
        "05-619584" 20772 0 21447
        
        end
        format %d baseline_date
        format %d censor_date
        label values hiFBS_rebound 
        label def hiFBS_rebound 0 "no", modify
        label def hiFBS_rebound 1 "yes", modify
        I believe with this subset I can calculate the the person-time for each individual if i yield somthing like this.

        Note: other individuals have been excluded since they do not meet the criteria above (have at most a single high FBS)....remaining with the 4 individuals as part of the final subset for the analysis.

        Apologies this is long but had to provide clarity on this.

        Thanks so much.

        Nelson Mandela "Education is the most powerful weapon which you can use to change the world."


        Comment


        • #5
          Well, I believe I have code that does what you say you want, but its results disagree with what you show in your third -dataex-. Looking carefully over the data in DATA B, I believe your calculations are wrong and mine are right. First, here's my code

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str12 id_patient long(Food_start_date Food_end_date) float FBS_cat long  FBS_test_date
          "05-602075" 20108 21202 1 20248
          "05-602075" 20108 21202 1 20615
          "05-602075" 20108 21202 2 21062
          "05-602075" 20108 21202 2 21090
          "05-602075" 20108 21202 1 21146
          "05-603889" 20370 21434 1 20496
          "05-603889" 20370 21434 1 21077
          "05-603889" 20370 21434 1 21406
          "05-61271"  20418 20905 3 20613
          "05-61271"  20418 20905 1 20761
          "05-61271"  20418 20905 3 20873
          "05-61788"  20360 21419 1 20472
          "05-61788"  20360 21419 3 20668
          "05-61788"  20360 21419 1 20836
          "05-61788"  20360 21419 3 20948
          "05-61788"  20360 21419 1 21335
          "05-619584" 20214     . 1 20289
          "05-619584" 20214     . 3 20660
          "05-619584" 20214     . 1 20772
          "05-619584" 20214     . 1 21094
          "05-619584" 20214     . 1 21447
          "05-620831" 20426 21452 2 20499
          "05-620831" 20426 21452 1 20779
          "05-620831" 20426 21452 1 21143
          "05-620831" 20426 21452 1 21423
          end
          format %d Food_start_date
          format %d Food_end_date
          format %d FBS_test_date
          label values FBS_cat FBS_cat
          label def FBS_cat 1 "<100 mg/dl", modify
          label def FBS_cat 2 "100-125 mg/dl", modify
          label def FBS_cat 3 ">126 mg/dl", modify
          
          forvalues i= 2/3 {
              gen fbs_cat_`i' = `i'.FBS_cat
          }
          
          sort id_patient FBS_test_date
          
          //    FIND OBSERVATIONS WITH A NORMAL FBS FOLLOWED, WITHIN 6 MONTHS (= 183 DAYS)
          //    BY AT LEAST ONE CATEGORY 3 RESULT OR AT LEAST TWO CATEGORY 2 RESULTS
          rangestat (sum) fbs_cat_2 fbs_cat_3, by(id_patient) interval(FBS_test_date 1 183)
          replace fbs_cat_2_sum = 0 if missing(fbs_cat_2_sum)
          replace fbs_cat_3_sum = 0 if missing(fbs_cat_3_sum)
          gen byte keeper = FBS_cat == 1 & (fbs_cat_2_sum > 1 | fbs_cat_3_sum > 0)
          
          //    SHOW THE OBSERVATIONS THAT WILL BE KEPT
          gen test_date = FBS_test_date
          list id_patient test_date if keeper, noobs clean nolabel
          Note: This code requires -rangestat-, written by Robert Picard, Nick Cox, and Roberto Ferrer, available from SSC.

          You can follow this by -keep if keeper- to reduce to the data set you want.

          Now let's go over how your results differ from mine.

          You show "05-602075" 21090 (I'm only showing the ID and your baseline date, which should match my test_date). But that can't be a correct baseline date, because in that observation FBS_cat is 2 not 1. In fact, no observation from 05-602075 will qualify because the category 2 observations are all more than a year after the last category 1 observation. The six-month limit is not met.

          You show "05-619584" 20772. This observation has FBS_cat 1, but the only observation with FBS_cat other than 1 for this patient occurs before date 20772 (it occurs on date 20660). So this also fails. And looking over this patient's observations in their entirety, the only FBS_cat1 that is followed by an FBS_cat > 1 is the one 20289: but that following FBS_cat 3 observation is more than a year later, so again the 6 month inclusion limit is not met.

          So either you need to review your data or your calculations, or you need to clarify the criterion you want.




          Comment

          Working...
          X