Hello,
I'm currently writing my bachelor's thesis, in which I have to compare two treatment groups with regard to several blood test parameters, BMI and so forth. This is my first time working with Stata (and with data of this sort). The data I'm working with is gathered automatically from the patients' electronic journals. I have provided a fabricated and simplified example below. My real dataset consists of 32 variables with 1158 observations. As far as I've understood, the data is in "long form". Each patient has their own unique ID (for example 16 and 25) and has a different number of observations depending on their past hospital visits (some patients might have 1, others 7 and so on).
I'm having some trouble figuring out how I can "categorize" each patient into their correct treatment group. Per the example, patient 16 is in the treatment groups "transdermal" and "begge" (Danish for "both") with a dosage of 350, while patient 25 is in the treatment group "transdermal" with a dosage of 200. How do I make sure those treatment groups are linked to all observations belonging to their respective patient IDs? Should I put 1/0 and dosage in every "line" belonging to each patient? And if so, won't that skew my data so it looks as if I have 1158 different patients receiving treatment (and not, say, 200 patients with different numbers of observations)?
As of right now, if I were to analyse observations in the group "transdermal", I would only include patient 16's data from 1/30/2022 and patient 25's data from 9/28/2017, right?
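To make the first question concrete, here is a minimal sketch of what I imagine "linking" the treatment group to every observation could look like. It uses the variable names from the example data below; the new variables ever_transdermal, ever_peroral, max_td_dose and first are names I made up for illustration. It also assumes a patient's treatment group does not change between visits, which may not hold in my real data.

```stata
* Sketch only: assumes treatment is constant within a patient.
* Spread the 1/0 indicators to every row of the same patient.
* (Transdermal == 1) evaluates to 0 when Transdermal is missing.
bysort DWEKBorger: egen byte ever_transdermal = max(Transdermal == 1)
bysort DWEKBorger: egen byte ever_peroral     = max(Peroral == 1)

* Carry the (largest recorded) weekly transdermal dose to every row.
bysort DWEKBorger: egen double max_td_dose = max(Transdermaldosisugentlig)

* Tag exactly one row per patient so that counts refer to patients,
* not to visits.
egen byte first = tag(DWEKBorger)
tabulate ever_transdermal if first
```

If this is roughly right, the tagged row would let me count patients per group without the 1158 rows being mistaken for 1158 patients.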
Furthermore, how would you go about visualizing this? I would be interested in visualizing each treatment group's status with regard to a certain parameter over time - for example, the LH level of patients receiving transdermal vs oral treatment over time (in the same plot). As far as my googling skills have taken me, I would have to use the xtline command with its overlay option. However, I'm not interested in the specific dates of the patient visits as a time unit - it doesn't have to be that specific, more along the lines of "visit 1, visit 2, visit 3" and so forth. Do I have to manually rewrite all dates to that form, or is there an easier way?
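For the plot, this is a rough sketch of what I have in mind, in case it helps to show the idea. It numbers each patient's visits by date and then plots the mean LH per visit number for the two groups. I am assuming that strPLH is the LH variable; visit, ever_td, ever_po and mean_lh are hypothetical names of my own, and the group definitions again assume treatment does not change within a patient.

```stata
* Sketch only: visit number = order of consultations within patient.
bysort DWEKBorger (Konsultationstart): generate visit = _n

* Patient-level group indicators, repeated on every row.
bysort DWEKBorger: egen byte ever_td = max(Transdermal == 1)
bysort DWEKBorger: egen byte ever_po = max(Peroral == 1)

* Reduce to one mean LH value per group and visit number, then plot.
preserve
collapse (mean) mean_lh = strPLH, by(ever_td ever_po visit)
twoway (connected mean_lh visit if ever_td == 1 & ever_po == 0, sort) ///
       (connected mean_lh visit if ever_po == 1 & ever_td == 0, sort), ///
       legend(order(1 "Transdermal" 2 "Peroral")) ///
       xtitle("Visit number") ytitle("Mean LH")
restore
```

I chose to plot group means because individual lines for every patient seemed too cluttered, but I don't know whether that is the sensible approach here.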
I apologize in advance if my requests are confusing or perhaps seem a little ignorant. As I've stated, this is a first for me, and I fear I might be in a little over my head :-)
I hope you get the drift of it all and are able to help - I would be eternally grateful!
Thanks in advance
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long DWEKBorger double Konsultationstart str61 Diagnose double(BHæmoglobin strPLH strPFSH) byte(Alder Transdermal) double Transdermaldosisugentlig byte Peroral double Peroraldosisdaglig byte Begge
16 1.7624736e+12 "Turners syndrom" 9.3 11.6 31 20 .   . . . .
16 1.8016992e+12 "Turners syndrom"   .    .  . 21 .   . . . .
16  1.95912e+12  "Turners syndrom" 9.2  3.3 10 22 1 350 0 . 1
25  1.822176e+12 "45,X/46,XX"        .    .  . 50 1 200 0 . 1
25  1.955232e+12 "45,X/46,XX"      9.1  5.6  . 53 .   . . . .
end
format %tcnn/dd/ccYY_hh:MM Konsultationstart