
  • keeping only some values from string variables

    Hi,
    I have a set of 24 variables recording treatments for a sample of patients. Both the patient id and the treatment variables are strings. I am trying to figure out how to keep only some specific values of those string treatment variables. Any other treatments contained in those variables I would like to delete or recode to missing.

    The problem I envisage is that sometimes a treatment I am interested in is duplicated in other treatment variables (treatments are spread over the 24 variables). I would like to keep only the set of treatments that I am interested in.

    So far, the code I have tried is as follows:

    * Define a macro to ascertain the range of treatments of interest
    local i of A B C D E F G F H

    * Run the loop over the patient id and treatment variables

    foreach num in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 {
    bys id_patient: egen disease="." if treatment_`num'!="`i'"
    replace disease="`i'" if treatment_`num'=="`i'"
    drop if disease=="."
    }

    Kind regards,
    Alberto


  • #2
    You say that you tried this code. What does that mean? What happened? Stata wouldn't get past the first time around your one loop, as the egen statement is illegal.

    The spirit of your code is to loop over treatment variables and treatments updating a single variable disease. But you only show one loop.

    You're trying to create a new variable each time round the loop, but second time around it already exists, and as said the egen syntax makes no sense here. Further, dropping observations because a particular variable contains no information sounds like something you should not want to do.
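
    To make the two-loop structure concrete, here is a minimal sketch. It assumes the aim is to blank out unwanted treatments within each treatment variable, rather than to build a single disease variable or to drop observations. The variable names treatment_01 to treatment_24 come from post #1; the codes A to H are placeholders for the treatments of interest, and the temporary flag is purely illustrative.

```stata
* Placeholder list of treatments of interest (assumed codes, adjust as needed)
local keep A B C D E F G H

* Temporary flag: 1 if the current value is a treatment of interest
tempvar wanted
generate byte `wanted' = 0

* Outer loop: the 24 treatment variables treatment_01 ... treatment_24
forvalues j = 1/24 {
    * Build the two-digit suffix 01, 02, ..., 24
    local num : display %02.0f `j'

    * Reset the flag for this variable
    replace `wanted' = 0

    * Inner loop: the treatments of interest
    foreach t of local keep {
        replace `wanted' = 1 if treatment_`num' == "`t'"
    }

    * Set every other value to string missing; no observations are dropped
    replace treatment_`num' = "" if !`wanted'
}
```

    Run this from a do-file in one go, since local macros and temporary variables do not survive between separately executed chunks. Duplicates of the same treatment across variables are left in place, so whether they need further handling depends on the outcome wanted.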

    The approach makes sense to me if and only if patients suffer from at most one disease.

    I think you need to back up and show a simplified data example with some patients, some of the treatment variables, and the outcome you want to generate.
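
    A simplified example of the kind asked for could look like the sketch below. All values are invented for illustration: three hypothetical patients, three of the 24 treatment variables, and treatment codes in which only A and B are assumed to be of interest.

```stata
* Toy data: invented patients and treatment codes, for illustration only
clear
input str3 id_patient str1 treatment_01 str1 treatment_02 str1 treatment_03
"p1" "A" "X" "A"
"p2" "Y" "B" ""
"p3" "Z" ""  ""
end

* Show the data as they stand
list, noobs
```

    With A and B as the treatments of interest, one possible target would be p1 keeping A in treatment_01 and treatment_03, p2 keeping B in treatment_02, and every other value set to missing. On Statalist, the dataex command is the usual way to post such an extract from the real dataset.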
