  • Data restructuring

    I have a dataset of patients who had repeat treatments. Unfortunately, instead of each patient having a unique patient id, the identifier was assigned to each patient/treatment combination, and patients are separated from one another by a blank line.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str22 uid double id
    "231598-231591-18432187"  1
    "231598-231591-27755918"  2
    ""                        3
    "231598-231591-18673626"  4
    "231598-231591-27917427"  5
    ""                        6
    "231598-231591-18619403"  7
    "231598-231591-27611910"  8
    ""                        9
    "231598-231591-18542583" 10
    "231598-231591-27642617" 11
    end
    What I need is patient 1: id 1 and 2, patient 2: id 4 and 5, etc. (I am told that some patients have more than 2 treatments.)
    I attempted to solve this with the following code:
    Code:
    count
    local j = 1
    gen pid = .
    
    forvalues i = 1/`r(N)' {
        if uid != "" {
            replace pid = `j'
        }
        else {
            replace pid = 0
            local `j' = `j'+1
        }
    }
    My intention was that while uid is not blank, the first if clause assigns pid the value j; when uid is blank, the else clause assigns pid the value 0 and increments j by 1 for use in the next pass through the loop. This code does not work: it assigns pid = 1 for each patient.
    I would be grateful for advice on my coding error and suggestions for how to approach this problem.

    Thank you,
    Martyn

  • #2
    Code:
    gen long pid = sum(missing(uid)) + 1
    drop if missing(uid)
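
    As a rough sketch of how the running sum behaves, the lines below rebuild the example data from #1 and check the result against the grouping asked for there. The assert and list lines are additions for illustration only, and everything assumes the data are still in their original order, with one blank row between patients.

```stata
* Sketch only: example data copied from post #1
clear
input str22 uid double id
"231598-231591-18432187"  1
"231598-231591-27755918"  2
""                        3
"231598-231591-18673626"  4
"231598-231591-27917427"  5
""                        6
"231598-231591-18619403"  7
"231598-231591-27611910"  8
""                        9
"231598-231591-18542583" 10
"231598-231591-27642617" 11
end

* missing(uid) is 1 on each blank separator row and 0 elsewhere.
* sum() is a running sum down the dataset, so it steps up by 1 at
* every separator; adding 1 makes the first patient pid 1.
gen long pid = sum(missing(uid)) + 1
drop if missing(uid)

* Illustrative checks against the grouping requested in post #1
assert pid == 1 if inlist(id, 1, 2)
assert pid == 2 if inlist(id, 4, 5)
assert pid == 3 if inlist(id, 7, 8)
assert pid == 4 if inlist(id, 10, 11)
list uid id pid, sepby(pid)
```

    On the coding error in #1: the if command (as opposed to the if qualifier) is evaluated once rather than row by row, and a bare variable name in it refers to the first observation, so uid != "" is always a test of uid[1]. Each replace without a qualifier then changes every observation, and local `j' = `j'+1 would define a macro named 1 rather than update j.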



    • #3
      Code:
      gen wanted= sum(!missing(uid[_n+1])) if !missing(uid)
      Note that the code relies on a sequence of non-missing observations identifying a patient.


      ADDED IN EDIT: Counting nonmissing values, as in my code, will generate correct results only for sequences of up to 2 nonmissing observations. You want to count missings, as in Clyde's code.
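
      A small made-up illustration of that caveat, assuming a patient with three treatments. The uid values here are hypothetical and are not taken from the dataset in #1.

```stata
* Hypothetical data: patient A has three treatments, patient B has two
clear
input str2 uid
"a1"
"a2"
"a3"
""
"b1"
"b2"
end

* Counting nonmissing values, as in the code in this post
gen wanted = sum(!missing(uid[_n+1])) if !missing(uid)

* Counting blank separator rows, as in #2
gen long pid = sum(missing(uid)) + 1

* Patient A's three rows share one pid, but wanted changes within them
assert pid == 1 in 1/3
assert wanted[1] != wanted[2]
list, sepby(pid)
```

      With three or more rows per patient, the count of nonmissing values keeps increasing inside the block, so it no longer identifies the patient, whereas the count of blank rows changes only at the separators.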
      Last edited by Andrew Musau; 06 Dec 2020, 11:18.



      • #4
        Thank you both for your assistance. I think that I was overcomplicating the situation.
        Martyn
