
  • Sorting by group + time

    Hi all, I have this data:

    newid = Patientid
    admindate = Date admitted into hospital
    Procedureid = The code for the surgery
    indexno = 1 if the patient had a procedure (recorded in Procedureid), otherwise missing

    Aim: To sort the data by patient ID, episode number and admission date.
    (Note: the episode number, which I generate later, is a sequential number given to each admission of a patient.)
    So within each patient I would like to see the admissions in ascending order of admission date, with the earliest admission (when the patient was youngest) first.


    To help you understand, this is what I would like to see (for demonstration purposes):

    Patient A - Date of Admission 1 Mar 1980 - Episode 1 (n=1)
    Patient A - Date of Admission 1 Jun 1990 - Episode 2 (n=2)

    Patient B - Date of Admission 1 Apr 1981 - Episode 1 (n=1)
    Patient B - Date of Admission 4 Mar 1985 - Episode 2 (n=2)




    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10(AdmissionDate ReleaseDate) str3 newid float(admindate Procedureid indexno episodeno)
    "5 Apr 1960" "7 Apr 1960" "A1"   95 123 1 1
    "1 Apr 1960" "5 Apr 1960" "A1"   91   . . 2
    "1 Nov 1960" "1 Nov 1960" "M22" 305 124 1 1
    "1 Feb 1960" "4 Feb 1960" "N1"   31 125 1 1
    "1 Jan 1960" "4 Jan 1960" "Z2"    0 126 . 1
    end
    format %td admindate

    Code used

    Step 1: Convert the string dates to Stata dates and display them in readable form
    gen admindate = date(AdmissionDate, "DMY")

    format admindate %td

    Step 2:
    Generate a sequence number (_n) for each time the patient (newid) was admitted into hospital


    sort newid
    bys newid (Procedureid indexno admindate): gen episodeno = _n



    The issue is that this does not sort by admindate: as you can see, 5 Apr 1960 comes before 1 Apr 1960.

    Can you explain why this is happening?
    I have looked at several posts to try to understand this and have checked my data several times,
    but I cannot work it out.





    Last edited by Martin Imelda Borg; 29 Sep 2022, 06:13.

  • #2
    bys newid (Procedureid indexno admindate)
    Within the groups defined by newid, the sorting variables are Procedureid, indexno and admindate. A variable further to the left takes precedence over one further to the right. In this case, Stata sorts by Procedureid and indexno before it sorts by admindate, and the non-missing values (123 and 1) sort before missing.
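
    A minimal sketch of one way to get the order asked for in #1, assuming episodes should be numbered purely by admission date within each patient. The data are a subset of the -dataex- example above; dropping Procedureid and indexno from the parentheses is a suggestion, not something stated in the thread.

    ```stata
    * Subset of the example data from post #1 (string date columns omitted)
    clear
    input str3 newid float(admindate Procedureid indexno)
    "A1"   95 123 1
    "A1"   91   . .
    "M22" 305 124 1
    "N1"   31 125 1
    "Z2"    0 126 .
    end
    format %td admindate

    * admindate is the only variable in parentheses, so it alone
    * determines the order of observations within each newid
    bysort newid (admindate): gen episodeno = _n

    list newid admindate Procedureid episodeno, sepby(newid)
    ```

    With this, the 1 Apr 1960 admission of patient A1 becomes episode 1 and the 5 Apr 1960 admission becomes episode 2.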



    • #3
      You have set up a hierarchy of sort keys: first newid, then Procedureid, then indexno, and only then admindate. Observations are therefore sorted by admindate only within tied values of newid, Procedureid and indexno. Within the two observations for newid A1, Procedureid and indexno are each missing for one observation, so there are no ties, and the observation with the missing values is sorted after the one without, whatever their admission dates. Remember that, in Stata, a missing value of a numeric variable, such as Procedureid or indexno, always sorts after any non-missing value.
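
      The hierarchy described above can be reproduced on the two A1 observations alone. This is only an illustrative sketch using the values from #1; it runs the original command unchanged to show which key decides the order.

      ```stata
      * The two A1 observations from post #1, entered in date order
      clear
      input str3 newid float(admindate Procedureid indexno)
      "A1" 91   . .
      "A1" 95 123 1
      end
      format %td admindate

      * Original command from post #1: within newid, Procedureid is compared
      * first, and 123 sorts before missing, so admindate is never consulted
      bysort newid (Procedureid indexno admindate): gen episodeno = _n

      list, sepby(newid)
      ```

      After the sort, the 5 Apr 1960 row (Procedureid 123) is listed first and the 1 Apr 1960 row (Procedureid missing) second, which is the order reported in #1.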



      • #4
        Originally posted by Rich Goldstein:
        You have set up a hierarchy of sort keys: first newid, then Procedureid, then indexno, and only then admindate. Observations are therefore sorted by admindate only within tied values of newid, Procedureid and indexno. Within the two observations for newid A1, Procedureid and indexno are each missing for one observation, so there are no ties, and the observation with the missing values is sorted after the one without, whatever their admission dates. Remember that, in Stata, a missing value of a numeric variable, such as Procedureid or indexno, always sorts after any non-missing value.



        Thank you very much. It was such a small issue, and this solved my problem.

        With regard to "remember that, in Stata, a missing value of a numeric variable, such as Procedureid or indexno, always sorts after any non-missing value":

        What if admindate is . (missing)? As it is a date, will the same rule apply?
        Many thanks



        • #5
          A date or time variable in Stata is a numeric variable, so a missing date sorts last as well. In any case, you could experiment to test your hypothesis.
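
          One possible form of such an experiment, as a sketch only: the empty date string and the extra A1 row are made up for illustration and are not in the data from #1. The last command, which leaves episodeno missing for undated admissions, is an optional choice rather than anything requested in the thread.

          ```stata
          * Made-up test data: one A1 admission has no date recorded
          clear
          input str3 newid str10 AdmissionDate
          "A1" "5 Apr 1960"
          "A1" ""
          "A1" "1 Apr 1960"
          end

          * date() returns missing (.) for the empty string
          gen admindate = date(AdmissionDate, "DMY")
          format admindate %td

          * The missing date sorts after both real dates
          sort newid admindate
          list

          * Optional: number only the admissions that have a date
          bysort newid (admindate): gen episodeno = _n if !missing(admindate)
          ```

          Because the missing date is placed last within the patient, the dated admissions are still numbered 1, 2, ... in date order.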
