  • Counting observations within groups

    I have multiple types of patient-reported data collected across a variable number of patient encounters. The dates are included in this data but there is no variable linking the data collected on the same day. I would like to count and label the number of visits that patients attended, using a combination of patient study ID and date. The "visit" column is the variable I am seeking to create.
    Patient ID   Date        Questionnaire   Visit
    1001         1/1/2021    A               1
    1001         1/1/2021    B               1
    1001         1/4/2021    B               2
    1002         2/3/2021    A               1
    1002         2/3/2021    B               1
    1003         3/5/2021    A               1
    1003         3/7/2021    A               2
    I have tried this using egen's group() function, but that just numbers the distinct patient ID + date combinations across the entire dataset rather than within each patient. I get similar results using _n or _N.

  • #2
    I think what you want is this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int patientid str9 date str2 questionnaire
    1001 "1/1/2021 " "A "
    1001 "1/1/2021 " "B "
    1001 "1/4/2021 " "B "
    1002 "2/3/2021 " "A "
    1002 "2/3/2021 " "B "
    1003 "3/5/2021 " "A "
    1003 "3/7/2021 " "A "
    end
    
    //  CREATE A REAL STATA NUMERIC DATE VARIABLE
    gen _date = daily(date, "MDY"), after(date)
    assert missing(_date) == missing(date)
    format _date %td
    drop date
    rename _date date
    
    //  CREATE DESIRED COUNTER VARIABLE
    by patientid (date), sort: gen wanted = sum(date != date[_n-1])
    Note: I cannot tell from the dates shown in the example data whether they are month/day/year or day/month/year. I have assumed the former. If that is wrong, change "MDY" to "DMY" in the -gen _date = ...- command.
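    The counter line in the code above packs two steps into one: flag each row whose date differs from the previous row within the same patient, then take a running sum of those flags. A minimal sketch of the same idea spelled out in two steps follows. The names ndate, new_day and visit are illustrative assumptions, not from the original post, and the dates are again assumed to be month/day/year.

```stata
* Sketch only: ndate, new_day and visit are hypothetical names for illustration.
clear
input int patientid str8 date str1 questionnaire
1001 "1/1/2021" "A"
1001 "1/1/2021" "B"
1001 "1/4/2021" "B"
1002 "2/3/2021" "A"
1002 "2/3/2021" "B"
1003 "3/5/2021" "A"
1003 "3/7/2021" "A"
end

* Convert the string to a numeric daily date (assumes month/day/year order)
gen ndate = daily(date, "MDY")
format ndate %td

* Step 1: within each patient, sorted by date, flag rows whose date differs
* from the previous row. On a patient's first row ndate[_n-1] is missing,
* so the comparison is true and the flag is 1.
bysort patientid (ndate): gen byte new_day = ndate != ndate[_n-1]

* Step 2: the running sum of the flags within patient is the visit number
by patientid: gen visit = sum(new_day)

list patientid ndate questionnaire new_day visit, sepby(patientid)
```

    On the example data, patient 1001 should get visit values 1, 1, 2, matching the Visit column in the original question. Note that a missing date on a patient's first row would not be flagged, so it is worth checking for missing dates first.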

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
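    For readers who have not used it, a minimal sketch of a -dataex- call is shown here on Stata's shipped auto dataset so that it runs as is. The variable list and the in 1/5 range are arbitrary choices for illustration; with the data from this thread, the call would presumably be something like -dataex patientid date questionnaire-.

```stata
* Sketch only: uses the auto dataset that ships with Stata as a stand-in.
sysuse auto, clear

* Print a copy-and-paste-ready example of three variables, first five rows
dataex make price mpg in 1/5
```

    The output in the Results window can be copied straight into a forum post.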

    Last edited by Clyde Schechter; 17 Aug 2022, 12:51.


    • #3
      This is on the money, thank you! Will be sure to use -dataex- in future posts.
