
  • Trouble with translating SAS code to STATA

    Hello,

    I was given SAS code, but all my work is in Stata (version 17). I have never used SAS before and am new to Stata, so I am struggling to find the Stata commands equivalent to this SAS code:

    by xid year month;
    if first.xid then count=0;
    count+1;
    run;

    What I am trying to do is assign a "count" value for each case/month that indicates that case's/month's order within all the cases/months for that participant.

    For example:
    xid year month count
    1 2020 9 1
    1 2020 10 2
    1 2020 11 3
    1 2020 12 4
    1 2021 1 5
    1 2021 2 6
    2 2020 8 1
    2 2020 9 2
    2 2020 10 3

    I have about 3,990 xid values, 90k observations, and on average 31 months of data per xid. Everyone started at different times, so I will use this count variable to create a new variable: count 1-12 = Year 1, count 13-24 = Year 2, and so on.

    Thanks in advance!

  • #2
    I haven't used SAS since the late 1980s, so my interpretation of the SAS code is guesswork. Going only on your description of what you are trying to do, it sounds like you want this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte xid int year byte(month count)
    1 2020  9 1
    1 2020 10 2
    1 2020 11 3
    1 2020 12 4
    1 2021  1 5
    1 2021  2 6
    2 2020  8 1
    2 2020  9 2
    2 2020 10 3
    end
    
    gen int mdate = ym(year, month)
    format mdate %tm
    
    by xid (mdate), sort: gen counter = _n
    Notes:
    1. Strictly speaking it isn't necessary to create the mdate variable. But it will probably prove helpful, possibly indispensable, for what you want to do subsequently. If you don't create the mdate variable, you can change -by xid (mdate)- to -by xid (year month)- and get the same results.
    2. It's not a good idea to name a variable count, because -count- is a Stata command and also there is a -count()- function in Stata. So best not to further overload this token and risk confusion.
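
    As a possible next step, the Year 1 / Year 2 grouping described in #1 could be derived from the counter with -ceil()-. This is only a sketch on made-up data: the variable names studyyear and studyyear_cal are hypothetical, and the second variant assumes you would rather count elapsed calendar months from each participant's first month, which only differs from the first when a participant has gaps in their months.

    ```stata
    * Toy data (made up for illustration): one participant, 30 consecutive months
    clear
    set obs 30
    gen byte xid = 1
    gen int mdate = ym(2020, 9) + _n - 1
    format mdate %tm

    * Counter as in the code above
    by xid (mdate), sort: gen counter = _n

    * Observations 1-12 -> year 1, 13-24 -> year 2, and so on
    gen byte studyyear = ceil(counter / 12)

    * Alternative (assumption): base the year on months elapsed since the
    * participant's first month, so gaps in the data do not shift later years
    by xid (mdate): gen byte studyyear_cal = floor((mdate - mdate[1]) / 12) + 1

    * The two agree here because the toy data have no gaps
    assert studyyear == studyyear_cal
    tabulate studyyear
    ```

    Which variant is appropriate depends on whether "Year 1" should mean the first 12 observations or the first 12 calendar months for each xid.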

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
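
    For illustration, a typical call might look like the following. It assumes your dataset is already in memory and contains the variables xid, year and month; the range 1/9 is just an arbitrary choice of the first nine observations.

    ```stata
    * Assumes your own data are loaded and contain xid, year and month.
    * Prints a copy-and-paste-ready example of the first 9 observations.
    dataex xid year month in 1/9
    ```

    The output can then be pasted directly into a forum post between code delimiters.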



    • #3
      In words, the SAS code processes the data sorted by xid, then year, then month (i.e., chronological order within xid) and builds a running counter that restarts at 1 for each new xid. The result numbers the observations sequentially within each xid.

      In Stata, this can be done in a single step. See -help bysort-, -help gen- and -help system variables- for explanations of each component of the command.

      Code:
      clear *
      cls
      input int(xid year month count)
      1 2020 9 1
      1 2020 10 2
      1 2020 11 3
      1 2020 12 4
      1 2021 1 5
      1 2021 2 6
      2 2020 8 1
      2 2020 9 2
      2 2020 10 3
      end
      
      bysort xid (year month) : gen want = _n
      assert count==want
      Result

      Code:
           +-----------------------------------+
           | xid   year   month   count   want |
           |-----------------------------------|
        1. |   1   2020       9       1      1 |
        2. |   1   2020      10       2      2 |
        3. |   1   2020      11       3      3 |
        4. |   1   2020      12       4      4 |
        5. |   1   2021       1       5      5 |
           |-----------------------------------|
        6. |   1   2021       2       6      6 |
        7. |   2   2020       8       1      1 |
        8. |   2   2020       9       2      2 |
        9. |   2   2020      10       3      3 |
           +-----------------------------------+
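
      For readers who want to see how the SAS logic maps onto Stata step by step, here is a more literal sketch of "reset at the first observation of each xid, then add 1". It is not needed in practice, since the one-liner above does the same job; the variable name counter is just an illustrative choice.

      ```stata
      clear
      input int(xid year month)
      1 2020 9
      1 2020 10
      1 2020 11
      1 2020 12
      1 2021 1
      1 2021 2
      2 2020 8
      2 2020 9
      2 2020 10
      end

      * SAS BY-group processing expects sorted data; sort explicitly here
      sort xid year month

      * First observation in each xid gets 1; each later one gets the
      * previous value plus 1 (mirrors: if first.xid then count=0; count+1;)
      gen long counter = .
      by xid: replace counter = cond(_n == 1, 1, counter[_n-1] + 1)

      * Same result as the _n one-liner
      by xid: assert counter == _n
      ```

      This works because -replace- proceeds observation by observation, so counter[_n-1] already holds the updated value for the previous row within the same xid.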
      Last edited by Leonardo Guizzetti; 30 Jun 2023, 11:36. Reason: edit: crossed with #2



      • #4
        Thank you both for the advice! This was helpful!
