
  • Create a group from a binary variable column

    Hi Statlists,

    I have a (very simple) question about labeling a group variable using a binary variable column. I cannot share the data here because the dataset is confidential, but it looks like this:
    Person Year Month Status
    A 2019 1 0
    A 2019 2 0
    A 2019 3 0
    B 2019 1 1
    B 2019 2 1
    B 2019 3 1
    C 2019 1 0
    C 2019 2 0
    C 2019 2 1
    C 2019 3 0
    C 2019 3 1
    C 2019 4 1
    I want to categorize each person by year and month. As you can see, there is no problem for individuals A and B, as they are categorized perfectly by the Status variable. The problem occurs for individual C in months 2 and 3, where the person has mixed status. Eventually, I want to collapse the data to one row per person, year, and month, with mixed months coded as 2, like this:
    Person Year Month Status
    A 2019 1 0
    A 2019 2 0
    A 2019 3 0
    B 2019 1 1
    B 2019 2 1
    B 2019 3 1
    C 2019 1 0
    C 2019 2 2
    C 2019 3 2
    C 2019 4 1
    Question: How do I create a table like this?

    I have tried `bysort` and `egen`, but I could not find a way to use these two commands to detect differences between rows. What other commands should I try to create a table like this?

    Thank you for reading this question. I would appreciate any advice/suggestions the community may have!

    Best,
    Kob
    Last edited by Papungkorn Kitcharoenkarnkul; 17 Sep 2023, 14:05.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 person int year byte(month status)
    "A" 2019 1 0
    "A" 2019 2 0
    "A" 2019 3 0
    "B" 2019 1 1
    "B" 2019 2 1
    "B" 2019 3 1
    "C" 2019 1 0
    "C" 2019 2 0
    "C" 2019 2 1
    "C" 2019 3 0
    "C" 2019 3 1
    "C" 2019 4 1
    end
    
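    * Check that each person-year-month has at most two rows and that status is 0 or 1
    bys person year month (status): assert inrange(_N, 1, 2) & inlist(status, 0, 1)
    * Within each group (sorted by status), recode a 0 that is followed by a 1 as 2 (mixed)
    bys person year month (status): replace status=2 if !status & status[_n+1]==1
    * Keep the first row per group, which now carries the group's final status
    by person year month: keep if _n==1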
    Result:

    Code:
    . l, sepby(person year month)
    
         +--------------------------------+
         | person   year   month   status |
         |--------------------------------|
      1. |      A   2019       1        0 |
         |--------------------------------|
      2. |      A   2019       2        0 |
         |--------------------------------|
      3. |      A   2019       3        0 |
         |--------------------------------|
      4. |      B   2019       1        1 |
         |--------------------------------|
      5. |      B   2019       2        1 |
         |--------------------------------|
      6. |      B   2019       3        1 |
         |--------------------------------|
      7. |      C   2019       1        0 |
         |--------------------------------|
      8. |      C   2019       2        2 |
         |--------------------------------|
      9. |      C   2019       3        2 |
         |--------------------------------|
     10. |      C   2019       4        1 |
         +--------------------------------+
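
    Since the question mentions -bysort- and -egen-, here is a minimal sketch of how the same result could probably be reached with those two commands, run on the original example data in place of the three lines above. It assumes, as above, that status only takes the values 0 and 1 with no missing values. The variable names lo and hi are illustrative and not part of the original data.

    ```stata
    * Lowest and highest status within each person-year-month (lo, hi are hypothetical names)
    bysort person year month: egen byte lo = min(status)
    bysort person year month: egen byte hi = max(status)

    * A group is mixed when it contains both a 0 and a 1
    replace status = 2 if lo != hi

    * Keep one row per person-year-month and drop the helper variables
    bysort person year month: keep if _n == 1
    drop lo hi
    ```

    The idea is that min() and max() differ only in groups containing both values, so no row-to-row comparison is needed. Unlike the subscripting approach, this sketch does not depend on the sort order within a group.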
    Last edited by Andrew Musau; 17 Sep 2023, 15:31.



    • #3
      Hi Andrew Musau,

      Sorry for taking so long to reply; I greatly appreciate your help. It took me some time to understand your code. After searching and trying your example myself, I finally got it. It makes much more sense to do it this way.

      Again, I appreciate your help!

      Best,
      Kob
