
  • To identify level 2 variable by level 3 variable in hierarchical dataset

    Hi everyone,

    I have a hierarchical dataset with more than 70,000 episodes (the level 3 units) that can be grouped by ID (the level 1 variable). Within each ID there are cases, each consisting of a different number of related episodes. The variable multiple numbers the episodes within a particular case and is what I mean by the level 3 variable (my dataset looks like the example below). I need to create a level 2 variable (new variable) that gives each case a unique number. Is there a way to do this?
    ID   multiple  new variable
    id1  .         .
    id1  1         1
    id1  2         1
    id2  .         .
    id2  .         .
    id2  1         2
    id2  2         2
    id2  1         3
    id2  2         3
    id2  3         3
    id3  .         .
    id3  .         .
    id3  .         .
    id3  .         .
    id4  .         .
    id4  1         4
    id4  2         4
    id4  3         4
    id4  4         4
    id4  1         5
    id4  2         5

    Ideally, I need to number these unique cases within each ID group, also counting the rows with missing values, which are simply cases with 1 episode. In other words, I need to create new variable 2:
    ID   multiple  new variable  new variable 2
    id1  .         .             1
    id1  1         1             2
    id1  2         1             2
    id2  .         .             1
    id2  .         .             2
    id2  1         2             3
    id2  2         2             3
    id2  1         3             4
    id2  2         3             4
    id2  3         3             4
    id3  .         .             1
    id3  .         .             2
    id3  .         .             3
    id3  .         .             4
    id4  .         .             1
    id4  1         4             2
    id4  2         4             2
    id4  3         4             2
    id4  4         4             2
    id4  1         5             3
    id4  2         5             3

    Thank you!

    Kind regards,
    Svetlana

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 id byte multiple
    "id1" .
    "id1" 1
    "id1" 2
    "id2" .
    "id2" .
    "id2" 1
    "id2" 2
    "id2" 1
    "id2" 2
    "id2" 3
    "id3" .
    "id3" .
    "id3" .
    "id3" .
    "id4" .
    "id4" 1
    "id4" 2
    "id4" 3
    "id4" 4
    "id4" 1
    "id4" 2
    end
    
    gen `c(obs_t)' obs_no = _n    // MARK ORIGINAL SORT ORDER
    
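    // RUNNING COUNT OF CASE STARTS (multiple == 1); LEFT MISSING FOR SINGLE-EPISODE ROWS
    gen new_variable = sum(multiple == 1) if !missing(multiple)
    // WITHIN ID, COUNT EACH SINGLE-EPISODE ROW AND EACH CHANGE OF CASE AS A NEW UNIT
    by id (obs_no), sort: gen new_variable_2 = ///
        sum(missing(new_variable) | new_variable != new_variable[_n-1])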
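    Both commands rely on sum() of a logical expression, which accumulates 1 for every row where the expression is true. As a rough sketch of what happens, the same logic can be unrolled into separate steps. The names case_start, case_no, new_unit and within_id are illustrative only and not part of the code above, and the sketch assumes that the code above has already been run and that every multi-episode case begins with multiple == 1.

    ```stata
    * Sketch only: case_start, case_no, new_unit and within_id are illustrative names.
    * Assumes obs_no, new_variable and new_variable_2 exist from the code above.

    * Step 1: flag the first episode of each multi-episode case, then accumulate the flags
    gen byte case_start = (multiple == 1)
    gen long case_no = sum(case_start) if !missing(multiple)

    * Step 2: within id, a new unit starts at every single-episode row (missing case_no)
    * and at every row whose case_no differs from the previous row's
    by id (obs_no), sort: gen byte new_unit = missing(case_no) | case_no != case_no[_n-1]
    by id (obs_no): gen long within_id = sum(new_unit)

    * The unrolled versions should agree with the compact ones
    assert case_no == new_variable
    assert within_id == new_variable_2
    ```

    If either assert fails on the full dataset, the likely cause is a case that does not start at multiple == 1 or data that are no longer in their original order.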
    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
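    As a minimal illustration of that advice, assuming the variables are named id and multiple as in the example above, a call along these lines would produce a block like the one posted here:

    ```stata
    * Sketch: export the first 21 observations of the two variables as a dataex block
    dataex id multiple in 1/21
    ```

    The output can then be copied from the Results window and pasted into a post between code delimiters.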



    • #3
      Thank you so much! It's really helpful!

      Kind regards,
      Svetlana



      • #4
        Thank you so much! It's really helpful! Thank you as well for introducing me to the -dataex- command!

        Kind regards,
        Svetlana
