
  • How do I create a shortened dataset with a new identifier while carrying over values?

    I'm working with a dataset containing data on interventions performed on people. The data is organized by intervention, and the same person often has multiple interventions. Each row contains an identifier for the intervention as well as a unique identifier for the person. I have created a series of dummies based on the values of the interventions variable. What I need is to create a new dataset organized by person, containing the dummies that I have created but omitting the interventions.

    Here's a constructed example. The dataset in question has 5k+ interventions and 2k+ persons and more dummies.
    I have:

    Intervention  Person  Dummy 1  Dummy 2  Dummy 3
    1             1       1        0        0
    2             2       0        1        1
    3             2       1        1        1
    4             3       1        0        0
    5             3       0        1        0
    6             3       0        1        0

    It should become:

    Person  Dummy 1  Dummy 2  Dummy 3
    1       1        0        0
    2       1        1        1
    3       1        1        0

    I can fairly easily create the new dataset with the person identifier, for example by tabulating the variable and then copying and pasting, though there is probably a smarter way.


    I imagine that what I need is a command that, whenever a dummy is set to 1, finds the person identifier in the other dataset and sets the corresponding dummy value there to 1.


    Or, starting from the dataset arranged by person identifier, it needs to look up each person identifier in the intervention-based dataset and set the dummy value to 1 if it is 1 in any of the interventions associated with that identifier.


    I apologize if this has been answered somewhere, but I cannot figure out the search terms to find it.

  • #2
    Code:
    clear
    input person dummy1 dummy2 dummy3
    1 1 0 0
    2 0 1 1
    2 1 1 1
    3 1 0 0
    3 0 1 0
    3 0 1 0
    end
    
    collapse (max) dummy*, by(person)
    list
    Results:
    Code:
         +-----------------------------------+
         | person   dummy1   dummy2   dummy3 |
         |-----------------------------------|
      1. |      1        1        0        0 |
      2. |      2        1        1        1 |
      3. |      3        1        1        0 |
         +-----------------------------------+
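    Because each dummy is 0 or 1, the maximum within a person equals 1 exactly when at least one of that person's interventions has the dummy set, which is the rule described in #1. For comparison, here is a minimal sketch of the same idea using egen, in case the person-level flags are first wanted alongside the intervention rows. The variable names any_dummy1 to any_dummy3 are hypothetical, and the data is the constructed example from #1 with the intervention identifier included.

    ```stata
    clear
    input intervention person dummy1 dummy2 dummy3
    1 1 1 0 0
    2 2 0 1 1
    3 2 1 1 1
    4 3 1 0 0
    5 3 0 1 0
    6 3 0 1 0
    end

    * person-level maximum of each dummy, stored next to the original rows
    * (any_dummy1 etc. are hypothetical names chosen for this sketch)
    foreach v of varlist dummy1 dummy2 dummy3 {
        egen any_`v' = max(`v'), by(person)
    }

    * keep the first intervention row per person, then drop intervention-level variables
    bysort person (intervention): keep if _n == 1
    drop intervention dummy1 dummy2 dummy3
    list
    ```

    This should leave one row per person with the same 0/1 pattern as the collapse result above. collapse is shorter when only the person-level dataset is needed.
    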
    * Also, in future please do users here a favor by using dataex to show a data example that can be read directly into Stata. Read FAQ part 12 for details.
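    As a rough illustration of the dataex suggestion, assuming the variables are named as in the example above, something like the following could be run on the original data and its output pasted into a post. The observation range is only an assumption to keep the example small.

    ```stata
    * dataex comes with up-to-date recent Stata versions;
    * on older versions it can be installed with: ssc install dataex
    dataex person dummy1 dummy2 dummy3 in 1/6
    ```

    The command writes out an input block that others can copy and run directly to recreate the listed observations.
    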



    • #3
      Thank you so much!

      That literally saved me days of work. Your solution is also so much more elegant than what I had in mind.

      I'll look into creating a dataex example next time.
