Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining rows of data based off an id, without losing information.

    I have a list of people with employee ids, names and roles:
    ID Name Role
    1 Jon Chef
    1 Jon Waiter
    2 Bill Waiter
    I would like to "collapse" these rows to the following:
    ID Name Chef Waiter
    1 Jon 1 1
    2 Bill 0 1
    I have an idea of how I might do this in Excel with Vlookup but my data set is too large for that. Any ideas how I could do it in Stata? Also, what would a process like this be called?
    Last edited by Daniel Ram; 03 Oct 2019, 10:45.

  • #2
    The following example may point you in a useful direction. The key is that rather than collapsing, Stata calls this reshaping your data; see the output of help reshape for a full description of both reshape wide and reshape long, and to get familiar with the help file, because if you're like me, each time you try reshape you'll need to review the syntax.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte id str4 name str6 role
    1 "Jon"  "Chef"  
    1 "Jon"  "Waiter"
    2 "Bill" "Waiter"
    end
    generate value = 1
    reshape wide value, i(id) j(role) string
    foreach v of varlist value* {
        replace `v' = 0 if `v'==.
        }
    rename (value*) (*)
    order id name
    list, clean noobs
    Code:
    . generate value = 1
    
    . reshape wide value, i(id) j(role) string
    (note: j = Chef Waiter)
    
    Data                               long   ->   wide
    -----------------------------------------------------------------------------
    Number of obs.                        3   ->       2
    Number of variables                   4   ->       4
    j variable (2 values)              role   ->   (dropped)
    xij variables:
                                      value   ->   valueChef valueWaiter
    -----------------------------------------------------------------------------
    
    . foreach v of varlist value* {
      2.     replace `v' = 0 if `v'==.
      3.         }
    (1 real change made)
    (0 real changes made)
    
    . rename (value*) (*)
    
    . order id name
    
    . list, clean noobs
    
        id   name   Chef   Waiter  
         1    Jon      1        1  
         2   Bill      0        1
    Last edited by William Lisowski; 03 Oct 2019, 11:19.

    Comment


    • #3
      In Stata rows are called observations.

      Comment


      • #4
        The following example may point you in a useful direction. The key is that rather than collapsing, Stata calls this reshaping your data; see the output of help reshape for a full description of both reshape wide and reshape long, and to get familiar with the help file, because if you're like me, each time you try reshape you'll need to review the syntax.
        Thank you William, this was very helpful!

        Comment

        Working...
        X