Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging two datasets with variables in one corresponding observations in the other

    Hi all. I ran into a challenge today and I hope to get some help.
    I want to merge 2 data sets. Dataset1 contains the member roster with 3 variables, the batch_id, member_num in the batch and occupation. Dataset2 member license statuses. The challenge here is that in dataset 2 the member_num is represented as a variable in a way that for member_num_x in dataset2 corresponds "x" observation under variable member_num in dataset1. I need to merge these 2 datasets so that in the end I have one dataset that has batch_id, member_num, occupation, and license status for each member. Thanks in advance.
    Attached Files

  • #2
    Welcome to Statalist!

    Perhaps this example using your data will start you in a useful direction.
    Code:
    clear all
    cls
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3(batch_id member_num_1 member_num_2 member_num_3 member_num_4 member_num_5)
    "A01" "NA"  "NA"  "no"  "yes" "yes"
    "A02" "no"  "no"  "yes" ""    ""  
    "A03" "yes" "No"  "yes" ""    ""  
    "A04" "yes" "yes" "NA"  "NA"  "NA"
    "A05" "NA"  "yes" "yes" ""    ""  
    "A06" "NA"  "NA"  "no"  "NA"  "NA"
    "A07" "NA"  "yes" "NA"  "yes" "NA"
    "A08" "yes" ""    ""    ""    ""  
    "A10" "NA"  "NA"  "NA"  "no"  "NA"
    "A11" "yes" "yes" "NA"  "NA"  "no"
    "A12" "no"  "NA"  "no"  ""    ""  
    "A13" "NA"  "yes" "NA"  "NA"  "NA"
    "A14" "NA"  "yes" "yes" ""    ""  
    "A15" "NA"  "yes" "NA"  "NA"  "NA"
    "A16" "no"  "no"  "NA"  ""    ""  
    "A17" "yes" "NA"  ""    ""    ""  
    "A18" "yes" "yes" "yes" "yes" "yes"
    "A19" "yes" "yes" "yes" "yes" "yes"
    "A20" "yes" "yes" "NA"  "NA"  ""  
    end
    
    reshape long member_num_, i(batch_id) j(j)
    rename member_num_ status
    rename j member_num_
    save dataset2, replace
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 batch_id byte member_num_ str9 Occupation
    "A01" 1 "Driver"  
    "A01" 2 "Driver"  
    "A01" 3 "Navigator"
    "A01" 4 "Navigator"
    "A01" 5 "Driver"  
    "A02" 1 "Driver"  
    "A02" 2 "Driver"  
    "A02" 3 "Navigator"
    "A03" 1 "Cleaner"  
    "A03" 2 "Cleaner"  
    "A03" 3 "Navigator"
    "A04" 1 "Driver"  
    "A04" 2 "Navigator"
    "A04" 3 "Driver"  
    "A04" 4 "Driver"  
    "A04" 5 "Cleaner"  
    "A05" 1 "Cleaner"  
    "A05" 2 "Navigator"
    "A05" 3 "Cleaner"  
    "A06" 1 "Cleaner"  
    "A06" 2 "Driver"  
    "A06" 3 "Driver"  
    "A06" 4 "Navigator"
    "A06" 5 "Driver"  
    "A07" 1 "Driver"  
    "A07" 2 "Driver"  
    "A07" 3 "Navigator"
    "A07" 4 "Navigator"
    "A07" 5 "Navigator"
    "A08" 1 "Driver"  
    "A08" 1 "Navigator"
    "A08" 2 "Navigator"
    "A08" 3 "Driver"  
    "A08" 4 "Navigator"
    "A08" 5 "Navigator"
    "A10" 1 "Driver"  
    "A10" 2 "Driver"  
    "A10" 3 "Driver"  
    "A10" 4 "Driver"  
    "A10" 5 "Cleaner"  
    "A11" 1 "Cleaner"  
    "A11" 2 "Navigator"
    "A11" 3 "Cleaner"  
    "A11" 4 "Cleaner"  
    "A11" 5 "Cleaner"  
    "A12" 1 "Navigator"
    "A12" 2 "Cleaner"  
    "A12" 3 "Cleaner"  
    "A13" 1 "Navigator"
    "A13" 2 "Navigator"
    "A13" 3 "Driver"  
    "A13" 4 "Driver"  
    "A13" 5 "Driver"  
    "A14" 1 "Driver"  
    "A14" 2 "Cleaner"  
    "A14" 3 "Driver"  
    "A15" 1 "Cleaner"  
    "A15" 2 "Cleaner"  
    "A15" 3 "Navigator"
    "A15" 4 "Navigator"
    "A15" 5 "Navigator"
    "A16" 1 "Navigator"
    "A16" 2 "Navigator"
    "A16" 3 "Driver"  
    "A17" 1 "Navigator"
    "A17" 2 "Navigator"
    "A18" 1 "Driver"  
    "A18" 2 "Driver"  
    "A18" 3 "Driver"  
    "A18" 4 "Navigator"
    "A18" 5 "Driver"  
    "A19" 1 "Navigator"
    "A19" 2 "Driver"  
    "A19" 3 "Navigator"
    "A19" 4 "Navigator"
    "A19" 5 "Driver"  
    "A20" 1 "Driver"  
    "A20" 2 "Driver"  
    "A20" 3 "Driver"  
    "A20" 4 "Driver"  
    end
    
    merge m:1 batch_id member_num_ using dataset2
    
    list if batch_id=="A08", clean noobs abbreviate(12)
    Code:
    . list if batch_id=="A08", clean noobs abbreviate(12)
    
        batch_id   member_num_   Occupation   status        _merge  
             A08             1       Driver      yes   Matched (3)  
             A08             1    Navigator      yes   Matched (3)  
             A08             2    Navigator            Matched (3)  
             A08             3       Driver            Matched (3)  
             A08             4    Navigator            Matched (3)  
             A08             5    Navigator            Matched (3)
    That was a fairly straightforward question which likely would have been answered quickly had you presented your example data using the dataex command, so that those interested in helping did not have to do more than read your post to see if they could help. There are a lot of questions to answer, and members tend to make quick judgements on where to focus their efforts. Asking the reader to download two datasets and launch Stata and read in the datasets and understand what how they are organized is perhaps a bigger ask than many readers will respond to.

    So please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It is particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    Comment


    • #3
      Hi. Thanks a bunch. This helped very much. Regards

      Comment

      Working...
      X