  • Vlookup style function of variable from other dataset

    Dear all,

    I have run into a problem that sounds easy, but I am stuck. I have two datasets, each with a variable holding a unique ID number (e.g. link in one, pair in the other). I want Stata to check whether each ID value of one variable (e.g. link) is present in the other variable (pair) and to generate a dummy variable as an indicator.

    I have searched on the Internet and tried several ways but with no success.

    One more thing that may be important: the datasets are large (191,000 and 26,000,000 observations).

    Any ideas would be highly appreciated.

    Thank you in advance.

  • #2
    One Stata-ish approach uses the -merge- command (see -help merge-), which requires that the ID variable have the same name in both datasets.

    Code:
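    use YourBigFile
    rename pair link  // same name for key variable in both files
    keep link
    duplicates drop link, force  // one row per ID, so that a 1:1 merge is valid
    merge 1:1 link using YourSmallFile, generate(foundit)  // assumes link is unique in YourSmallFile
    tab foundit // see what you have
    drop if foundit == 1 // only in YourBigFile, not of interest
    recode foundit (3=1) (2=0)  // 1 = ID found in YourBigFile, 0 = not found
    label values foundit .  // detach merge's value label, which no longer matches the recoded values
    save NewYourSmallFile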
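
    The same idea can also be run starting from the small file, so that its observations are never dropped and then re-created. What follows is only a sketch on made-up toy data: the tempfile names (big, small, keys), the ID values, and the indicator name found are all assumptions for illustration, and it assumes link is unique in the small file. Run it in one go from a do-file, since the tempfile names are local macros.

    ```stata
    * Toy data standing in for the big file (IDs are made up; pair may repeat)
    clear
    input long pair
    101
    101
    103
    107
    end
    tempfile big
    save `big'

    * Toy data standing in for the small file (link assumed unique)
    clear
    input long link
    101
    102
    103
    end
    tempfile small
    save `small'

    * Load only the key variable from the big file and reduce it to one row per ID
    use pair using `big', clear
    rename pair link
    duplicates drop
    tempfile keys
    save `keys'

    * Start from the small file; keep its unmatched and matched rows only
    use `small', clear
    merge 1:1 link using `keys', keep(master match) generate(foundit)

    * 1 if the ID also occurs in the big file, 0 otherwise
    generate byte found = (foundit == 3)
    drop foundit
    list
    ```

    In the toy data, 101 and 103 get found = 1 and 102 gets found = 0. Loading only pair with -use pair using ...- keeps memory use down, which may matter with 26,000,000 observations.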


    • #3
      Thank you Mike! This works!

      Apologies for the delay in replying.
