  • Identify common variables across two datasets

    I have a simple question, but could not find a solution online. I have two datasets, which mostly contain different variables, but also have common variables (i.e. exactly the same variable name in both datasets). I am looking for an easy way to display - or better, save in a local - all variables that appear in both datasets.

    (I could do this manually, but since both datasets contain several hundred variables, that would be a very tedious and time-consuming task.)

    I'm looking forward to your comments!

  • #2
    Presumably you want to put the variable names in a local, not the variables themselves.

    Code:
    use dataset1 
    ds
    local in1 "`r(varlist)'" 
    use dataset2
    ds
    local in2 "`r(varlist)'"
    local both : list in1 & in2 
    di "`both'"
    See also findname (SJ) which has more features than ds, and with which you could be a little more concise.

    SJ-15-2 dm0048_3 . . . . . . . . . . . . . . . . Software update for findname
    (help findname if installed) . . . . . . . . . . . . . . . N. J. Cox
    Q2/15 SJ 15(2):605--606
    updated to be able to find strL variables

    SJ-12-1 dm0048_2 . . . . . . . . . . . . . . . . Software update for findname
    (help findname if installed) . . . . . . . . . . . . . . . N. J. Cox
    Q1/12 SJ 12(1):167
    correction for handling embedded double quote characters

    SJ-10-4 dm0048_1 . . . . . . . . . . . . . . . . Software update for findname
    (help findname if installed) . . . . . . . . . . . . . . . N. J. Cox
    Q4/10 SJ 10(4):691
    update for not option

    SJ-10-2 dm0048 . . . . . . . . . . . . . . Speaking Stata: Finding variables
    (help findname if installed) . . . . . . . . . . . . . . . N. J. Cox
    Q2/10 SJ 10(2):281--296
    produces a list of variable names showing which variables
    have specific properties, such as being of string type, or
    having value labels attached, or having a date format

    Code:
    use dataset1 
    findname, local(in1) 
    use dataset2
    findname, local(in2) 
    local both : list in1 & in2 
    di "`both'"
    Or use unab:

    Code:
    use dataset1 
    unab in1 : * 
    use dataset2
    unab in2 : * 
    local both : list in1 & in2 
    di "`both'"



    • #3
      Of course, I meant the variable names, sorry about that. Thank you for your quick and helpful reply!



      • #4
        Sorry for replying to an old post. I have a similar issue: why can't list be used when there are 3 datasets, as in:

        Code:
        use dataset1  
        findname, local(in1)  
        use dataset2
        findname, local(in2)  
        use dataset3  
        findname, local(in3)
        local both : list in1 & in2 & in3
        di "`both'"
        which will not run.
        Thank you.
        Last edited by Anqi Zhang; 12 May 2022, 19:47.



        • #5
          That macro list function can take only two arguments, and so the line
          Code:
          local both : list in1 & in2 & in3
          is a syntax error.

          Intersection is associative; just break it up into two lines.
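
          A minimal sketch of the two-step version, assuming the same placeholder dataset names as in #4. It uses the built-in unab rather than findname so that nothing needs to be installed, and it adds the clear option to use as a precaution. The macro names both12 and all3 are made up for illustration.

          ```stata
          * collect the variable names of each dataset in a local
          use dataset1, clear
          unab in1 : *
          use dataset2, clear
          unab in2 : *
          use dataset3, clear
          unab in3 : *

          * the list operator & takes exactly two macro names,
          * so intersect the first pair, then intersect the result with the third
          local both12 : list in1 & in2
          local all3 : list both12 & in3
          di "`all3'"
          ```

          The same pattern should extend to more datasets by intersecting one further list at a time.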

