Hello Everyone,
I have inherited several datasets which may or may not have overlapping entries. In general the variables are the same. One issue I am running into when trying to compare the datasets is that the number of entries is not the same. For example, I might have the file blue_A with 300 entries and blue_B with 90 entries. Using cfvars (SSC), I've checked that the variables are indeed the same (well, the 3 identifying variable names are exactly the same, but the rest have A_ or B_ appended to them, which is a whole different issue). Before I append the datasets, I want to check whether any of the id variable values appear in both files.
I guess what I am asking is: how do I determine whether any of the 90 things (rooms, in this case) in the file blue_B are also included in the file blue_A? I have about 19 of these "pairs" of files that I have to go through. If the id variable were numeric, I could make some plots or do some math to see whether any entries are included in both, but the rooms are identified by strings ("c1-b3-x6"), so that doesn't work. Ideally these would be separate datasets for case A and case B with zero overlap, but I know that is not the case (some rooms have been measured for both case A and case B).
I hope my question is clear. Nothing I've tried so far works, and I feel like I've wasted a ton of time on something that should be simple.
Thanks in advance!
Cara
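For anyone reading along, one possible way to check the overlap described above is to merge the two lists of ids and look at _merge. This is only a minimal sketch under assumptions: the id variable name roomid is hypothetical (the post does not give the real name), the files are assumed to be blue_A.dta and blue_B.dta in the working directory, and a single string variable is assumed to identify a room.

```stata
* Hypothetical name: roomid stands in for the actual string id variable.
* Assumes blue_A.dta and blue_B.dta are in the current working directory.

* Reduce blue_B to one row per room id and hold it in a temporary file
use roomid using blue_B, clear
duplicates drop
tempfile b_ids
save `b_ids'

* Do the same for blue_A, then match the two id lists
use roomid using blue_A, clear
duplicates drop
merge 1:1 roomid using `b_ids'

* _merge == 1: id only in blue_A
* _merge == 2: id only in blue_B
* _merge == 3: id appears in both files
tabulate _merge
list roomid if _merge == 3
```

Since merge matches on exact string values, ids that differ only in case or stray spaces would be reported as non-overlapping, so it may be worth cleaning the id strings first. For the 19 pairs, the same steps could be wrapped in a loop over the file names.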