Merging long format or longitudinal datasets

anjana rajendra

Join Date: Apr 2021

Posts: 36
#1


23 Apr 2021, 09:55

Dear All,

Trying to merge two longitudinal datasets repeatedly gives the following errors:
1) variable ptno does not uniquely identify observations in the using data
2) variable ptno does not uniquely identify observations in the master data

But I could merge these datasets without any hassle in SPSS. I need to do the same thing in Stata. Please see the attached examples of the datasets. Thank you.
Ali Atia

Join Date: May 2020

Posts: 737
#2

23 Apr 2021, 10:48

It seems like all of the ids in your first dataset actually do uniquely identify observations, but they are duplicated for some reason. If that's true, you can use -duplicates drop- to drop duplicates, and then do a 1:m merge using the second dataset.
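A minimal sketch of that sequence, run as a do-file on toy data. Only ptno comes from this thread; the other variable names (visit, sbp, age), the values, and the tempfile name are made up for illustration, and ptno is assumed to be numeric.

```stata
* Toy stand-in for the second (long) dataset: several rows per patient
clear
input ptno visit sbp
1 1 120
1 2 125
1 3 130
2 1 140
end
tempfile second
save `second'

* Toy stand-in for the first dataset: every row appears twice
clear
input ptno age
1 40
1 40
2 55
2 55
end

* Count, then drop, observations that are identical on all variables
duplicates report
duplicates drop

* isid stops with an error unless ptno now uniquely identifies observations
isid ptno

* One row per patient in memory, many rows per patient in the using file
merge 1:m ptno using `second'
tabulate _merge
```

If -isid- still complains after -duplicates drop-, the repeated rows are not exact copies, and dropping is no longer a safe fix.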

Also, please see the FAQ for advice about providing useful data examples.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

23 Apr 2021, 10:55

You have two identical observations of patient number 1 in your top Excel screenshot, and three observations of patient number 1 in your bottom Excel screenshot. What do you expect the results of merging to be? Two observations? Three observations? Six observations?

If you expect three observations, how are the two observations in the top screenshot supposed to be matched to the three observations in the bottom screenshot?
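To make the "six observations" case concrete, here is a small sketch on made-up data (the variables x, visit, and y and their values are hypothetical; only ptno is from the thread). -joinby- forms every pairwise combination within ptno, which is one possible answer to the question above, though not necessarily the one wanted.

```stata
* Two identical observations for patient 1, as in the top screenshot
clear
input ptno x
1 10
1 10
end
tempfile top
save `top'

* Three observations for patient 1, as in the bottom screenshot
clear
input ptno visit y
1 1 5
1 2 6
1 3 7
end

* All pairwise combinations within ptno: 3 x 2 rows
joinby ptno using `top'
list, sepby(ptno)
```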

In what you show, every observation in the top Excel screenshot appears twice. Is that true for all the data in that worksheet?

Finally, please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.

Many readers will skip to the next post when they see that no usable sample data has been presented.
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#4

23 Apr 2021, 11:01

(Crossed with both of the previous postings. Hope my comments are not too redundant.)

First, please take another look at the FAQ about how to post a data example using -dataex-, why not to post screenshots, and the need to show your actual code.

The fact that ptno does not uniquely identify observations in either data set suggests one of two things: either there is a logical problem in how you are thinking about and organizing your data, or you have blank observations or other odd duplications in the datasets that you don't know about.
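As an illustration of the blank-observation case, here is a sketch on made-up data in which two empty rows (missing ptno) are enough to make ptno non-unique. The variable age and all values are hypothetical, and ptno is assumed to be numeric.

```stata
* Toy data with two stray blank rows
clear
input ptno age
1 40
. .
2 55
. .
end

* How many observations have no patient number at all?
count if missing(ptno)

* Drop them, then confirm ptno is a unique identifier
drop if missing(ptno)
isid ptno
```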

Did you expect ptno to be a unique ID? If not, you should explain the logical relation between your two data sets. It may be in that case that -merge- is not what you want.
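For example, if both files are long with one row per patient per visit, the identifier is the pair of variables rather than ptno alone. A sketch under that assumption follows; visit, weight, and sbp are hypothetical names, since the thread does not show what the time variable is called.

```stata
* Toy long file B: one row per patient per visit
clear
input ptno visit weight
1 1 70
1 2 71
2 1 80
end
tempfile b
save `b'

* Toy long file A with the same structure
clear
input ptno visit sbp
1 1 120
1 2 125
2 1 140
end

* Check that the pair (ptno, visit) is unique, then match on both
isid ptno visit
merge 1:1 ptno visit using `b'
tabulate _merge
```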

If you did think that in one or the other data sets, ptno *should* be unique, you need to investigate observations that are non-unique and understand the problem. To do that, the -duplicates- command would be useful, e.g. -duplicates examples ptno-.
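A short sketch of that investigation on made-up data (the variable age, the new variable dup, and all values are hypothetical):

```stata
* Toy data in which patients 1 and 3 each appear twice
clear
input ptno age
1 40
1 40
2 55
3 61
3 61
end

* Summary of how many observations share a ptno
duplicates report ptno

* One example observation from each group of duplicates
duplicates examples ptno

* dup = number of other observations with the same ptno
duplicates tag ptno, generate(dup)
list if dup > 0, sepby(ptno)
```

Listing the tagged observations side by side shows whether the repeats are exact copies or differ on some other variable.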

The fact that SPSS did the merge might reflect many things, most likely some difference between the datasets as Stata sees them and as SPSS saw them. This could occur if you imported the data sets into either program from a spreadsheet or similar file.

Last edited by Mike Lacy; 23 Apr 2021, 11:04.